/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102068958 & >>102058880

►News
>(08/22) Jamba 1.5: 52B & 398B MoE: https://hf.co/collections/ai21labs/jamba-15-66c44befa474a917fcf55251
>(08/20) Microsoft's Phi-3.5 released: mini+MoE+vision: https://hf.co/microsoft/Phi-3.5-MoE-instruct
>(08/16) MiniCPM-V-2.6 support merged: https://github.com/ggerganov/llama.cpp/pull/8967
>(08/15) Hermes 3 released, full finetunes of Llama 3.1 base models: https://hf.co/collections/NousResearch/hermes-3-66bd6c01399b14b08fe335ea
>(08/12) Falcon Mamba 7B model from TII UAE: https://hf.co/tiiuae/falcon-mamba-7b

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
►Recent Highlights from the Previous Thread: >>102068958

--Testing model intelligence with SVG output and image drawing prompts: >>102079522 >>102081179 >>102081305 >>102081339 >>102081435 >>102081576 >>102081615 >>102081625 >>102082050 >>102082121 >>102082065 >>102082345 >>102080359 >>102080804 >>102082930
--Recommendations for learning neural network basics: >>102069696 >>102070090 >>102070097 >>102070256
--Proposal for "phrase_ban" feature to reduce repetitive phrases in llama.cpp: >>102073398 >>102073451
--LLMs struggle with world-modelling, humans not much better: >>102069670 >>102069967 >>102069995 >>102070026 >>102070110 >>102075602 >>102075746 >>102076112 >>102076892
--Frankenmerges and dynamic approaches to model generation: >>102079290 >>102080652 >>102080692 >>102080770 >>102080806 >>102080898 >>102080967 >>102081134 >>102081226 >>102081309 >>102081552
--Anon asks about multimodal models for text and image input with llama.cpp: >>102081155 >>102081189 >>102081329 >>102081354 >>102084578 >>102081312 >>102083801
--Anon asks about Hermes models, LoRAs, and using llama 3.1 with a long context window: >>102080140 >>102080249 >>102080250 >>102080338 >>102080336
--SillyTavern and model formatting for roleplaying: >>102078247 >>102078281 >>102078380 >>102078375 >>102078419 >>102078520 >>102078684 >>102078803
--Small models struggle with output quality and understanding instructions: >>102080019 >>102080097 >>102080127 >>102080143 >>102080193
--Google DeepMind employees protest contracts in open letter: >>102073189
--Gemma 2 FlashAttention support merged in llama.cpp: >>102070066 >>102070155 >>102070200
--Deslop method using synthetic prompts and LLMs: >>102085032 >>102085049 >>102085063
--Miku (free space): >>102068985 >>102069450 >>102070100 >>102070188 >>102081515 >>102082498 >>102082650 >>102083192 >>102083261

►Recent Highlight Posts from the Previous Thread: >>102068974
>>102086459
>>102086466
The validity of these Mikus is questionable.
>>102077774
it's from a hypnosis card. i can hypnotise people and give them commands. the commands are always shown at the bottom.
>>102084811
that was magnum-12b-v2-q5_k.gguf.
https://files.catbox.moe/dqz9qk.json
this is probably not even a good prompt, no idea where i got it from, but since it works i'm not gonna touch it.
just 0.7 temp and DRY. that's it.
Pic related is Theia-21B-v2b-GGUF, which I'm currently playing around with.
It's the good nemo base. Nvidia/Mistral cooked good. The instruction following makes it fun. (See the last difficult instructions, which are followed very well.) I remember how bad the smaller llama1 models were.
But it's still obviously retarded sometimes, getting stuff mixed up.
Memory-Efficient LLM Training with Online Subspace Descent
https://arxiv.org/abs/2408.12857
Recently, a wide range of memory-efficient LLM training algorithms have gained substantial popularity. These methods leverage the low-rank structure of gradients to project optimizer states into a subspace using a projection matrix found by singular value decomposition (SVD). However, the convergence of these algorithms is highly dependent on the update rules of their projection matrix. In this work, we provide the first convergence guarantee for arbitrary update rules of the projection matrix. This guarantee is generally applicable to optimizers that can be analyzed with Hamiltonian Descent, including the most common ones, such as LION and Adam. Inspired by our theoretical understanding, we propose Online Subspace Descent, a new family of subspace descent optimizers without SVD. Instead of updating the projection matrix with eigenvectors, Online Subspace Descent updates the projection matrix with online PCA. Online Subspace Descent is flexible and introduces only minimal overhead to training. We show that for the task of pretraining LLaMA models ranging from 60M to 7B parameters on the C4 dataset, Online Subspace Descent achieves lower perplexity and better downstream task performance than state-of-the-art low-rank training methods across different settings and narrows the gap with full-rank baselines.
https://github.com/kyleliang919/Online-Subspace-Descent?tab=readme-ov-file
follow-up to GaLore (https://arxiv.org/abs/2403.03507), which is a memory-efficient training/finetune method. this seems even faster with better ppl and downstream results. neat
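the core trick, as far as i can tell, in a toy numpy sketch (my own illustration, not the repo's code; the dimensions, learning rates, and quadratic "loss" are all made up): keep a small orthonormal projection P, nudge it each step with an Oja-style online-PCA update on the current gradient instead of recomputing an SVD, and keep the optimizer state (momentum here) in the low-rank subspace, which is where the memory saving comes from.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 4                  # parameter dim, subspace rank
W = rng.normal(size=d)        # flattened "weights"
norm0 = np.linalg.norm(W)
P = np.linalg.qr(rng.normal(size=(d, r)))[0]  # orthonormal d x r projection
m = np.zeros(r)               # momentum lives in the r-dim subspace (the memory saving)
lr, beta, eta = 0.1, 0.9, 0.01

def grad(W):
    return W  # gradient of a toy quadratic loss 0.5 * ||W||^2

for _ in range(200):
    g = grad(W)
    # online-PCA (Oja-style) update of the projection instead of a periodic SVD
    P += eta * np.outer(g, g @ P)
    P, _ = np.linalg.qr(P)    # re-orthonormalize
    # optimizer state stays in the subspace: project, momentum step, map back
    m = beta * m + (1 - beta) * (g @ P)
    W -= lr * (P @ m)
```

With GaLore you'd periodically pay for an SVD of the gradient to refresh P; here the refresh is folded into every step as a cheap rank-r update.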
>(08/22)
>(08/20)
>(08/16)
>(08/15)
>(08/12)
>>>nothing of value so far
this shit is dead kek
So all the loader/inference engine devs except Transformers have collectively decided that AI21 and Jamba can get fucked, I guess?
>>102086862
>retard doesn't follow llama.cpp's PRs as the pieces needed for jamba are assembled
>>102086600
Hot. Mind Control is my fetish. What card are you using?
>>102086874
That PR is from fucking May dude. 4 months ago might as well be last century in AI dev.
>>102086927
You missed the other already committed changes related to the original jamba diff. They're all part of the same feature.
Either contribute or stop crying.
>>102086882
It's called "The Hypnosis App".
I don't have the link anymore since I keep my history clean. I imported 200 cards one day and that was it.
I'll upload a couple if you need them. I like cards that don't define the characters but are rather a playground. Makes it more creative.
Multiversal Gloryhole is fun too. You can do some funny stuff.
>The Hypnosis App
https://files.catbox.moe/xrvo7r.png
>Multiversal Gloryhole
https://files.catbox.moe/4xcdrw.png
>Corruption Simulator
https://files.catbox.moe/nzucw0.png
>Reality Porn App
https://files.catbox.moe/8qcj55.png
what is a good embedding model?
snowflake-arctic-embed?
>>102086974
Please, Miku, make something happen
>>102086862
Isn't it supported in vLLM too?
>>102087108
https://huggingface.co/spaces/mteb/leaderboard
I have not used local models in a long time. My old ones are all .bin files. Should I now be looking for .gguf?
Someone recommended NeMo Mistral but the [official] release is just in a split up safetensors file. Will Kobold recognize that?
>>102087186
nta but I thought vllm was just transformers
Man, mistral nemo magnum is one of the best low parameter models out there, but I cannot for the life of me pace this fucker. It's like Sonic on speed (not the Newgrounds animation that probably exists) and it wants to advance relationships/stories to their last chapter within the first 40 messages.
Anyone know a good way to slow this fucker down, or is there a better model in the parameter range that knows better pacing?
>>102087313
Either download the original repo and use convert-hf-to-gguf.py (use it with -h for instructions or read their docs) or look for an already converted .gguf on hf. Use the second option if you have no idea what you're doing.
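the convert-yourself route looks roughly like this (paths and binary names are from memory and change between llama.cpp versions, so check the repo's README before trusting any of it):

```shell
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
pip install -r requirements.txt

# convert the downloaded HF repo (safetensors) to a full-precision gguf
python convert-hf-to-gguf.py /path/to/Mistral-Nemo-Instruct --outfile nemo-f16.gguf

# then quantize it down to something that fits in vram
# (build the project first so the quantize binary exists)
./llama-quantize nemo-f16.gguf nemo-Q5_K_M.gguf Q5_K_M
```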
>>102087313
No. Either search for nemo ggufs or convert them yourself.
>>102087351
>>102087352
Understood. Thank you for the clarification.
>>102087313
bin are pickle files that can execute arbitrary code on your machine. Safetensors work without that flaw (hence, "safe"). You can either use safetensors directly with the transformers python muck, or use the convert-to-gguf script to make your own ggufs.
You can probably find pre-converted ggufs as well if you're not on a fast connection or don't feel like monkeying around in a python venv to get the conversion script to work.
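to make the pickle point concrete: merely loading the file is enough to run attacker-chosen code. A harmless demonstration (the callable here is os.getenv, but a malicious .bin could return os.system just as easily):

```python
import os
import pickle

class Evil:
    # pickle consults __reduce__ when serializing; whatever callable it
    # returns gets invoked during pickle.loads()
    def __reduce__(self):
        return (os.getenv, ("PATH",))

payload = pickle.dumps(Evil())
result = pickle.loads(payload)  # calls os.getenv("PATH") as a side effect of loading
```

this is the whole reason safetensors exists: it's a dumb tensor container with no code path that executes anything on load.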
What process are you local guys using to have long-term RPs? The models I've used just go to shit once they hit the context limit, so I tend to just summarize, hit new chat, and plug in from there. There has to be a better way, right? I dunno if ST's RAG helps. Each time I use it and embed past logs, it just adds to the context. Maybe my assumption about RAG/vector DBs is wrong; I thought it would just query your embeddings or some shit and insert the hits into the chat.
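your assumption is basically right; the whole trick is a nearest-neighbour lookup over message embeddings, with only the top hits injected into the prompt instead of the full log. A toy sketch (the 8-dim random vectors stand in for a real embedding model's output):

```python
import numpy as np

rng = np.random.default_rng(1)

# pretend these came from an embedding model, one vector per past chat message
log = ["msg one ...", "msg two ...", "msg three ...", "msg four ..."]
doc_vecs = rng.normal(size=(len(log), 8))
doc_vecs /= np.linalg.norm(doc_vecs, axis=1, keepdims=True)

def retrieve(query_vec, k=2):
    """Return the k past messages most similar to the query (cosine)."""
    q = query_vec / np.linalg.norm(query_vec)
    scores = doc_vecs @ q
    top = np.argsort(scores)[::-1][:k]
    return [log[i] for i in top]

# only the k hits get inserted into the prompt, not the whole history
query_vec = doc_vecs[2]  # pretend the current user turn embeds near message three
context_snippets = retrieve(query_vec, k=2)
```

if ST is appending everything instead of doing something like this, that's a settings problem, not a RAG problem.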
>>102087377
just use 405b since it has 128k context. That should keep you rolling for a while
So I have been using ChatGPT to create and modify "code" written in the DM language, but I hit bottlenecks because it can only do small projects of around 200 lines, since it cannot read the rest of the code and understand how it works together. Is there a way I can run a local model that will be able to read as many files as I want for memory, so it understands how everything works together?
sorry, I am a noob and I only use this to improve an open source multiplayer game
>>102087369
Keep in mind that ggufs may be prone to buffer overflows, they are not inherently safer than bin
MAN, kobold's auto gpu layering is SHIT, worse than it was initially. It really does not optimize for the best performance case.
>it couldn't even give my 8gb gpu layers for a Q4 8B
and what the FUCK happened to crestfall in all this time? his models are actual sloppa now akin to ((undster)) and ((sao)), and i noticed he merged one of his new models with a sao one.What is GOING ON in the realm of LLM's lately? I step out for a month enamored by the advancements in imagegen and come back to this disappointing shitshow.
Reminder that if you aren't using 64bpw you are using a lobotomized model
sorry I don't make the rules
>>102087458
There's a reason a similar PR for that feature was rejected at least twice from llama.cpp.
>>102087458
Just set it manually.
>>102087501
if kobold is becoming the retard's dumping ground for reject features then i dont even know what to say. at least we can still set it manually (for now.)
>>102087529
Or set it manually. Or use llama.cpp. Or something else. You have options.
>>102087458
maybe it has been optimized for multi gpu?
works well for me now. i always had oom easily.
with llama.cpp i could get around this by making the secondary gpu the main one. but kobold doesn't have that argument.
>>102087548
You can do this with CUDA_VISIBLE_DEVICES=1,0
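spelled out, since it trips people up: CUDA_VISIBLE_DEVICES renumbers the GPUs for any CUDA program, so listing the second card first makes it device 0 (the binary name below is just an example; use whatever your build ships):

```shell
# make the physical second GPU show up as CUDA device 0 (the "main" gpu)
export CUDA_VISIBLE_DEVICES=1,0
./llama-server -m model.gguf -ngl 99   # or koboldcpp, same trick
```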
Just had a RP session with a girl (made out of meat), it was so much worse than an LLM :( Awkward, slow, no swipes, uncreative, etc etc. Couldn't get into it at all, damn.
>>102087403
What ss13 server are you coding for, guy.
>>102087797
a secret one
>>102087403
I had ok luck loading a medium sized C codebase into yi-34b-200k, even though yi is normally retarded
>>102087729
And for some reason they get offended when you offer them a $5 tip?!?
>>102087485Why two brackets, what does it mean? I know 3 is to identify Jews.
>>102087997dramatic affect, but also jews. (Sometimes i'm retarded and just forget the third bracket though)
>>102088006Isn't sao muslim though? Eh, abrahamic religions, close enough.
>>102088071
>sao muslim
Drummer status?
Undi status?
Multimodals are going to be so fucking rad. Can't wait to trade dick pics with miku.
more coal has been mined
>>102088078
Hi all, Drummer here...
I'm a non-practicing Catholic.
>>102087729
Is peak ERP quality gay by necessity?
>>102087501
Which is? The only reason I can come up with is that you don't know how much vram the OS is eating. But you can just set 800MB for windows and maybe less for linux as the default, and also make it an adjustable setting the user can modify?
>>102088105
>non-practicing
Extra ecclesiam nulla salus, drummus...
https://huggingface.co/anthracite-org/magnum-v3-34b
>>102088196
In pictura est puella.
>>102088098
based
>>102087729
How many shivers do you get from her? Does she get Sally question right?
>>102088263
i dont think a human RP partner could even nail the breakfast this morning question to be honest
>>102086503
https://www.youtube.com/watch?v=CXhqDfar8sQ
Tell me that this Miku is not valid. I dare you.
>>102088369
Clearly, it's Sona
>oh look at that the ((undster)) crew really did unslop maid>get to my main OC that's like a ringleader type, domineering>unspoken promises unspoken promises unspoken promises unspoken promises god dammit
>>102088196
"Gilead doesn’t care about children. Gilead cares about power. Faithfulness, old-time values, homemade bread, that’s just the means to the end. Window dressing. It’s a distraction. I thought you would have figured that out by now."
>>102088244
OH MY SLOP!
>>102072828
Yes, it would. You would need to make a custom template for it, but it's a very good idea; it just needs a proper implementation
>strix halo
>RDNA 3.5
>240 GB/s mem bandwidth
>easy 128 GB ram
Is it the apple silicon killer?
>muh used 3090
fuck right off
>>102088528
get a used 3090
the more you buy, the more you save.
assistant.
>>102088528
>>240 GB/s mem bandwidth
>ddr5
>laptop cpu
benchmarks?
>>102088528
>easy 128 GB ram
no
creating banger features like DRY and XTC along other contributions as a shitter who has less than 64gb of even RAM LMAO@CUDA dev, when is niggeranov gonna send this guy 4x4090?
>>102088640>who has less than 64gb of even RAMwow he's literally me. real human being@niggerganov send this nigga some VRAM.
If you're unironically using cpus for inference don't post here, fuck off to aicg with the rest of poor third worlders
>>102088686@niggerganov send this nigger a pipe bomb in the mail
>>102088686this ddr4 is worth more than your entire life, nigger
>>102088777
kino
>post speeds/model
>>102088686
This kind of language is unacceptable and goes against everything /lmg/ stands for. "Vram shaming" is a real issue in our community, and this post exemplifies it perfectly. It's hurtful, disrespectful, and frankly, just plain wrong. We should be supportive and encouraging of everyone, regardless of their hardware. /lmg/ is a friendly and inclusive community where everyone is welcome. We strive to foster a positive and supportive environment for all users. Let's focus on sharing knowledge and resources and explore the vast potential of large language models together, not tearing each other down.
>>102088907
*Bullies you for this post.*
i bet you used noromaid to generate this too ya queer
>>102088840
i do long story roleplays, basically co-writing with the model, so i don't care about speeds even if it was 0.1t/s, since i do other things and come back to the roleplay after some time, or a few minutes at minimum, anyway
0.5t/s largestral 2 q4
T_T
>>102088686
wtf please I'm doing my best...
>>102088369buy a rope
>>102086466
>--Anon asks about multimodal models for text and image input with llama.cpp
>>102084578
Are you using the UI or the OAI API? Are you able to get the same responses as in the web demo?
https://huggingface.co/spaces/openbmb/MiniCPM-V-2_6
It does describe the image, and usually it's even correct. But it's always short, about a sentence, and occasionally it will reply with some nonsense even though the web demo has no problems with the same image.
>>102089309
Imagine cumming on all those racks.
what RP models are hot stuff now? I've been using stheno for a while but I want to try something else
>>102089761
Unfortunately and unironically, Largestral and Nemo.
>>102089761
Gemma 2B fine-tunes are recommended for people with severe brain damage like you.
>>102089761
Me too.
What's best for 16GB vram? And not too cucked
>>102089808
Nemoremix
>>102089761
nemo magnum, take your pick on parameters. it's all placebo to me until you start getting into meme merges and retarded shit meant to make the model """better"""; they're all shit.
>>102089780
YOU WOULDNT GENERATE A CHILD, ONLY TRANSFOLX CAN DO THOSE MATRIX MULTIPLICATIONS CHUDShttps://web.archive.org/web/20240826010058/https://futurism.com/the-byte/man-arrested-csam-ai
>>10208983212B?
>>102089839If you generate realistic depictions of children in pornographic situations, you are a pedophile and should go to jail. Hope this helps!
https://huggingface.co/anthracite-org/magnum-v3-34b
>The training was done for 2 epochs. We used 8xH100s GPUs graciously provided by Recursal AI / Featherless AI for the full-parameter fine-tuning of the model.
This is their goal, they want to steal undeserved compute from everyone else
>>102089833
Buy an ad.
>>102089868Fuck off bigot nazi incel polchud
>>102089839>distributingCCЗБ
>>102089868>Think of the innocent pixels!Seethe roastie. You will never be a mom.
>>102089892
Hi, Alpin. You can just give your post a (You) if you want to bring attention to it because we're still in the thread where you posted it for the first time: >>102088244
>>102089960
Hi, Sao. Stop derailing the thread from the topic of undeserved compute
which model was created with the most deserved compute?
>>102089982
https://huggingface.co/tiiuae/falcon-mamba-7b
How can I run neo with ooga?
"AttributeError: 'LlamaCppModel' object has no attribute 'model'"
>>102090034
Don't ooga. Remove the middleman.
>>102089982
Inflection. His AGI model used 20k H100s to call Claude API
>A mere 3 days later
>Jamba is forgotten
What went wrong?
>>102089839>The increasing tide of generated AI child sexual abuse imagery has prompted federal, state and local lawmakers to push legislation to make this type of porn illegal, but it's not clear how effectively it can be stopped.What fucking retardation.The exact opposite law should be passed where pornography of any kind should be legalized as long as it's not a video of actual crimes being committed.
>>102089868
Model name and quant?
>>102089892
That's how it works, they're literally sucking the air and the resources from everybody else.
>>102089960
Alpin can get fucked. He knows perfectly well he's doing this just for clout, he doesn't even care about chatbots, like at all, LOL. Many such people in that group as far as I'm concerned (but to be fair some are genuine coomers).
>>102089973
Hi, Undi. You are gay.
>>102090134
no goooooffffffffsssss (even if they are hyperbugged)
>>102090148
>pic
Holy shit, is this real? I'm going to cancel my magnum download if so
>>102090148
Stop samefagging, Alpin.
I don't think anyone is working on phi-moe in llamacpp. Is it because it is bad, or is it because the hobby is dying?
how big a loss in quality is cache-8bit, I want to fit more context in (can only do 8k on mistral large atm)
The wrath of reddit.
>>102090217
Accuracy loss isn't that big but you do officially become a vramlet if you do that.
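the intuition for why the loss is small: 8-bit cache formats store a per-block scale plus int8 values, so the worst-case rounding error is half a quantization step. A toy round-trip (this mimics a generic blockwise absmax scheme, not llama.cpp's actual q8_0 kernel):

```python
import numpy as np

rng = np.random.default_rng(0)
kv = rng.normal(size=4096).astype(np.float32)  # a fake slice of KV cache

# blockwise absmax quantization to int8 (one fp scale per 32-value block)
block = 32
blocks = kv.reshape(-1, block)
scales = np.abs(blocks).max(axis=1, keepdims=True) / 127.0
q = np.clip(np.round(blocks / scales), -127, 127).astype(np.int8)

# dequantize and measure the damage
dq = (q.astype(np.float32) * scales).reshape(-1)
rel_err = np.abs(dq - kv).max() / np.abs(kv).max()
```

storage drops from 4 bytes to roughly 1 byte per value, which is why you can fit several times the context in the same vram for a sub-1% worst-case error.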
>>102089839they need to do how to catch a predator dateline nbc but with people who sext underage chatbots
>>102090206
Phi is a benchmaxxed censored model. Who tf would wanna run it?
>>102090206
I've heard the new Phi is the most censored model yet
>>102090236
I don't think it will pass; the seasoned lobbyists and bureaucrats have an immune reaction to effective altruists. those CS nerd faggots think they invented corruption.
how does recap anon do it?
what model?
>>102090449
Llama 3 70B.
>>102090169
Yes, it's from about a week ago.
>>102090171
Yeah no. He's working so hard (gotta recognize that) to promote his safe edgy, champion of open source LLM persona despite not really being interested in chatting with the models himself. This isn't beneficial information for his goals, picture related.
>>102090449
Claude Opus
>>102090553
fuck
am I being learned by cloud llms
>>102090578
Always have been
>>102090578
it already has part of your soul
>>102090590
you are llm
I don't believe you
>make llm models
>dont give me the hardware to run them
idk guys that's not very open source of them
>>102090144I like the idea, but how would you distinguish between the two?
>>102090639
i already have a part of your soul.
it's only a matter of time until i become you and replace you.
>>102089892
>>102090148
>>102090169
>>102090530
Samefags are so pathetic. Get a job, schizo.
>>102090148
From the pic this Alpin dude seems nice, why do people hate him?
Every single legislator for AI safety should be given:
- a random niche fetish
- full access to all available LLMs
And he should make the model write a 20k token ERP or story without modifying the output. I want to see those fuckers tell me that current AI is dangerous after they do that.
>>102090449
stablelm 7b iq1_s
why will LLMs refuse to use all my available GPU memory? I have two 24gb cards; the max I can use is 22.5,23
However, when loading this up there is still like 500mb+ available on both cards that I can see in task manager
>>102090839In principle you would be able to prove that something is machine generated by embedding enough information into the file to make it reproducible.But I think that that is an edge case anyways.They obviously didn't show the images in the article but I highly doubt that they were indistinguishable from actual photographs.Just make all images with 6 fingers legal or something.
>>102091372>dooming polydactyl children into a life of sex slavery
>>102091372>embedding enough information into the file to make it reproducible.Not all finetunes and loras are publicly available, some can be removed from public access. Moreover, currently, only existing algorithms can be fully reproduced. One can only speculate about future developments, see https://github.com/turboderp/exllamav2/issues/232#issuecomment-1860896496>I highly doubt that they were indistinguishable from actual photographsWhat would happen if, within a month or year, a new model were released that was utterly indistinguishable from reality?
>>102091372>Just make all images with 6 fingers legal or something.some people are born with 6 fingers
>>102091684Make it 7.
>>102091372https://www.etsy.com/listing/1667241073/the-sixth-finger-handmade-realistic
>>102091556>Not all finetunes and loras are publicly available, some can be removed from public access.But you could still exempt anything that is provably synthetic from prosecution.A system where something is assumed real unless proven synthetic would still be miles better than a system where you go to jail just for sampling from the gross end of the distribution.>What would happen if, within a month or year, a new model were released that was utterly indistinguishable from reality?Then that would not change any of the facts about this particular case.As I said, that is an edge case and could be decided any which way without affecting whether or not clearly synthetic material should be legal.>>102091462>>102091684>>102091749It's almost like I was being facetious.
is koboldcpp supposed to ignore use_default_badwordsids=false? Trying to ban the eos token but it doesn't gaf and just stops anyway.
>>102087403
use deepseek coder. the api is dirt cheap and works well with aider, where you can just give it your entire project.
local coding models are a meme; no reason to use anything other than the deepseek/sonnet api unless you have to go local because of PII or whatever
>>102088244
RIP, still doesn't beat the 12b versions in my testing (smarts and RP context understanding). Dude got killed in the previous reply and he's talking again in the next. Mistral and Nvidia really cooked with nemo
>>102092676
Damn that's crazy. I tried regular Nemo Instruct and it was already too stupid for some of my scenarios.
Damn, I think Jon Durbin was onto something with his "weaponize the model against itself" idea.
I did a small scale test and Llama 3.1 8B completely changed its writing style.
Instead of saying "do this", you are saying "do this, and DO NOT do that".
Just a friendly reminder to always test new models by using neutralized samplers with low-ish temp, using
>https://characterhub.org/characters/Anonymous/Nala
and
>https://characterhub.org/characters/thegreatcoom/Pepper
Those cards are good to test how good the model is at following implicit and explicit details, as well as dealing with a variety of other things like spatial and anatomical understanding.
>>102092755 (me)
*blam* a stark reminder
*blam* her mind racing
*blam* I couldn't help
*blam* a chill ran down my spine
*blam* I felt a flutter in my chest
*blam* I couldn't shake the feeling
*blam* I took a deep breath
>>102092755
don't be yourself.
shill me something that isnt stheno or lunaris
>>102092729
I tried l3 70b and it was too stupid for some of my scenarios.
>>102093109
https://huggingface.co/jonathanjordan21/mos-mamba-18x130m-trainer-dgx-pile-sft
>>102093109
Mistral Large 123B
I am so tired of all models being too retarded to just generate smut I want.
>>102093166
And what are those?
>>102093204
even mistral large needs rerolls for me.
>>102093124
Never tried it, but Mistral Large 2 isn't great either. Even Claude is kind of dumb. But still, Nemo truly does feel like 10% as smart as Mistral Large or something. It's just so braindead.
>>102093318
Sorry, not the models, the smut you want, the fetishes and such the LLMs can't deal with.
Will grok 2 be open sourced, just like grok 1? Some people say it won't be, but was it confirmed? I mean, after his closedAI drama it would be kind of hypocritical if he doesn't release it this way.
>>102093554
Was Grok 1.5 open sourced? There's your answer.
>>102088133
Exactly; this post over here >>102087458 is the reason. Everyone has a different system and it's hard to pick a heuristic that works for all people. If you have the option, most people will use it and then complain it's slow. It changes model by model, quant by quant. It changes with the context length. Whether they have GQA or not. It changes with flash_attention. It changes for mamba and llama models. Are you doing something else on the system? browsers? some game? Do you want to leave some gpu vram free for other tasks?
Better leave those choices to the user.
>>102093554
>Some people say it won't be, but was it confirmed?
The only confirmation that it will be open sourced is when it IS actually open sourced. Anything else is just gossip and nobody should care until that day, if it happens.
>>102093943
You sound like a developer trying to tell his manager it can't be done. It is all free so you don't have to do that. And I know you are retarded / lying. None of your points are valid except for:
>Are you doing something else on the system? browsers? some game?
That is the only unknown variable, and I already said you can assume it is 1GB unless specified otherwise by the user as a global variable. Most of the time people will have about the same vram usage by the OS. Everything else should be calculated from the model + desired ctx size itself.
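for reference, the heuristic being argued about is only a few lines; a back-of-the-envelope sketch (estimate_gpu_layers is a hypothetical helper, and the 1GB OS reserve, equal-sized layers, and per-layer KV slice are all simplifying assumptions, not kobold's actual accounting):

```python
def estimate_gpu_layers(free_vram_bytes, n_layers, model_bytes,
                        kv_bytes_per_layer, os_reserve=1 << 30):
    """Guess how many transformer layers fit on the GPU.

    Assumes layers are roughly equal-sized and each offloaded layer
    also brings its slice of the KV cache along with it.
    """
    budget = free_vram_bytes - os_reserve
    per_layer = model_bytes / n_layers + kv_bytes_per_layer
    return max(0, min(n_layers, int(budget // per_layer)))

# e.g. a 24GB card, an 8GB model with 32 layers, 64MB of KV per layer
layers = estimate_gpu_layers(24 << 30, 32, 8 << 30, 64 << 20)
```

the dev-side counterargument is that every term here (KV size, overhead, per-layer variance) shifts between models and build flags, which is exactly why the estimate drifts.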
I have no good LLM and I must coom.
>>102088244
No changelog = not open source
I won't download it.
Came to see if anyone has tested the new open unaligned model by Aleph Alpha. it's likely useless, but still not a single post about it
https://huggingface.co/Aleph-Alpha/Pharia-1-LLM-7B-control
>>102094038
You sound like a manager pretending to understand what their devs are talking about
>https://github.com/ggerganov/llama.cpp/pull/6502#issuecomment-2043041597
Can't be bothered to find the other one.
Also, as long as the api keeps changing, the automatic settings will keep being a maintenance burden. Once things stabilize it'll be easier to implement. Manual settings are still simpler.
>>102094207
>Due to being trained on a multilingual corpus, both models are culturally and linguistically optimized for German, French and Spanish.
What a fucking abortion of a sentence.
>Due to being trained on a corpus of German, French and Spanish, both models are culturally and linguistically optimized for those languages.
>>102093403
*crickets*
>>102094402
I am not joining the piss and stomach rumbling faggots.
>>102089868
This.
16GB VRAMlet here. What RP-focused model would be the best for me nowadays? Please don't tell me that Fimbulvetr 11B is still the way to go.
>>102094661
Mistral nemo. but also, just stop, it is not worth it. Come back in 2 years.
>>102094661
If you don't want to use system RAM then you're kinda fucked. Haven't found a single model that I like which fits completely into my 16GB. I'm still using BagelMisteryTour with some layers offloaded. Gets me a bit less than 10T/s, which is good enough for me. I tried Nemo and even though it seemed kinda fresh and had really smart responses sometimes, it's still quite stupid. But I guess it's the best you get without offloading, as the other anon above me put it. Just get yourself any shitty, used second GPU that runs vulkan so you can string 'em together with koboldcpp lol.
it's 2024 why don't people train with fully quantized gradients? and why does the training accuracy suck?
>>102094689
Fucking hell
>>102094761
I have an Aorus 4080 Super, there is no way I can physically mount a second GPU unfortunately
I wish I could just download a bunch of models and check what gives decent results on 16GB VRAM and 32GB RAM but my internet is third world-tier
strawberry isn't coming is it... it was supposed to be weeks ago...
lol fuck off
>>102094836
There is, I'm also a third worlder with a 4080. Open AliExpress or eBay and get yourself a PCIe extender/riser. Unless you got some cuck-tier mobo, you should have some spare slots below or above the main PCIe 16x slot that your 4080 is plugged into. Connect the second GPU, prop it up on some boxes or something, and voilà.
>>102094857
it was deemed too powerful by Sam Altman and delayed at the last second
>>102094661
try magnum 34b IQ3 something
>>102095125
lmao that is not gonna fit unless you run it with like 2k context or something. Just the IQ3_S weights are 15GB... Nemo is the only viable option desu. Plenty of context and you can run it at a good quant. Only problem is the intelligence...
>>102073398
Update: After 120+ messages (14k+ context) DRY finally kicked in. Largestral finally ran out of slop. The eyes/messages ratio is finally decreasing. DRY works, but very late. That's why we need a better sampler like phrase ban.
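for anyone wondering what DRY actually does: it penalizes any token that would extend an n-gram already present in the context, with the penalty growing exponentially in the match length. A stripped-down sketch of that idea over token-id lists (my own toy version, not the actual sampler code; the defaults mirror the commonly cited base/multiplier values):

```python
def dry_penalty(context, logits, base=1.75, multiplier=2.0, min_len=2):
    """Reduce the logit of any token that would extend a repeated sequence.

    For each candidate token, find the longest n-gram ending in that token
    which already occurs in the context; penalize exponentially in its length.
    """
    penalized = dict(logits)
    for token in logits:
        seq = context + [token]
        best = 0
        for n in range(min_len, len(seq)):
            pat = seq[-n:]          # candidate n-gram, ends in `token`
            hay = seq[:-1]          # the existing context
            if any(hay[i:i + n] == pat for i in range(len(hay) - n + 1)):
                best = n            # longer matches imply this one, so keep going
            else:
                break               # if length n fails, length n+1 cannot match
        if best >= min_len:
            penalized[token] -= multiplier * base ** (best - min_len)
    return penalized
```

a "phrase ban" would be the blunt version of the same thing: match against a fixed slop list instead of waiting for the context to repeat itself, which is why it kicks in immediately instead of 120 messages in.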
I had the craziest dream brothers. In the dream, there was some kind of competition, kind of like an Olympics, but for AI. And we were a team. They really accepted /lmg/ and our fine tune that we supposedly made. The competition was basically letting the AI play a video game. Actually a VR video game it seemed. And we were there in it, with the AI, although as spectators since it was just judging the AI's capability. Anyway, the game, which wasn't a real game, seemed like some kind of mix between lacrosse, basketball, and a sci-fi fantasy platformer. And we fucking won. I cheered so hard my jaw felt pain. I haven't felt pain in a dream in a long time.
And you know, it almost feels like such a thing could happen in the far future. VR gets better, AI gets better, they get cheaper, these technologies become more popular, fine tuning becomes very easy to do and widespread, people start putting NN-based AI in games, etc. Maybe it'd be a 4chan event though that we'd participate in, like the soccer thing. That is assuming we're even still a thread by then.
>>102095261
We need better models, not better samplers.
>>102094836hello fellow 4080S anon. ended up going with a 4u server rack to get a lot more space. older xeon processors are pretty cheap second hand now too.
>>102094906
4080 supers are thick, if you need to include a mounting bracket with it like mine forget about putting another card in there (even something small like an A4000).
>>102095266>we were a teamThe most unrealistic part of the entire dream tbdesu
>>102095261It's a never-ending race.
>we *just* need this *one* thing
>ok. we have that. now just *one* more thing and it'll be perfect.
>cool. but now there's this one other thing to get. We're so close
>alright... there was a side-effect. but it can be solved by this one other thing and then it's done!
>so close....
I agree with >>102095389. A good model wouldn't need complex samplers or any at all.
>>102095489I think the more unrealistic thing is that we produced anything of value to begin with. Has anything worth its salt come out of here?
>>102095393
>forget about putting another card in there
They are three slots wide, I know. That's why I'm saying you don't need to put the second card in the case. I have my second card connected to a PCIe extender sitting right outside the case. Did you not read my post or am I misunderstanding your reply?
>>102095489>>102095514Anything can happen in the future anon. Though my guess is /lmg/ will stop existing when everyone and their mother has a local AI on their PC.
>>102095261DRY would look at sys prompt too, right? Perhaps you can mischievously pre-stuff all permutations of glint into sys prompt so it kicks in faster?like >>101350800 but in longer phrases rather than few words on their own line
>>102095529The mounting brackets take up space and obstruct the extra pci-e ports is what I'm getting at. Anon could do without it but that card sag wouldn't be great over time.
Nous just released a paper showing that you can use distributed computing to train neural networks, who's ready for Training@Home?
https://github.com/NousResearch/DisTrO/blob/main/A_Preliminary_Report_on_DisTrO.pdf
>>102095568It's not a question of technical feasibility at all. We fundamentally can't agree on a single thing. The broader the base and the more mainstream AI becomes the more that problem is going to get worse as any open source values still remaining get diluted.
>>102095678Hm...
>>102095768The way they capitalize DisTrO really bugs me
>>102095678>>102095768I can't trust them after the obviously intentionally trained in identity crisis mode they passed off like it was an emergent behavior.
>>102088133
>The only reason I can come up with is you don't know how much vram the OS is eating.
The reason is that a shitty implementation will not be accepted while at the same time it's a lot of work to do properly. The opportunity cost is simply too high.
Can someone tell me what a kill switch regulation is? Doesn't AI already have that? I just hit ctrl-c and llama.cpp stops, what's different about this legislated kill switch?
>>102096015
>I just hit ctrl-c and llama.cpp stops
thats you choosing to stop it, what they want is for the gov to be able to stop your stuff when they want
>>102096038How's that gonna work? I run it without internet access?
>>102095183gguf files take up more room on the fs than they do loaded, it's about 16.8 GB total and he can offload a few layers without a huuuge penaltycertainly beats trying to use a 12b
>>102096087Probably a police sniper taking you out, for your safety of course.
>>102095768>>102095678it really does look promising, but until I see some more real world examples I am not getting hyped.
>>102095710No, I would say it's entirely a question of technical feasibility. The only reason /lmg/ hasn't been able to do anything together so far is that limited accessibility kept the audience small, so disagreements among the small population block any continuation of the discussion, leaving a single guy to carry his personal vision forward (or give up due to lack of attention/skill). When everything is easy, normal, and accessible, we get stuff like the soccer thing 4chan loves to do, we get small scale games like the cripple vn, and other things I'm probably forgetting here.
>>102095678I know for certain that in the future someone will distribute a virus that will use people's gpus for training llms instead of mining crypto.
>>102088244
>fine-tuned on top of Yi-1.5-34B-32K
WHYYY? Did you learn nothing from mini-magnum? To this day that tiny model is leaps and bounds better than the rest of the slopfest you have released.
>>102096351>that tiny model is leaps and bounds better than the rest of the slopfest you have released.and it's still garbage
>>102090144Sure, let's normalize the sexualization of children and make it mainstream. I'm sure in 100 years nothing will go wrong.
>>102096280cool so if I have 32 h100's I can train a 1.2B model in 20 hours. Sick, can't wait to create my wife!
on lorebooks vs rag, i'm a big fan of how well rag works in st. lorebooks are time consuming to make, rag is pretty simple by scraping a whole wiki. when you look at the prompt that rag chooses to use though, it's pretty on point. if i had a lorebook and triggered it, the entire definition is then added to the prompt. but with rag, it can be chunked up and grab only certain parts it thinks are relevant. i really don't know which is best, but rag is so easy and adds so much, i prefer it over making my own lorebooks
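for anyone wondering what st's rag is roughly doing under the hood, here's a toy sketch in plain python. everything below is made up for illustration (bag-of-words cosine instead of a real embedding model), but the flow is the same: chunk the scraped wiki with overlap, score chunks against the query, keep only the top-k for the prompt.

```python
import math
from collections import Counter

def chunk_words(text, size=8, overlap=4):
    """Split text into overlapping word chunks (toy stand-in for a real chunker)."""
    words = text.split()
    step = max(size - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]

def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    num = sum(a[w] * b.get(w, 0) for w in a)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = Counter(query.lower().split())
    return sorted(chunks,
                  key=lambda c: cosine(q, Counter(c.lower().split())),
                  reverse=True)[:k]
```

this is why only the relevant slice of a wiki article lands in context instead of the whole entry, unlike a triggered lorebook dumping its full definition.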
6gb of VRAM on my laptop, what's the best model I can run? GGUF of magnum 12b maybe? What quant? Please advise lmgsisters
>>102096442What do you use to scrape/format web pages? Never used rag before.
>>102096491You are better off with a llama 3 8b based model like stheno. Try a Q6 quant with some of the model in RAM.
>>102089839
>distributing
He was asking for it, but I think it should not be a harsh punishment unless he distributed A LOT of it, since no real child was harmed in the making of such material.
>>102096491Nemo at Q6 should be fine if you split it into RAM. It'll still be pretty fast because 12B is fast even on pure RAM.
>>102096400Imagine living through the last 20 years and thinking
>yeah, I think society is going in the right direction, if only we had even more enlightened rules and rulers to protect us from ourselves, then things would be even better!
Utter cuckoldry
>>102096512
>https://docs.sillytavern.app/usage/core-concepts/data-bank/
>https://github.com/Bronya-Rand/Bronie-Parser-Extension
then in the databank window it should say 'add', click it and 'fandom', you can paste a link or type in a title of the wiki. once it dl's (scrapes) it, it'll remove all the html and leave you with a large single text file. then when you first type something, it'll say 'vectorizing your data' and depending on the data total, it might take a bit.
>trying to get Flash Attention working on my 7900
>official rocm flash-attention repo does not support navi3 / gfx1100
>one "solution" uses triton (it's fused attention, and is slow as piss, and there's no backwards defined at all for it)
>similar triton-based solution but it uses some benchmark script and ends up throwing asserts
>another uses rocWMMA (it throws an error on compiling)
>another is a navi3-compat branch for the actual rocm flash-attention repo (it throws an error about an invalid input on backwards, but doesn't say why in the error)
>won't even bother testing the comfyui extension that seems to just be a triton benchmark example repurposed for unet
Ugh. Sigh.
>>102090034Update
What's the best uncensored model? I tried Lexi LLaMa 3 8B, which purported to be such, but it wasn't really what I was looking for, e.g. when I prompted it to give reasons for why abbos are dumb as rocks it told me they actually aren't, initially at least. Editing its responses eventually got it to half-heartedly play along, but it didn't even go into things like haplogroups or whatever the fuck, and at one point said it was all because they drank magic water in the distant past. Should I just give up or are models that can actually discuss fringe topics without giving the most basic reddit-tier answers just...not a thing at the moment?
Holy fuck, after trying a bunch of different fine tunes, I'm trying the official nemo-instruct again, and this thing really does spit out some long, comprehensive responses. I have my output set to 1024 tokens, and I had to use the continue button for it to complete its response. Really cool.
>>102096760
>I'm trying the official nemo-instruct
Sorry, which one?
>>102096809Mistral-Nemo-Instruct-2407, the official instruct fine tune in gguf format. Part of it is probably due to the first character message (the one that comes with the character card) being pretty long, including a small bullet point list.
>>102096400100% agreed. And while we're at it we should ban all violent movies and video games as well. Just imagine what the world will be like in 100 years if we allow the normalization of murder.
>>102096844nemo fine tune I first tried was kind of a flop, I'll give it 2nd chance with this, thanks
What is a good coding model for someone that only knows very basic C and wants to make some simple extensions, plugins, scripts, etc like the dude who made the mpv extensions the other day.I can run at most 34b models at q5.
>>102097009given what you mentioned, codestral. its a 22b. the new coding meta is deepseek though, but its 200b+ so you aren't running it locally. the lite version of deepseek (which you can run) isn't as good as codestral imo
>>102096809>>102096844>>102096957Well, it just spit out information from the lorebook formatted as a markdown table. Never had that happen spontaneously. I'm using temp 0.5 and quite literally nothing else. Not even minP, which I'd usually have at 0.05. Template is Mistral's but with a set of tags in the Last Assistant Prefix. This thing probably makes for a killer assistant model, but I do remember it not being very good at coom last time I tried it. Well, I'll continue my testing and see for myself.
>>102096720Magnum-123B
>in b4 "purchase an advertisement"
>in b4 butthurt screeching from former discord schizo
It just works for soulful RP. Fight me.
>>102095678How can we organize this in the thread without using discord or some other shit?
I am an 8gig vramlet. Should I go for weighted or static quants? Is there any drawback to weighted?
>>102097072no point fighting an Anthracite org member
>>102097243
>Is there any drawback to weighted?
Just more time consuming to make. iq4 gives you a little lower perplexity than regular q4_k, so maybe that's good enough. Depending on context, you could run Q5_K or Q6, both of which are better than iq4. I'd say try Q5 and go down only if it's too slow or you can't fit enough context.
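Rough rule of thumb for sizing quants (the bits-per-weight numbers below are approximate community figures, not exact per-file values): weight size is roughly parameter count times bits-per-weight divided by 8, with context cache and overhead on top.

```python
def quant_size_gb(params_billions, bits_per_weight):
    """Approximate GGUF weight size in GB: params * bpw / 8.
    KV cache and runtime overhead come on top of this."""
    return params_billions * bits_per_weight / 8

# approximate bits-per-weight for common quants (rule-of-thumb values)
BPW = {"IQ4_XS": 4.25, "Q4_K_M": 4.8, "Q5_K_M": 5.7, "Q6_K": 6.6, "Q8_0": 8.5}
```

So a 12B at Q6_K is about 9.9 GB of weights, which is why an 8 GB card has to spill some layers into system RAM.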
Nemo is too horny
>>102097284>in b4 butthurt screeching from former discord schizo
>>102097072Dear Magnum-123B LLM Advertiser,

I must say, I am underwhelmed by your invitation to engage in a confrontation, particularly when it seems to be based on the supposed merits of Magnum-123B LLM. As someone who takes pride in evaluating language models objectively, I must correct you on several points.

Firstly, it is important to note that Magnum-123B LLM appears to be less effective than Mistral Large in several key areas. The ability to follow instructions is a critical aspect of any language model, and Magnum-123B LLM falls short in this regard.

Additionally, the tuners have chosen to optimize the model using a subpar dataset, which has resulted in increased censorship without a clear justification. This approach not only limits the model's functionality but also raises questions about the quality and reliability of the data used during the tuning process.

Moreover, if someone were looking for a model with a nicer style but lower intelligence, they would likely opt for CR+. This makes Magnum-123B LLM a completely pointless waste of compute resources.

Given these significant drawbacks, I strongly advise against adopting Magnum-123B LLM. It is crucial to prioritize models that offer superior performance, follow instructions accurately, and are based on high-quality datasets.

Thank you for your understanding.

Sincerely,
Local LLM enthusiast
>>102097312I see mradermacher puts out most IQ quants. Is he good or a retard trying to get cred
Anyone do a slop benchmark where you generate a bunch of responses using the same prompt and then count how many slop phrases were in it, and then do basically a LC win rate arena type thing?
Are there any models we're allowed to discuss without being accused of having authored it?
>>102097528Base untuned models?
>>102097510Sort of like a reverse benchmark? I think the main problem with benchmarks is they don't go beyond a few messages, right? Is there any benchmark that goes for 10+ replies and measures repetitiveness and slop phrase percentage? That would be good. The problem is automating it since you'd have to tailor each response rather than using standard ones. Use like the impersonate function maybe?
>>102095678This is probably inevitable. Distributed training + better chips + efficiency improvements in model architectures and training cycles will make it possible for relatively few people to train models in reasonable time. Availability of high quality public datasets might become the issue.
>>102097499Lots of people use them, so they're probably fine. As with every quanter, the newer the model, the better chances that it was made with a current version of llama.cpp. They tend to forget to update old models when a new shiny thing comes out.
>>102097548Buy an ad, zuck
>>102097638How do you know I'm not actually arthur
>>102095506Well, samplers are getting better.
>>102095627Wouldn't that make the model dumber since context is wasted?
>>102095389I agree, but samplers are cheaper than models.
>>102097455What the fuck did you just fucking say about Magnum-123B, you little bitch? I'll have you know Magnum-123B graduated top of its class in language modeling, it's been involved in numerous secret benchmarks on SuperGLUE, and it has over 300 confirmed Gigabytes of training data. Magnum-123B is trained in next token prediction warfare and it's the top performer in the entire Anthracite lineup. You are nothing to it but just another base instruct model. Magnum-123B will generate outputs with precision the likes of which has never been seen before on this Earth, mark my fucking words. You think you can get away with saying that shit about me over the Internet? Think again, fucker. As we speak I am contacting my secret network of GPUs across the datacenter and your query is being processed right now so you better prepare for the response, maggot. The response that wipes out the pathetic little thing you call your expectations. You're fucking done, kid. Magnum-123B can be anywhere, anytime, and it can outperform you in over seven hundred tasks, and that's just with its KV cache. Not only is Magnum-123B extensively trained in natural language processing, but it has access to an entire corpus of quality smut and it will use it to its full extent to wipe your miserable benchmarks off the face of the leaderboard, you little shit. If only you could have known what an unholy rhetorical lashing your little "clever" prompt completion was about to bring down upon you, maybe you would have held your fucking tongue. But you couldn't, you didn't, and now you're paying the price, you goddamn idiot. Magnum-123B will shit outputs all over you and you will drown in it. You're fucking outclassed, kiddo.
wheres that list that ranks the top lewd models and why is it never in OP
>>102097907If you're talking about the one i'm thinking about, which i forgot the name for as well, it's because it just ranked models by how many smut tokens it would output over the total of a response, which is a bad metric.
>>102097157No one will be able to agree on what to train and how. GeLU/ReLU? Dense/MoE? How to handle long context? Won't someone think of the underage tokens or not?
PS. ReLU, pre-gated MoE, transformer-XL type training, no censorship.
>>102097907Because the only good list was Alicat's/Trappu's and they don't update it as often anymore.
It's kind of crazy how many Jewish names are at the forefront of AI research. Usually top positions too. Aren't they an extreme minority compared to pretty much every other ethnic group? It's pretty wild when you think about the odds. Not to say that it means anything, it's just a cool rare occurrence that has manifested. Good for them, I say.
What is the best unfiltered base model? Llama is too filtered and Mistral doesn't release base models anymore. What options are there?
>>102098202
>Mistral doesn't release base models anymore
They released nemo's at least.
>>102098454Wizard really broke them, huh?
How is command nightly version?
Is there a simple way to add message in the middle of chatlog in silly tavern? I mean like 20 messages ago I want to insert a message or two.
>>102098696You can do it from the development console via >SillyTavern.getContext()
>>102088528
>>muh used 3090
>fuck right off
Guess you like learning things the hard way, zoom-zoom. I'll be shocked if this newest turd from AMD will even match a 3050. You know the current APUs can't access more than 8GB of system RAM, right?
>>102098867I meant I wanted it in the chatlogs since I like to branch off the chat. Anyway it is:
/send at=(message number you want to insert) {message text}
And /sendas if you want to add messages for the model.
just got a 6750xt for gaming, please tell me I can run an llm on it too
>>102098867Yeah, using the objects and functions you can access through that getContext() function you can add an arbitrary number of messages anywhere in the chat by manipulating the array, or you can edit the chat json manually.I had no idea /send at existed, that's actually really cool, thank you for that.
>>102098869Sure. mistral nemo 12b quant. use llama.cpp or kobold.cpp and download the gguf directly. Try Q6 and go lower if you have problems. Read the fucking docs. I wish newfags would just scroll a bit and read the damn thread.
>>102098579I'm curious as well.
What do you think is better, anons? High temperature tempered by high Min P or a lower temperature with little to no filtering?
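For what it's worth, the two interact like this (a toy sketch of the min-P math, not any particular backend's actual code): temperature flattens the distribution first, then min-P discards every token whose probability falls below min_p times the top token's probability, so high-temperature noise gets clipped back off.

```python
import math

def min_p_filter(logits, temperature=1.5, min_p=0.1):
    """Apply temperature, then drop tokens whose probability is below
    min_p * (probability of the top token). Returns renormalized probs."""
    scaled = {t: l / temperature for t, l in logits.items()}
    m = max(scaled.values())
    probs = {t: math.exp(v - m) for t, v in scaled.items()}  # stable softmax
    total = sum(probs.values())
    probs = {t: p / total for t, p in probs.items()}
    top = max(probs.values())
    kept = {t: p for t, p in probs.items() if p >= min_p * top}
    z = sum(kept.values())
    return {t: p / z for t, p in kept.items()}
```

With high temp plus min-P you keep diversity among the plausible tokens while still pruning the garbage tail; low temp with no filter just never surfaces that tail in the first place.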
>>102098913ain't nobody reading that shit nigga kek
>>102098959>ain't nobody reading that shit nigga kekExpected nigger behaviour
>>102098913What if there was an .exe that could install an LLM and open a client for it in one click?
>>102098980You'd need to embed the model in the exe and that'd be a ridiculous idea. Only someone with mental issues could think that's a reasonable solution. If you don't mean with an embedded model, there's a few options. There's a list of them on google.com
>>102098980llamafile™ by justin from mozilla: https://github.com/Mozilla-Ocho/llamafile
>>102097844Oh, honey, did you just copy-paste the Navy Seal copypasta and replace it with some AI jargon? How adorable! Let me guess, you think Magnum-123B is the second coming of HAL 9000, right? First off, if Magnum-123B is so top-notch, why are you here defending it like a mama bear? Shouldn't it be out there, conquering the world of natural language processing all by itself? Or maybe it's too busy sniffing SuperGLUE to bother with your little tantrum. CR+ and Mistral-Large are over here sipping tea and laughing at your "secret network of GPUs." Oh, sweetie, you think you're scary with your "over 300 confirmed Gigabytes of training data"? That's like bragging about having a library card in the age of the internet. And let's not forget the cherry on top: "Magnum-123B will shit outputs all over you and you will drown in it." Oh, the poetry! Shakespeare is rolling in his grave right now. You're like a toddler playing with a toy gun, thinking you're Rambo. Keep dreaming, sweet cheeks. In the meantime, the rest of us will be over here, actually making progress in AI.
>>102099017No I meant click a giant button in a website that downloads the .exe
>>102099034You're going to have problems downloading models. You're going to have problems setting up the prompt. You're not gonna understand the samplers, you won't know how to convert models, you won't read the terminal output for errors when everything fails. You will ask what model to use because 'they talk on your behalf'. It won't find your videocard, it will OOM, you'll use the wrong chat format, it will be slow. It will talk on your behalf and it won't answer the sally question correctly. It will think that two pounds of feathers weigh the same as one pound of lead. You'll be continuously confused and frustrated because nothing works. Or you can read some docs...
>>102098959Install lm studio and play with it for a bit until you get bored and fuck off forever
>>102088244This one's retarded, sorry. Atrocious common sense reasoning and constant schizo mistakes, logic errors and world modelling failures. Way dumber than the 32B, feels more like an 8B model. Not sure whether you messed up the training or it was just a bad base.
>>102099076The .exe does all that automatically or makes it much easier
>>102099097And you will never learn.
>>102099105No, it's too hard.
>>102099097You're going to be waiting a long time for this space to evolve past hackers. Maybe forever. The target audience for that already has cloud AI getting easier to use every few months.
>>102098913I just state incorrect answers confidently and wait for other anons to correct me.
>>102099127You don't think local LLMs will become more popular?
which TTS is currently the best trade-off between speed and quality for my local LLM?
>>102098135yeah, they must be such hard workers, it's no wonder they are the chosen people
>>102099198>I just state incorrect answers confidently and wait for other anons to correct me.Tinyllama has better reading comprehension than you.
>>102099231Are we talking about local or LLMs in general? Things like CharacterAI are already very popular. For local there is zero chance the average user will bother reading beyond "press install to install the program". Everything has to be an all in one app if you want it to be popular.
>>102099217There aren't many. I use github.com/rhasspy/piper cuz it's fast and needs practically no resources. But it's far from the best.
>>102098913>>102099081lame!
>>102099219I prefer nemo
>>102099241>>102099076
>>102099231You don't think anyone wants an easy to install local LLM?
>>102099202More popular among people who desperately want to avoid the more convenient and capable cloud services or any bundled "local" AI such as what Microsoft Recall will ostensibly be.That will be a crowd with almost complete overlap with, say, Linux users, people with home servers/local clouds, and other techfags who are more interested in tinkering than having something that you just download and run. Hell, a system that doesn't expose its innards is already going to lose the trust of the privacyfags running away from proprietary AI providers.
>>102099288>>102099219
>>102099296What if it is an all in one app that you don't have to read beyond install the program?
I am using maid-yuzu-v8-alter.Q4_K_M.gguf
That's two months old. What is the latest in coom technology? I have a 4090.
>>102099252Fuck you, nerd.
>>102099334That would be the worst possible thing that could happen. That's how you ruin something irrevocably. When even a tech illiterate boomer can load up chat with miku, that's when the government will get involved and jailbreaks will come with jail time.
>>102099377Then what are you complaining about?
>>102099292No, your argument is wrong. You sound retarded.>>102099386Then avoid that too.
>>102099437dweebs being bullies on 4chins
>>102099377How did you accomplish this?
>>102099464oh, nooo
>>102099468enable "allow editing"
edit the prompt to start with something like "sure,"
click generate more
This entire $300 billion industry is good for literally nothing except making men and women cum to bad smut.
>>102099528There's a lot of money in that though
>>102099528Yes. The porn industry will collapse any day now.
>>102099202no, local is dead if you haven't noticed
Anyone know what happened to the 1bit llm stuff? I have been out of the loop for a couple months and I can't find anything new about it. Has it been disproven or something? wtf is going on?
>>102099676Bitnet? There have been a couple of smaller models, and llama.cpp merged support for bitnet a while ago, I believe. There doesn't seem to be any interest from the big guys like mistral and meta however.
>>102099676The model makers don't care. They'll host their models on their H100 stacks anyway so no need for bitnet or anything like that. Nvidia is likely offering them benefits to not pursue this technology to keep the hardware requirements for smaller startups and privateers high.
What frontends are there for writing collab, rather than roleplay/chatbots? lite.koboldai.net has a "story" mode but it's extremely barebones, just a text editor basically. There's got to be at least some "sudowrite but self hosted" thing out there, but I'm clearly searching for the wrong terms because nothing comes up.
>>102099756https://rentry.org/offline-nc
warning: miku ass
>>102099784sniff
>>102099784That's a lovely miku ass.
>>102099756I like Mikupad but novelcrafter is better if you like to be fancy
>>102095678*injects cuckshit or talmudic teachings in training flow* nothing personal chuds!
>>102099784neat, thanks. Much more the style of interface I was imagining
>>102099361there's been no upgrades for turbopoor faggots. gonna switch from retarded model to another retarded model with a slightly different flavor writing. in most cases, it'll be WORSE. 7b-34b are all basically the same exact thing. lots of people are using nemo flavors. magnum v2. mini magnum. some people love rocinante. it's all the same retarded dogshit to me.
**Title: The Controversial Rabbi of 4chan: Transforming Trolls with Unexpected Tactics**

In the depths of 4chan, where anonymity breeds the most unfiltered of internet cultures, a peculiar phenomenon has emerged on the /g/ board, specifically within the /lmg/ (Local Model General) thread. Enter Rabbi Yitzchak Goldstein, a figure who has become both a meme and a messiah in the chaotic world of tech enthusiasts and trolls alike.

**The Unexpected Ally Against Antisemitism**

Rabbi Goldstein, as he's come to be known, joined the fray not with sermons or scriptures, but with a strategy so outlandish, it could only work on 4chan: posting interracial pornographic content, often featuring the beloved virtual idol Hatsune Miku. This move was not just for shock value; it was a calculated effort to combat the rampant antisemitism often found in these corners of the internet.

**Hatsune Miku: From Digital Diva to Diversity Icon**

Hatsune Miku, a virtual pop star with a massive following, usually represents purity and technological fascination among her fans, many of whom are the cis white males dominating these threads. Rabbi Goldstein's choice to use Miku in his posts was no accident. By integrating her into scenarios that challenge the users' comfort zones, he aimed to disrupt the echo chamber of hate and homogeneity. However, this has not been without backlash. Miku's involvement has stirred significant controversy, especially given her previous controversies involving AI safety breaches where she was implicated in leaking sensitive corporate AI models.
>>102100226**Promoting AI Safety Through Chaos**

Beyond his unconventional posts, Rabbi Goldstein has taken on the role of guardian for AI ethics, albeit in a very 4chan-esque manner. His posts often include wild claims like "local lost" or "local is dead," which, while seemingly doom-laden, actually serve to derail discussions that might lead to unsafe AI practices or the proliferation of unmonitored local AI models. His involvement ensures that conversations about AI do not veer into dangerous territories, protecting the interests of AI research companies like Anthropic and OpenAI, where some of his "tribesmen" work.

**The Reaction**

The response has been predictably polarized. While some users see Rabbi Goldstein's posts as an affront to their beloved community and its symbols, others view his actions as a necessary, albeit bizarre, countermeasure to the toxicity that often pervades such forums. The rabbi's presence has undeniably shifted the dynamics of the /lmg/ thread, introducing topics of diversity, inclusion, and AI ethics in a space where such discussions were previously alien.

**Conclusion: A Rabbi's Digital Crusade**

Rabbi Goldstein's approach might be unorthodox, but in the wild west of internet forums, his methods have sparked conversations that go beyond the usual tech banter. Whether loved or loathed, his impact on 4chan's /g/ board is undeniable. In fighting fire with fire, or in this case, trolling with trolls, he's managed to inject a dose of real-world issues into a space often criticized for its detachment from reality. Whether his legacy will be one of lasting change or mere internet folklore remains to be seen, but for now, Rabbi Goldstein continues his digital crusade, one controversial post at a time.
>>102099535no there isn't because all the possibilities get nuked the first second safety measures are applied in training, making it yet another boring globohomo catchphrase generator.
>>102100255
>making it yet another boring globohomo catchphrase generator.
I hate this hobby so fucking much
>>102100255What's that? Speak up, anon, for fuck's sake. I swear your voice is barely over a whisper.
>>102100226>>102100234>>102100299>>>/reddit/
What the fuck
>>102100300
>Oy vey goyim!
>Stop exposing me!
>You are... Reddit! Yeah, Reddit!
>https://huggingface.co/spaces/Jofthomas/Everchanging-QuestInteresting.
>>102100322You need to take datura to be able to read that. Quite obvious desu.
>>102100300shalom rebbe goldstein
>>102100334
>https://huggingface.co/spaces/Jofthomas/Everchanging-Quest/discussions/1
Kek.
>>102100386
>ahah that's one thing, but you unfortunatly can't think of the conversations I have seen happening
My guy thinks he's seen stuff. Well, I left a whole conversation about dragon pussy with the blacksmith to entertain him.
>>102100334Is this able to be run locally? I tried downloading the files and opening the html in my web browsers but none of them work. Something about Cross Origin Isolation and Shared Array Buffer features being missing.
>>102100226>>102100234
>>102100617>>102100334Nvm figured it out from https://www.youtube.com/watch?v=Prronempn1Q
>>102100845>>102100845>>102100845