/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103256272 & >>103248793

►News
>(11/21) Tülu3: Instruct finetunes on top of Llama 3.1 base: https://hf.co/collections/allenai/tulu-3-models-673b8e0dc3512e30e7dc54f5
>(11/20) LLaMA-Mesh weights released: https://hf.co/Zhengyi/LLaMA-Mesh
>(11/18) Mistral and Pixtral Large Instruct 2411 released: https://mistral.ai/news/pixtral-large
>(11/12) Qwen2.5-Coder series released: https://qwenlm.github.io/blog/qwen2.5-coder-family
>(11/08) Sarashina2-8x70B, a Japan-trained LLM model: https://hf.co/sbintuitions/sarashina2-8x70b

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/hsiehjackson/RULER
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>103256272

--Paper: Hymba: A Hybrid-head Architecture for Small Language Models: >103264040 >103265024
--Papers: >103264142 >103264271
--Debate on hybrid models vs separate models for AI tasks: >103264086 >103264117 >103264132 >103264382 >103264396 >103264458 >103264550
--Critique of quantization benchmark chart and discussion of optimal quantization levels: >103260714 >103260772 >103260883 >103260927 >103260881 >103262523 >103262630
--Unsloth adds vision model support with reduced VRAM usage: >103261067
--R1 finds serialization problem in large codebase: >103259678
--OpenAI's deleted evidence in copyright lawsuit sparks skepticism and negligence concerns: >103257257 >103258483 >103258547 >103258594
--NVIDIA kvpress: 80% compression ratio without significant losses: >103261925 >103261982 >103262008 >103262600 >103262698
--Local AI transcription tools for English speech: >103256528 >103256545 >103257215
--Local AI girlfriend setup and conversation limitations: >103257768 >103258014 >103258042 >103258110 >103258065 >103258157 >103258450
--Anons discuss Tulu 3 Models, a new instruct finetune series: >103259680 >103259735 >103260672 >103262111 >103262312 >103262391
--Anon tries to adjust Dell 3090 fan speed: >103259508 >103259624 >103259677 >103259766 >103259810 >103259900 >103259898
--Anon struggles to prevent Nemotron 70B from misusing ellipses, finds solution in token banning: >103259994 >103260008 >103260047 >103260176 >103263273
--Anon asks about LS3 and Nvidia GPU fan control issues: >103259803 >103259840 >103259885 >103259915 >103259958 >103260001 >103260019
--AI model responses to a question about making Sharo squirt: >103256682 >103256751 >103256761 >103256872 >103256989 >103262833 >103257397
--Miku (free space): >103259181 >103259966 >103260220 >103260446 >103261147 >103265119

►Recent Highlight Posts from the Previous Thread: >>103256368

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>103265207>UOHToT
thoughts on the sana release? i haven't tried it, but looking at the license i think the whole promise of efficient inference is a lie or got completely gimped last minute. the model also uses more memory than it should: the 0.6B model uses 8GB and the 1.6B 16GB of VRAM. though they did say it will go down with quanting, so here's hoping they trained the model in fp64 or 128
>>103265207So are you just shilling these tulu models or what
what does it mean when your model keeps repeating the same thing again and again regardless of your prompts? How do you fix that?
>>103265656It means you touch yourself at night.
>>103265656it means your setup is completely broken and you're not actually passing your inputs to the model
>>103265958I see, thanks
fuck is tulu?
ai isn't real it's jsut word associatomn and statistics
https://www.reddit.com/r/LocalLLaMA/comments/1gwyuyg/beware_of_broken_tokenizers_learned_of_this_while/
>How can you tell?
>A model's tokenizer is the tokenizer.json file, and you can tell if a tokenizer has been borked by transformers by seeing if its size is double the base model's tokenizer size.
>This can happen to any model; I have seen this on many finetunes or merges of Llama, Mistral or Qwen models. So if you are having issues with a model, be sure to check whether the tokenizer is broken or not.
>How to fix this?
>Easy. Just copy over the base model's non-broken tokenizer.
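If you want to check your local downloads in bulk, something like this should do it (rough, untested sketch; the paths are placeholders for wherever your models actually live):
[code]
import os
import shutil

# compare tokenizer.json sizes between a base model and a finetune;
# per the reddit post, a finetune whose tokenizer.json is roughly double
# the size of the base model's is suspect
base = "Mistral-Nemo-Instruct-2407/tokenizer.json"   # placeholder path
tune = "Some-Nemo-Finetune/tokenizer.json"           # placeholder path

base_size = os.path.getsize(base)
tune_size = os.path.getsize(tune)
print(f"base: {base_size / 1e6:.1f} MB, finetune: {tune_size / 1e6:.1f} MB")

if tune_size > 1.5 * base_size:
    print("tokenizer looks bloated, copying the base one over")
    shutil.copy(base, tune)   # the 'fix' is literally just replacing the file
[/code]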
Holy f**k level 2 reasoner strawberry 01
>>103266637
Mistral-Nemo-Base-2407: correct, 9.3MB
Mistral-Nemo-Instruct-2407: correct, 9.3MB
Rocinante-v1.1: correct, 9.3MB
UnslopNemo-v1 & v2: correct, 9.3MB
Nemomix-Unleashed: correct, 9.3MB
MN-12B-Mag-Mell-R1: correct, 9.3MB
Crestf411_nemo-sunfall-v0.6.1: correct, 9.3MB
UnslopNemo-v3, 4, 4.1: INCORRECT, 17.1MB
Crestf411_MN-Slush: INCORRECT, 17.1MB
Results from a few Mistral Nemo tunes whose weights I had downloaded.
Any worthwhile model I can run on my M4 Pro with 48GBs?
>>103266637How does this affect me as a regular llamacpp user? I never download anything other than the gguf file(s), do they already contain the tokenizer?
https://huggingface.co/AIDC-AI/Marco-o1https://arxiv.org/pdf/2411.14405
>>103266853
>do they already contain the tokenizer?
Yes, and possibly the broken ones
>Ollama uses GGUF file type. So it depends on which tokenizer was used when the model was converted to GGUF.
>>103266857
Just read that as well, this ain't good
Is there a way to merge a gguf with a fixed tokenizer?
>>103266885I think the easiest is to redo the gguf with a "fixed" base tokenizer. I don't know if you can edit the tokenizer metadata in the gguf to the level you'd need.
Why is it that every few months a major (and in hindsight quite obvious) bug in the AI ecosystem gets unearthed? And why is it usually the tokenizer?
>>103267004>usually More like every time
>>103266637
>>103266757
I just checked Mistral Large
>2407: Tokenizer 1.96MB
>2411: Tokenizer 3.96MB
aren't these supposed to be the same when it's just a minor refresh?
>>103267063Nah, New large has the instruct tags, right?
>>103267004AI developers are bad programmers, that's why most of them use python
>>103267080>AI developers are bad programmers, that's why most of them use pythonas a data scientist, I confirm
>>103267074
>>103267063
>instruct tags
That, and the 2411 tokenizer is called v7
>https://github.com/LostRuins/koboldcpp/pull/1224
>Create Mistral-V7.json #1224
So I'd say it makes sense they're very different.
>>103266637
What exactly is it that 'breaks' inside a tokenizer? The json is just a long text file that lists all tokens plus some other stuff. I don't see what can go wrong here.
>>103267074
>>103267098
I just checked both of them. Somehow the 2411 tokenizer has roughly three times the lines (90k vs 280k or so) because the "merges" section now looks very different, with a lot more spacing. Left is the 2407 one and right is the 2411 one. Up until the "merges" section both are basically the same. No idea what bloating it up like that would accomplish, though.
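If anyone else wants to poke at it, counting and eyeballing the merges entries only takes a couple of lines (sketch; the file names are placeholders for wherever you saved the two tokenizer.json files, and I'm only printing samples, not claiming to know why the formatting changed):
[code]
import json

# load both tokenizers and look at the "merges" list of the BPE model
with open("tokenizer_2407.json", encoding="utf-8") as f:
    old = json.load(f)
with open("tokenizer_2411.json", encoding="utf-8") as f:
    new = json.load(f)

print("2407 merges:", len(old["model"]["merges"]))
print("2411 merges:", len(new["model"]["merges"]))
print("2407 sample:", old["model"]["merges"][:3])
print("2411 sample:", new["model"]["merges"][:3])
[/code]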
HEY HEY HEY! WASO WASO WASO WASO WASO WASUP BITCONNEET
https://www.reddit.com/r/LocalLLaMA/comments/1gx6qyh/open_source_llm_intellect1_finished_training/
Kek at the comments
>The first ever OPEN SOURCE model, not open weights but OPEN SOURCE!
>>103267331>not open weights but OPEN SOURCEit shouldn't be something to be celebrated, you have to hide your dataset so that you can train your model on good quality data, and not on some copyright free slop, those fucking ledditors...
>>103267346
More importantly, it's far from the first open source model, there was K2
>LLM360 has released K2 65b, a fully reproducible open source LLM matching Llama 2 70b
And quite a few older ones as well that showed their data.
>>103267359>K2i remember the cope like "you can uncuck it" and stuff, and literally nothing came of it lmao
>>103266732woah, this is very interesting, will they release the weights?
>>103267442it's already here? >>103266856
>>103266856>by fine-tuning Qwen2-7B-Instruct with a combination of the filtered Open-O1 CoT dataset, Marco-o1 CoT dataset, and Marco-o1 Instruction dataset, Marco-o1 improved its handling of complex tasks.>Qwen2-7B-Instruct...
>>103267450thanks
>>103267476>Even the Chinese forget Qwen 2.5 exists
>>103266856Yeah this is no Deepseek
is quadro rtx 8000 worth it?
>>103266856
>>103267627Short answer:>noLong answer:>noooooooooooooooooooo
>>103267685but I can't find cheap 3090s
>>103267331>The first ever OPEN SOURCE model, not open weights but OPEN SOURCE!KEK
>>103267701You heard it here first, if you build your oss soft on multiple machines it's even MORE open!
>>103267697Wait for the 5090 to come out and hope that gaymers will sell their 3090s.
>>103267697
Define cheap? A Quadro RTX 8000 costs 3 times as much as a 3090.
>>103267697Fb market has them for 700-900
>>103267751cheap as in freshly fallen from the delivery truck.
>>103267346
>it shouldn't be something to be celebrated
true, but it's not like they had any other choice in this case; the dataset has to be public for decentralized training
Now that distributed training worked fine it's time to make distributed inference so that we GPU poors can get some gibs
>>103267966Already exists, for a while too
>>103267331>All that effort and time>For a model trained on 1T tokensIt's really over isn't it
>>103267987I hope somebody pays you to do this all fucking day.Because if you do it for free you are the most miserable fucking sub-human pile of flesh to ever escape the abortion process.
>>103267978Kobold horde you mean?
>>103267990>I hope somebody pays you to do this all fucking day.>Because if you do it for free you are the most miserable fucking sub-human pile of flesh to ever escape the abortion process.
>>103267993
No.
https://github.com/bigscience-workshop/petals
Llama.cpp RPC
>>101582942
>vLLM distributed inference actually worked...
>I got 15 T/s with Mistral Large with 2 PCs with 2x3090 each.
To name a few options.
I'm getting 1.40 tokens/s with Cydonia-22B-v2q-Q3_K_M.gguf on a 3060 with 12gb, are my settings fucked or is this normal?
>>103268060
That seems rather low, yeah
Did you limit the context size to something like 32k? Flash attention? Other programs hogging your gpu?
>>103268024>PetalsWeren't those the people who made BLOOM
>>103268078I have flash attention and context length is 8192. I'm retarded, I started using LLMs on my machine less than 24 hours ago and don't know what I'm doing
>>103267990Anon, if you want to treat 30 people lending a 10B model trained on 1T tokens like it's the second coming of Christ then feel freeBut as far as breaking the chain of corpo dependency goes, there's still a ways to go
how do I make macos not send any telemetry so that I can enjoy both high token generation/s and power efficiency with privacy?
>>103268112Shut the fuck up you retarded piece of shit.
>>103268106You should get about 5 t/s with this kind of context. Maybe you're offloading too many layers into RAM.
>>103268127Sorry samsja, didn't mean to offend you
>>103268127Trvke We Need To Support The First Ever OPEN SOURCE model, Not Open Weights But OPEN SOURCE Y'all!!!!
>>103268060
If you are using Windows: check that the driver setting where VRAM is swapped into RAM is disabled (I forgot what it's called).
If you did not manually set the number of GPU layers your frontend may be trying to set the value automatically; I know that KoboldCpp and ollama have logic like this and to my knowledge the estimates tend to be too conservative.
>>103268124>apple>no telemetrylol. You could unplug the power cable and encase it in concrete.
Ultra censored LLM from applel is coming btw https://x.com/MacRumors/status/1859707331392757812
>>103268362but it uses emojis more now! great tradeoff
>>103268175It's called "CUDA - Sysmem Fallback Policy" in the nvidia control panel
>>103268362>mistal large shits the bed in every categorygrim
>>103268362I mean yeah it's worse but at least they got to top the LMSYS leaderboa-
Morning, /lmg/...
>>103268540it literally does better than a recent 4o release
>>103268606Good morning to you, Miku
>Ask Project Euler question to write code to solve the problem
>Get this
Thanks DeepSeek
Holy fuck someone buy this https://www.ebay.com/itm/276743259844
>5 minutes to train GPT-2Are we back?
>>103268750This is literally benchmaxxing, the condition for a completed training run is just achieving a specific eval score.
>>103268737Wtf is that real
>>103268727SOUL
>>103268606gm betufel
>>103268737Someone take the plunge
>>103268362Where is Largestral 2411?
>>103268769>Must not modify the train or validation data pipelines.It's the same dataset and parameter size, though.
>>103268784I got one before they nuked the listing. Any bets as to whether I get it?
>>103268831Honestly, I doubt it. Looks like a pricing error from the other stuff they sell.
>>103268831You'll get a box. You'll have some GPU in it if you are lucky.
>>103268831Interesting. So it was probably legit and the guy accidentally a decimal place.
>>103268831
I don't know how UK law handles this specifically, but under German law, if they mistyped the price they would now have a legal obligation to actually sell you the item at that price (though they may refuse and you'll need to take them to court).
If it was a scam you'll never get anything.
>>103268362Qwen that high?So running it at q4 is the reason why I get so many dumb coding errors…>>103268727Lmao qwen did something similar yesterday for me, it was too lazy to write a section of code and just made it a “suggestion”.
>>103268926No, Qwen and other chinkshit just do everything to look good on benchmarks
>>103268866They offered paypal so it should be pretty easy to get your money back if it's scam.
>>103268945Regardless if it is better or worse it is still the best coder model that I can run.I just want to know how to squeeze more performance out of it.
Expect the first two weeks of December to be crazy for local models.
>>103268866This is false, that law only applies to retail shops
>>103269118https://www.rechtsindex.de/internetrecht/4542-bgh-urteil-viii-zr-42-14-ein-fahrzeug-fuer-1-euro-schnaeppchenpreis-bei-einer-ebay-auktion
>>103268362wtf is OpenAI doing, they're getting destroyed by the competition, they can't even beat themselves anymore lmao
>>103269137
a good open model to you too, good sir
>>103268737holy fucking shit>Located in: ShenZhen, ChinaI smell hogwash
>>103269217All talent left and they are hitting the wall of diminishing returns while raking up debt. They made a fucking CoT tune and marketed it as innovation. A fucking CoT tune.
no one seriously believes these things are alive, do they?
>>103269328My boomer parents do.
>>103269328Some people think the earth is flat, others think that you can sustain yourself with just sunlight. Believing that some enhanced text prediction model is sentient is one of the less egregious cases of retardation
>>103269328jewlywood portraying ai this way is to blame.
>>103269328I had this illusion during the early days of c.ai, but it was quickly gone.
Looks like INTELLECT-1 is finally done training. From what I can gleam, it should be released in a week and is currently going through post training with something called Arcee AI.
>>103269217
It doesn't exactly help that their talent left (actually it might even make it worse, since all their investors are probably looking to see if they can recover from their exodus lmao)
Still kinda crazy to see how far their lead is slipping. I still remember when OpenAI had GPT-3 and all we plebeians had was GPT-fucking-Neo-2.7B
Now DALL-E 3 is mogged by Black Forest Labs, GPT-4o is mogged by Claude in the intelligence department and Gemini in the human preference department, o1 already has a competitor, and Sora is basically MIA
>>103269402>Arcee AIThat's mergetkit people with Charles O. Goddardhttps://github.com/arcee-ai/mergekit
>>103268540?Did you misread it thinking they are sorted top to bottom? It beats many top corporate models.
Which local model if I want to try out those new meme IDEs?
>>103269402Who's going to do the red teaming and rhlf?
>>103269455Arcee> Arcee AI empowers businesses to train, deploy, and continuously improve proprietary, specialized, secure, and scalable small language models (SLMs) within their own environments, revolutionizing data privacy and security. >Their all-in-one system enables pre-training, aligning, and continuous adaptation of small language models. >This ensures security, compliance, and enhances model relevance and accuracy.
>>103269466>aligningkek they're gonna cuck the model, it's ova
>>103269466>no, goys, you can't have the base model, that's too unsafe for you>here's (((aligned))) instruct
>>103269402>alignment It's DOA.
>>103269466kek @ whoever paid for an aligned model
>>103269419>the guy who's responsible for the shitty merge era is now offering alignment servicesThis guy is a grifter and a net negative for local
>>103269483Y'all love censored models though, from all the shilling I've seen here.
>>103269466>>103269402>1st ever fully open source model>aligned to fuck and we don't even get the base modelscam
>>103269328llama 3 7b is sentient
>>103269328Claude is the closest to have that 'ghost in the shell' feel
>>103269508>turkish rapebaby tranny balkanoid does his low effort trolling againhi petr*
>retards freaking out over the word "alignment"
>>103269514Safety and tolerance are the most basic values of Open Source and its community.
>>103269466>ArceeThey sure got big thoOthers include>AWS and Intel
>>103269541Right, like that hasn't meant practically only one thing since the word became commonly used.
>>103269571Releasing an unsafe and offensive model would only hurt the image of open source.
>>103269550Yes, xister! Free software should be replaced by ethical software to own le chuds! #RemoveStallman
>>103269550That's actually true, considering /g/'s love for establishment and queer e-celebs.
>>103269586Correct
>>1032695502/10 ragebait
>>103269571you haven't realized that every single big release pays lip service to the concept of alignment regardless of how censored they end up being
>>103269616>>103269603But sure, if you want to be hyped for something that'll 100% be ultra-corpo safe go ahead.
AI isn't real and everybody who's making money off this field knows this but pretends otherwise. Get that bag and gtfo before the bubble blows up. It's okay to be a bystander who just wants a local smut autocomplete, a bunch of h100s will be liquidated.
God I hope R1's weights are actually released. This model is legit better than closed source.
>>103269627
Bro you don't understand, sama has made GPT5 smarter than a human, it's fully multimodal AGI or even ASI! Please invest.
>>103269642
It's pretty entertaining to see its thoughts but it just badly fucked up a coding problem that even free chatgpt solved for me. And its coding knowledge seems to be really outdated.
>>103269666? Its the only model that got some stuff only claude 3.5 and qwen2.5 32B coder did before. Maybe got a bad "reasoning" roll?
>>103269688lol, no way. reasoning doesn't do shit for coding abilities.
>>103269714And why wouldn't it?
Speaking of alignment, I wonder what this last line is supposed to be about?I don't have anything about flags or alignment in my prompt or the card description.
>>103269740I'd say it's the name "Naomi" pulling all kinds of CoTs / jbs in the garbage logs your model was finetuned on.
>>103267649>feeling of stepping on feces
>>103269804I'm happy to see that your relationship with tranx qwxxn is going great, keep us updated.
>>103269550Fuck that shit.
https://x.com/ltxstudio/status/1859964100203430280Local video generation now down to a single 4090
>>103266757All of Drummer's Small tunes seem to have the bloated tokenizer.
>>103269898*except Cydonia v1.0
>>103269883
yeah, can confirm, this shit is pretty good, and it's really fast
>25 fps, 129 frames (5 seconds), 50 steps
>01:10<00:00, 1.42s/it
>13gb VRAM peak (during the vae decoding)
>RTX 3090
>>103269883>Local video generation now down to a single 4090We've had that since Mochi 1
>>103269989but with mochi you had to wait for 30 mn to get a single 5 second video, for that one it's only 1 mn because they managed to efficiently compress the VAE
>>103269883>https://github.com/Lightricks/ComfyUI-LTXVideo>https://github.com/Lightricks/LTX-Video> first commit 6 hours agoyeah this is an obvious shill for anon Guinea pigs
the chinks will save us all
I wonder what kind of AI models Aliens use.
>Product Security Engineer @ Red Hat - AI Security, Safety and Trustworthiness
>https://huggingface.co/posts/huzaifas-sidhpurwala/601513758334151
>As AI models become more widespread, it is essential to address their potential risks and vulnerabilities. Open-source AI is poised to be a driving force behind tomorrow's innovations in this field. This paper examines the current landscape of security and safety in open-source AI models and outlines concrete measures to monitor and mitigate associated risks effectively.
>https://huggingface.co/papers/2411.12275
We need more of this! Much more!
>>103270001Near real time on a 4090. First video model worth using because of it. Prepare to start seeing porn finetunes of it.
>>103269981I like that one
>>103270112https://arxiv.org/abs/2411.12275
>>103269514Looks like they actually will be releasing the base model, as well as the post trained model.
>>103270150sounds like they aren't as retarded as I thought, that's cool
>>103270150oh wow lmg was dooming over nothing who would have thought
>>103270150who cares, the chance of this model being better than llama2 7B is slim.
>>103270176for once /lmg/'s doomerism was wrong, usually we get fucked in the ass pretty hard
Update on sarashina2 8x70b... it's pretty unhinged with good temp/minp. I'd say it's the jap ERP king after doing some completion on existing chats. Super spicy.
The initial release had a busted tokenizer_config.json, but after requanting it works properly.
>>103270185maybe you do
What is the best oobabooga preset (or parameter values like temperature, min p, etc) to use for the best Roleplay(mainly erotic but I care very much about characters following the scenario and not going OOC) experience in Mistral Nemo 12B finetunes?>Use DRYI don't have that yet, will get around to that.
>>103270274Depends heavily on the model, but i like to start with temp 2.6 and minp 0.008 and then back off until I get an amount of insanity that's appropriate for what I'm trying to achieve.
>>103269714
Kek. Not sure what level of coding you've done, but I have to assume you either: (a) are just starting out and have somehow Dunning-Kruger'd yourself into thinking you're an expert, or (b) the only coding you've done has been via prompting an LLM.
Reason I say this is because, unless you're truly doing basic toy shit, you generally don't get very far before you get fucking steamrolled (or, on the off chance it does work, write horrifically inefficient code vomit) if you don't know what you're doing.
If you disagree, I invite you to check out TAOCP, Concrete Mathematics, and Algorithm Design by Kleinberg and Tardos.
>>103269688
r1 has some pretty heavy variance. It generally ranks below o1 in some of my tests of programming / algorithm problems, though it definitely often comes a lot closer than Claude and Qwen. I don't think it would fully replace o1-preview for the people that use it, but it would make the ridiculous prices OpenAI charges at the moment quite a bit more questionable.
>>103270322Top p 1, top k 0, typical p 1, right?And repetition penalty at?
>>103270347>urrr durrr skill issueStop being stupid anon, I'm talking about the kind of reasoning that these LLMs do. If you think I'm wrong, why does o1 gets mogged by Claude 3.5?
>>103270434It genuinely doesn't though, it's just better at the easier stuff. If you disagree, you can test it yourself. Here's the problem: https://atcoder.jp/contests/dp/tasks/dp_jPop that into Claude and see what it gives you. Here's the (correct) o1 solution for reference, which was its first attempt
>>103270542Claude test for reference (got murdered by a division by zero)
repetition penalty should be deprecatedliterally just exists as a newfag filter at this point, way too easy to go wrong and use retarded values that turn your output into adjective/adverb spam because every glue word got penalized into nonexistence
what's the current best method to have a chatbot that 1. can "read" images, so if i post an image it can describe it (within current model limits of course), and 2 (optional) i can tell it to prompt and gen an image?using sillytavern/koboldcpp backend currently, but not sure how i'd go about it there.in other words, i want to chat with teh ai about images and if i post one it can talk about it, and have it suggest prompts
>>103270696>remove feature with occasionally niche value because retards don't understand it and use it in the wrong wayNo, that's the spirit of proprietary software, not open sourceOSS does mean footguns for newfags sometimes but that's a price worth paying. Fuck outta here with your dumbing-down suggestions
>>103270886Open webui if you don't mind it raping your RAM.
>>103270992what niche value does it have over presence / frequency penalty (the same thing but with sane scales and less retarded implementations) or more advanced repetition samplers like DRY or w/e? it's just a super primitive and very poor sampler that sucks ass at its job. it's bad. there are NO pros to it. rip that shit out.backends can keep it for compatibility's sake but frontends should not be putting that garbage in front of a user's face unless they very specifically request it for some deluded reason
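to put rough numbers on it, here's a toy sketch of the two approaches (this mirrors the usual HF-style multiplicative rep pen and the OpenAI-style additive presence/frequency penalties, not any particular backend's exact code, and the values are just illustrative):
[code]
def classic_rep_pen(logit: float, penalty: float) -> float:
    # multiplicative: divide positive logits, multiply negative ones,
    # applied the same way to every token that has already appeared
    return logit / penalty if logit > 0 else logit * penalty

def presence_frequency(logit: float, count: int,
                       presence: float = 0.3, frequency: float = 0.1) -> float:
    # additive: a flat hit for having appeared at all, plus a hit per occurrence
    return logit - presence * (count > 0) - frequency * count

# the multiplicative penalty's bite depends entirely on the logit's magnitude,
# so the "right" value shifts from model to model; the additive one is a fixed,
# readable shift regardless of the logit scale
for logit in (15.0, 8.0, 2.0):
    print(logit,
          "| rep_pen 1.3 ->", round(classic_rep_pen(logit, 1.3), 2),
          "| presence/frequency (count=5) ->", round(presence_frequency(logit, 5), 2))
[/code]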
>>103271048NTA but simply updating ST's default presets would solve this.
>>103270365
I keep top p and typical p at 1, but I crank top k all the way up to 200.
I don't use rep-pen. If a model is too repetitious in a way I don't like I just don't use it.
anyone tried out the vision support in exllama2?
>>103271200exllama supports vision now?
>>103271102Thanks anon I will see if it works well for me.
i've also given up on rep penalty stuff, xtc, dry. sure they can help reduce overused slop but they also introduce errors when the model wants to say a shirt is red, but can't, so it picks another color which is wrong. so out of the choice of more slop or inaccuracies, i'll deal with the slop. low min p + adjusting temp is all i use these days
>>103271276I concur with this assessment. At first I thought Largestral wasn't that good, until I realized XTC was making it retarded and never went back.
Where do you reckon the tech will be in five years? ten years?
>>103271236I saw this in tabbyhttps://github.com/theroyallab/tabbyAPI/pull/249
>>103271276I agree on rep pen and especially XTC (that can REALLY make a model retarded...turns out lower probability tokens are lower probability for a good reason)but I find DRY is basically risk-free regarding the model's intelligence as long as you're not applying it to single tokens (so allowed length of 2 or higher)
>>103271011i have 128gb ram and 24gb vram, does that work?
>>103271331By then nvidia will release $300 24gb cards finally, and we will be able to run local o1 on it.
>>103271345
NTA but
>(so allowed length of 2 or higher)
This should be at least 5 or it starts banning uncommon names. Also put {{user}} and {{char}} in the sequence breakers (persona and character names should consist of the first name only).
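for intuition on why the allowed length matters: the usual DRY formulation penalizes a token that would extend an n-token repeat by multiplier * base^(n - allowed_length), so the penalty only starts biting once a repeat runs past the allowed length. quick sketch assuming the common defaults of multiplier 0.8 and base 1.75 (treat the exact numbers as placeholders):
[code]
# DRY-style penalty for a token that would extend a repeated sequence of length n
multiplier, base = 0.8, 1.75   # assumed defaults, tune to taste

for allowed_length in (2, 5):
    penalties = [round(multiplier * base ** (n - allowed_length), 2)
                 for n in range(allowed_length, 9)]
    print(f"allowed_length={allowed_length}:", penalties)
[/code]
with allowed_length 2 a three- or four-token name fragment already eats a noticeable penalty, which is where the misspelled-name weirdness comes from; at 5 those short repeats are left alone.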
i think the site died
>>103271647
ayy finally, after like 4 attempts
>>103271345
out of the three (rep pen, dry, xtc) i liked xtc the least. it just seems like a horrible idea to chop off the top tokens because that token could be a name, a color, any kind of detail. dry seemed ok but could also introduce errors.
>>103271473
is that something you set up already or just speculating on? i'd try it
>>103270176You think they were using good training data? Only copyrighted data is good training data, and them being open source means they will have zero of that.
>>103271712nice unrelated pivot
>>103271712Issue isn't the quality, it's the count
>>103271678>is that something you setup already or just speculating on? i'd try itWith --debugmode on in KCpp, I saw it trigger a lot on first syllables of non-English names. I guess I'm speculating a bit here: those names would pop up if necessary ("He was born in _") but otherwise not.
Any threestral finetunes yet?
>>103271871when i was using dry i noticed it fucked up jap names a lot. like it'd get through half a name and then just go nuts. (tsukino becomes tsukAKAK)i thought it was the model at first since i never saw the same issue with normal english names, but it went away when i turned dry off.
discord.gg/aicg/although we are chatbot focused we have many channels meant for prompting and ai art which includes dall-e, flux, stable diffusion, pix art, someone even hosts a proxy’ come join us!no lurking
>GPT-4o no longer topping any leaderboard, tried to top Google only to get smacked back down
>China is about to take away what little value o1 had
>OpenAI ran headlong into a "fuck you" sized scaling wall that's turning any further upgrades into side grades (the more recent GPT-4o meant to top the LMSYS leaderboard is worse)
>Anyone with any competence to save OpenAI from itself has long since left
>Musk has power now and is out for Altman blood
>OpenAI still dealing with a fuckton of lawsuits from NYT and Pajeets, "accidentally" deleted their datasets which makes them look more liable
>Still billions of dollars in the hole with investors getting antsy for a return on their buck
It's like watching a train wreck
>>103272323>faggot noiseslol no
>>103272363>Anyone with any competence to save OpenAI from itself has long since leftThe inertia is spent. altman is finally paying for his hubris.
>>103272363was about fucking time, I alaways hated this fucker, his fear mongering of AI has done a lot of damage to the community
>>103272363
i don't even consider openai to be relevant at this point. claude passed them months ago and it's remained the same. now local has gotten so close on benches like coding which is amazing given the assumed size difference (qwen 32b vs whatever the fuck 8x+ monster gpt/claude is). openai's reign was over months ago, it's just taking people a while to realize it
>>103272417Have we really started to take Chinese model benchmarks at face value? Come on now.
>>103272363Trust the plan Altman says AGI is coming in 2025. Strawberry is going to blow us away.
>>103272363Musk recently said he would make AGI by 2026 and xai is building a massive server farm at a unprecedented pace.
>>103272363OpenAI still has branding and first-mover's advantage. For most people AI = ChatGPT.
>>103272435i wouldn't even mention a benchmark if i didn't use it myself, i know how they game shit and especially china they lie and steal everything. but yeah, its a good model, its the first one to not shout at me in chinese half way through a message. not just qwen though, nemotron is also very good. hell even codestral is amazing for its size. local is eating well and the gap has shrunk insanely in the last year.
>>103272449America Online also had branding and first mover's advantage
>>103272449>AI = ChatGPTThat's why I mentioned inertia and referenced the brain-drain quote. You can only keep first-mover if you are close enough to state of the art to keep yourself relevant vs. people discover superior services.Its a serious advantage, but must be defended, especially since AI is in its infancy in the public imagination.
>>103264382To some extent the human brain contains dedicated centers, but those centers have extremely wide interfaces pumping a shitload of data between them. Bandwidth is the limiting factor for virtually every interesting computation. So what you describe will never work with English text or tokens or whatever as the shared language for the centers- the interface is too narrow, so it'd at best be ultra inefficient.It also just won't work well with the Von Neumann architecture.
>103272323What a sad end
>>103272488back in high school when aol was sending billions of cds to everyone, you could even find them at burger king, we used to shove hundreds of them through the vents of lockers so when you'd open it, 500 cds spill outgreat fun until its you that opens the locker
>>103272499This. First mover's advantage isn't going to save you if your services are inferior / more expensive than competitors (like, say, charging 0.50 per o1 query). It might delay your fall into obscurity, but eventually people will move on if you don't have something good enough to offer them. Unfortunately for them, OpenAI also happens to be in the position where it's costing a lot more than it's bringing in and it needs a plan to turn a profit fast.
>>103272581
>people will move on if you don't have something good enough to offer them
my co-worker moved to anthropic a while back (understandable since IT), but my kid's friends are already moving to perplexity, so they couldn't give a rat's ass what's on the back end. And this is in a rural area without any kind of tech sector presence.
Also probably a majority of companies are using copilot branding via their MS EA, so the name brand is already severely diluted for knowledge workers.
>>103270886>sillytavernJust use the attach button, at least with the custom OpenAI API and using vLLM as a backend, it just works. I suppose now tabbyAPI can be used for that too.
>>103266622>artificial intelligence is artificial Oh wow.EveryoneThis guy is so smartHoly shit
>>103266622
>ai isn't real it's jsut word associatomn and statistics
and what are we? our brain just works thanks to a set of little electric shocks
>>103272323Probably a troll honeypot server. So naturally I'm going to join out of morbid curiosity
>>103272807electricity shocks powered by God
>>103272840Aww it's a fake link. No friends for me :(
>>103272842if God created the world, then he also created the AI, CHECKMATE
>not real intelligence>But it knows about the hallway birds
Is there a model that's free to use commercially? It doesn't have to be gpt level, just needs to string a few sentences of text together.
>>103272976A ton of openly licensed models can do that.
q2 behemoth or 5km midnight miqu?
>>103272886those are just government drones that don't fly
>>103272995Thanks I'm retarded, I'll use T5.
>>103272997miqu for slop, q2 for tardation
>>103272363
>Musk has power now
My hobby didn't deserve it. It is a good thing it is dead anyway.
>>103272997Go back to the Kobold Discord.
>>103269586I'd say fuck your optics, concern troll, but what you say isn't even remotely true. There needs to be space for hobbyists and tinkerers to collaborate on uncensored models. If that is not allowed, then you can be sure you don't live in a free society.
>>103273159based
>>103267649Ah so it's completely useless for translating visual novels because it avoids anything offensive or adult in nature.
>>103273094dunno what that is
I hate Qwen. Largestral is too big. Nemo is too retarded. Nemotron wants to give me lists instead of being normal. What am I supposed to use?
>>103273292money
>>103273292Magnum v4 72B
the google colab hag gives me a hard on every time.....
>>103268024
>Wanna host? Request access to weights (huggingface login), then run huggingface-cli login in the terminal
iirc this was the issue. It's like saying "Run your media bittorrent-style. Provide your Netflix login to get started!"
>>103272363He is about to get what he fucking deserves (I will never forgive him for withholding GPT-3 and forcing people to eat shit for 2 fucking years).
>cheap radeon pro v620 32gb on ebayworth it?
>>103273292Wait 2 more weeks
>>103274158More like 2-4 months unless deepseek drops R1.
>>103265207
>https://rentry.org/lmg-lazy-getting-started-guide
I followed this guide and it's still censored.
koboldcpp/Mistral-Nemo-12B-Instruct-2407-Q4_K_M
koboldcpp backend
mistral v3 tekken context and instruct
etc etc, it won't do anything uncensored. Should I get a different model, is that what I did wrong?
One day, after the AI goldrush dies completely, some guy at one of the big companies will have too much free compute, and he will just drop some discord RP dataset into the main training run + ease off the censoring a little, and we will get a 7B that just gets everything.
>>103274275Spare compute for sure, but I don't know about that mythical rp dataset, chief
>>103274275>>103274320You don't really need it. Poe AI with a good enough prompt on GPT3.5 did some ungodly nasty things with me.
>>103274275
I mean, no, that's never ever going to happen.
reasons for this:
1. the dataset when training a model really, really matters. so much so that anthropic created Constitutional AI, which uses another AI to create the dataset. Having a junk dataset really harms the output.
2. Many models have done this already, take a look at ArliAI; they're not perfect, not by any means.
3. the goldrush will get replaced by something better. LLMs are step one, there will likely be better shit in two more weeks.
>>103274366
>ArliAI
>Training Duration: Approximately 3 days on 2x3090Ti
>Epochs: 1 epoch training for minimized repetition sickness
Ah, so you just don't know what you are talking about.
>>103274386yeah and you're the expert clearly, dumbass.
>>103274366>anthropic's constitutional AI>clear improvement on the same model every new checkpoint>meta's SPIN>benches keep maxxing yet nobody can tell any difference
>>103269328they literally are. simulating thoughts are thoughts because thoughts are simulation
I am getting a ton of 404s on HF for model cards that were there last week. Was there a purge or was I just looking the wrong thing?
>>103274623Yes. That one was removed, but not the other ones. Except that other one.Just post the fucking links of you want someone checking them for you, retard.
>>103274652I included links for everything I wanted checked. ohhhh... I forgot to include links.Please see below.
>>103274486>yet nobody can tell any differenceFiltered
https://github.com/danny-avila/LibreChathas anyone used this? I'm just looking for a lightweight interface for chatgpt, claude and others
>>103274785I never heard of it. Most people use SillyTavern.
in ST, how do I set up an author's note that won't force a large portion of my context to be processed again when I either edit it or it gets inserted? currently I've got it at depth 0, insertion frequency 4, and for some reason it's going 6k deep and completely defeating the purpose I'm using it for (summarizing a very long chat that will take over 2 hours to process since I'm on cpu)
Hey, Magnum anon, you here? Thank you for giving me the pointers yesterday. I had problems with Magnum 70b occasionally acting up and sometimes defaulting back to Qwen's behavior, outputting slop or gibberish; then I copied the system prompt exactly from the Hugging Face page and that changed everything completely. Apparently having the same system prompt as what the model was tuned with MASSIVELY reinforces the tuning and makes it abundantly clear to the AI that it is not a helpful and polite AI assistant anymore. Copying and pasting the system prompt from the tuner's page completely changed the model's behavior and erased all traces of the pozzed censorship, bias or purple prose; now the waifus are coherent and wild as fuck.
>>103274902I think sillytavern only works with API keysI want to use my regular accounts but with a UI that doesn't take up 700mb of memory for a fresh tab like Claudeto be fair maybe I should look into using APIs directly but I imagine it's more expensive than regular premium for a power user
>>103275049
Do ask the model tuners to provide the exact prompt and format they were tuning with, it's incredibly powerful. And if the model tuners are here - attach your system prompts to your model page.
>Qwen2.5>DeepSeek R1>Marco-o1WE BUILD FOR CHINA
>>103275122China will release them open in order to undermine US companies as long as the US has the lead. The moment China is in the lead they will go closed source.
>>103275139>The moment China is in the lead they will go closed source.Then the west would go open, isn't the competition a beautiful thing? Cold war was the reason the technology was developing quickly back in the 20th century, no competition = no progress.
bfloat16 is a meme
https://arxiv.org/abs/2411.13476
any igpu enjoyer here?https://www.reddit.com/r/LocalLLaMA/comments/1gheslj/testing_llamacpp_with_intels_xe2_igpu_core_ultra/should I go for intel or amd?
>>103275056
>regular accounts
At least for Claude, I think people use this:
https://gitgud.io/ahsk/clewd
which makes a custom proxy that forwards the API calls to the web app. People used a similar method for Slack in the past.
>>103275254Do the modern integrated GPUs have limited ram allocation? Can i have a "tpu at home" if i bought a cpu with an igpu and allocated 120 gb of ram to it?
>>103274785Can it be used with local?If not then fuck off.
>>103275312>Can it be used with local?>If not then fuck off.
>>103275287
afaik ryzen apu can address up to 64gb
https://www.reddit.com/r/LocalLLaMA/comments/1efhqol/comment/lg24yh5/?utm_source=share&utm_medium=mweb3x&utm_name=mweb3xcss&utm_term=1&utm_content=share_button
not sure about intel
>>103275287As far as I know it doesn't work with AMD.
>>103275312
yes, you gigantic retard, it can be used with local models
I tried running it with docker but had errors with mongodb and i cba
doesn't look bad though
>>103275312
Yeah, it can connect to any openai-compatible API, which most local LLM servers provide.
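e.g. pointing the regular openai client at a local server is all these frontends are doing under the hood; rough sketch assuming a llama.cpp-server or tabbyAPI style endpoint on localhost:8080 (port, key and model name are placeholders):
[code]
from openai import OpenAI

# most local backends ignore the API key, but the client wants one anyway
client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-local")

resp = client.chat.completions.create(
    model="local-model",  # many local servers ignore or loosely match this field
    messages=[{"role": "user", "content": "Say hi in one sentence."}],
    max_tokens=64,
)
print(resp.choices[0].message.content)
[/code]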
>>103275312God you're so painfully retardedDo you even listen to yourself?
The "local models are finally very good, but painfully slow" era is much more annoying than I thought it'd be
>>103275413Xe can't cuz cloud stuff lives rent free in xis head.
I'm getting extremely similar, near-deterministic outputs on every reroll with mistral large 2411 even with 1.2 temp. I have not touched any other samplers/params, it's all vanilla.Any idea how to fix it?
>>103275602are you using an old llamacpp variant you didn't update for a whileiirc there was a brief period where sampling wasn't working properly unless you were using the HF loader, so it would act deterministic with any settings
>>103275627Latest koboldcpp with sillytavern, with default settings (other than temp).I have been out for quite a while so everything is freshly downloaded.
what's the best gguf of Midnight-Miqu for 24gb? i'm using 70B-v1.5.i1-IQ4_XS (34.6GB) and it's about one word/sec on a 3090ti. can i go to one of the smaller quants (3M/3S/3X/3XXS) without making it crap out too much? or is there a better alternative? i find this model to be smart enough to keep a casual conversation going for a while
>>103275653>>103275627Also Q4_K_S from here: https://huggingface.co/bartowski/Mistral-Large-Instruct-2411-GGUF
>>103275653do you get gibberish/word salad if you crank the temp to 2.0 with all other samplers offthat's the easiest way to test if sampling is actually working or not (gibberish means it is working)
>>103275602
Frequency Penalty 0.13 and Presence Penalty 0.2 seem to be working for me. I was just cranking both up to 1/1.5 and halving them down until Largestral 2411 became less repetitive.
>>103275697Yes, I am getting word salad at 2.I guess I'll just have to find some better sampler settings.
>>103275763temp 5 topK 3
>>103275518Accelerate
>>103275518Cloud models are smaller than you'd think. Current local models are just too big to justify their performance levels
>>103276223I think that's true in some cases, but Claude Opus (which most coomers think is the best coom model) is clearly a genuine behemoth based on its slow token generation rate.
>>103276250You'll never know with cloud models>Studies show that users associate a lower token generation rate with a higher perceived intelligence of the model
>>103276361Yeah but in this case Sonnet 3.5 has been their flagship "smart" model for half a year at this point.
good night, /lmg/
>>103276557Good night, Miku and friends
>>103276557
>How could I possibly relax when my body is still humming from what just happened?
>hum
Bros, I want to know what the fuck the context is from the data poisoning source. "Shitty erotica", yeah I know, but who/what/when/why/how exactly is it used when written by a human?
>>103276727there's a large market for commissioned smut, particular among furries (who have notoriously high levels of disposable income), and a lot of 'authors' just mass-produce that slop by copying and pasting chunks together and using find-and-replace to add names and pronouns in afterwards; I'm betting a lot of that made it into datasets, along with all the commercial erotica that's probably produced in a similar mannerwe need a model trained exclusively on Ao3, ff.net, and maybe some of the quest forums (sb, sv, qq, etc)
>Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models
>Hallucinations in large language models are a widespread problem, yet the mechanisms behind whether models will hallucinate are poorly understood, limiting our ability to solve this problem. Using sparse autoencoders as an interpretability tool, we discover that a key part of these mechanisms is entity recognition, where the model detects if an entity is one it can recall facts about. Sparse autoencoders uncover meaningful directions in the representation space, these detect whether the model recognizes an entity, e.g. detecting it doesn't know about an athlete or a movie. This suggests that models can have self-knowledge: internal representations about their own capabilities. These directions are causally relevant: capable of steering the model to refuse to answer questions about known entities, or to hallucinate attributes of unknown entities when it would otherwise refuse. We demonstrate that despite the sparse autoencoders being trained on the base model, these directions have a causal effect on the chat model's refusal behavior, suggesting that chat finetuning has repurposed this existing mechanism. Furthermore, we provide an initial exploration into the mechanistic role of these directions in the model, finding that they disrupt the attention of downstream heads that typically move entity attributes to the final token.
https://arxiv.org/abs/2411.14257
>>103276557
>>1032742757b is coping will need at least a 34b
I got bored with my suno credits and made this with mostly Suno V4 (and some post)
https://voca.ro/1n6LYL5sb8GU
Dedicated to you guys. UwU
Poorfag here.
I have a laptop with a Ryzen 5, 8 GB of RAM and no dedicated GPU. I could upgrade the RAM up to 32 GB though. Is that enough to run a local model (for coom reasons) or would it be too slow to be useable?
Pic unrelated.
>none of the local models know about the "bakery" fat ass jokeDoes everyone just train on the same CommonCrawl from 2 years ago?
>>103276957I'm running Cydonia on an R5 5600 and 32GB of DDR4, get about 1t/s until getting really deep into context, would definitely recommend DDR5 if you can get it.
>>103276927kek
>>103276984Nothing wrong with that
what's a good, small and performant model (preferably uncensored)?
The age of rasperry starts now. R1-lite is only the first step.
Went back through my recent models, testing each one. I think 70b Hanami is my favorite.
>>103275161>isn't the competition a beautiful thing?yes anon, it's really beautiful, without that we wouldn't advance at all
>>103278046>R1-lite is only the first stepdo we know what will be the size of that thing? and are we sure they'll release it locally?
>>103278069And I think you're piece of shit that only came here to spam Sao's models. Go fuck yourself, asshole.
>>103278167you think something that's wrong thenI just coomed to it and wanted to share the positivity, schizo
>>103278193Go buy a fucking ad, asshole. I know you're just about to start spamming that model because you're a fucking shill.
>>103278209Specifically, I compared it to Nemotron, Magnum, Gemmasutra, and EVA-Qwen. Each of those made frequent errors which demonstrated that they didn't "understand" what was going on. Hanami, on the other hand, would write nice, long progressions of the scene that even made anatomical sense. Not that it was perfect, but I'm definitely going to keep using it for now.
Used CLIP to organize my 4chan and porn folders; the Tkinter GUI and the troubleshooting were handled by Qwen-2.5 Coder 32B. I love local models
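the core of it is just scoring each image against a handful of folder descriptions and moving it to the best match; minimal sketch using the stock OpenAI CLIP through transformers (labels, folder names and paths are made up for illustration, this isn't my actual script):
[code]
import shutil
from pathlib import Path

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

labels = ["an anime screenshot", "a reaction image meme", "a photo of computer hardware"]
folders = ["anime", "memes", "hardware"]

src = Path("unsorted")
for img_path in src.glob("*.png"):
    image = Image.open(img_path).convert("RGB")
    inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    # logits_per_image holds the image-text similarity score for each label
    best = out.logits_per_image.softmax(dim=-1).argmax().item()
    dest = Path(folders[best])
    dest.mkdir(exist_ok=True)
    shutil.move(str(img_path), str(dest / img_path.name))
[/code]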
>>103278249Made up crap that only serve as an excuse to shill because that's how you make money. When the next thread is 90% filled with your shills we're supposed to think it was organic word of mouth, right? Go fuck yourself.
>>103278268My previous favorite was Euryale. I'd also been using Magnum v2 and v3 off and on. Magnum v4 just sucks every time I try it.
>>103278280What? Do you want another excuse to keep shilling? Go ahead. Reply to this post. You're leaving money on the table if you don't.
>DRY just causes the model to intentionally misspell words so it can keep repeating them
>>103278069Going to try this model, thank for sharing :)
>>103278305Eat it up goy.
>>103278292I'm getting rich here, yeah. Also, I got noticeably fewer llama-isms. Shivers down my spine, breath hot on my ear, eyes gleaming with ____, voices barely above a whisper.
>>103278316That's good to know. Reply to this post again to tell me more about it.
>>103278159>and are we sure they'll release it locally?I mean, they stated they will, that's the 2nd most assuring thing they could do
>>103278321I'm really running out of things to say, though. Let's see... pic-related is my current system prompt. The warning part was for when Nemotron started being faggy, but I'm sure I could take it out now.
>>103278350Why am I supposed to care about your system prompt?
>>103278361It might affect model outputs? I actually haven't tested changing it with Hanami, so I can't be sure. Mostly I was just looking for things to say, which I already mentioned.
>>103278384So what you're saying is that you would rather do anything else rather than showing how the model actually writes? That's quite concerning...
>>103278394yeah, if he doesn't want to show the output that means that the model is ass, that's probably a shill
>>103278394It's a pain in the ass to show logs. I tend to use my own name, which I'd want to change. I also tend to tweak things to fit my fetishes, so a lot of the final replies aren't pure machine output (more like 95% model, 5% human).
>>103278410>schizo bamblingyeah definitely a shill
>>103278423bambling isn't a word
>>103278427That's your opinion, shill.
>>103278435kek
>>103278410This is the way to use LLMs. Stronger models will lift heavier, but in the end there's not a single one that can give you what you want perfectly. Back when I used Claude Opus I had to wrangle pretty hard too
>>103278435no, that's objectiveYou seem to have trouble thinking logically. Like tranny-tier in that words don't have meanings except in their use as rhetoric. Are you a tranny by chance?
>>103278462>You seem to have trouble thinking logically.says the shill who want us to try his model based on a "trust me bro" evidence
>>103278441You basically have to narrate at least some of the other character's actions to steer them in the right direction or simply make things make logical sense. You can do so in a hinting, indirect way sometimes and it has the intended effect.
>>103278467>who want us to try his model based on a "trust me bro" evidence This is certified /lmg/ hood classic.
>>103278468A problem I keep running into is female characters gradually becoming more aroused from an activity that does not involve genital stimulation and they eventually just magically orgasm out of nowhere.Like, no, it doesn't work that way.I have to narrate "{{char}}'s hand finds its way into her panties" or something to make the whole thing make sense.
>>103278441>I don't know how to prompt and I have to cope by writing my own outputs
>>103278503That's a sloptune issue, or prompt issue, or both desu. The current sloptune datasets are like 70% smut, most of which were generated from sex-mode jailbroken claude
>>103278515>I don't know how to cope - Right.
Holy fuck, buy an ad schizo having a meltie
>>103278497
true, I remember the L2 era with the endless finetunes, downloaded so many models it destroyed my ssd :'(
>>103278525Mistral Nemo Instruct does it.
>>103278525>>103278598Mixtral Instruct also did it.
>>103278503Also, if it's a femdom character, she'll just order me to cum while my cock is not being stimulated in any way.Like, no, it doesn't work that way.
>>103278503You have no idea how women work.
>>103278619Neither do you.
>>103278441I have like five roleplays that haven't progressed in months. I pick a model and run it through each of them to see what it says, then focus on one to autistically iterate on until I'm finished. Repeat with the next model.
>>103278619A great deal of women factually can't even orgasm WITH genital stimulation let alone without it.
Fixed my gpu crashing, we're so back... to running quanted garbage because 24gb isn't worth shit in this vram-inflated llm economy
>>103278646
it'll get better during the 5090 era, I'm surprised that Nvidia went for 32gb, that's a lot when you know how stingy they are with their vram
>>103278267>foldersaccept hydrus tags as your lord and savior
>>103278467I ignore people who post logs. It's always some 2-message garbage where the AI character says and does like ten things without any input from the player. They're totally useless as evidence of how it will perform, which speaks to the intellect of those who post them and/or want them.
>>103278658The devil is in the details. The 5090 will be like 4-5 slots and eat 600W, with maybe the potential of power limiting it to 450W without losing too much performance. All of that at $2k+ most likely.Even building a small 96GB VRAM rig with three of them is going to be a pain in the ass and scaling them beyond that will be even harder.
>>103278658Yeah but just like >>103278703 said, it'll be overpriced, power-hungry garbageI'm a student, not a consoomer
>>103278810>>103278810>>103278810
>>103269445I'm just looking at mistral large 2407 in the screencap, it almost loses to qwen 32B
>>103278687I can look for more complex concepts with CLIP, I just wish there was a bigger model (1-2B instead of the 400M OpenAI CLIP)
>>103278867Yea, qwen2.5 is really smart. Mistral large writes better though.
>>103278658
I really look forward to AMD's next cards.
If they keep putting more VRAM on them, someone is going to make them work for AI eventually.
nvidia has the advantage for now but it won't be for long.
>>103279374
>I really look forward to AMD's next cards.
anon, AMD was made just so that Nvidia wouldn't get sued for an antitrust monopoly
>>103279411
How the fuck is corporate collusion a sign that we're living in a simulation?
Fuck Twitter for giving double digit midwits a soapbox to speak on
And fuck you for posting it here and making me read it
>>103279452
keep coping, retard, AMD isn't gonna save you, its only role is to save Nvidia
>>103272363The only thing he needed to do was keep open-sourcing GPT models. That would prevent others from wasting billions on training new models and allow for improvements to the GPT models, guaranteeing a monopoly.For a jew, he is a massive ratard.
>>103279535Your idea is worse. If he open-sourced GPT-3 and GPT-4, their competitors would just take and finetune their models and provide cheaper alternative platforms since they did not have to invest in training their own models.
>>103279563this, OpenAI managed to get a monopoly for almost 2 years because they decided to keep the secret sauce to themselves, but this is now over, other companies can train their models better than then, oh well, RIP in peace bozo you won't be missed
>>103279563That's the point. His competitors would stay on GPT and wait for OpenAI to release new GPTs.Effectively murdering any competition.Today, there would be no Claude or Gemini. A few years of monopoly is nothing in the long run and they could have licensed the same way Epic licenses Unreal Engine, making billions easily without even running their models and wasting a shit ton of money on that as they do now.
>>103279601>That's the point. His competitors would stay on GPT and wait for OpenAI to release new GPTs.>Effectively murdering any competition.why? their competitors would continue the pretraining or finetune their GPT models in a way that it would beat OpenAI, doing that would even make it easier for them
>>103279615>why? their competitors would continue the pretraining or finetune their GPT models in a way that it would beat OpenAI, doing that would even make it easier for themAnd? They would be forced to open-source their models and pay money to OpenAI after x amount of revenue.OpenAI's massive losses don't come from training models, they come from running their models.
>>103279631>They would be forced to open-source their models and pay money to OpenAI after x amount of revenue.If they make the license too restrictive, it's the same as keeping them close source. Their competitors will be forced to train their own models. All open-sourcing them would do is make us happy and make it easier for their competition to catch-up because they can just look at what OpenAI did in their latest models and use the same techniques themselves.>OpenAI's massive losses don't come from training models, they come from running their models.Bullshit.
>>103279652>If they make the license too restrictive, it's the same as keeping them close sourceThere’s nothing restrictive about requiring people to pay after a certain point. Companies would gladly spend tens of millions of dollars on OpenAI’s GPTs rather than billions to train and operate their own models, which would cost even more.Why do you think Microsoft or Apple aren’t spending billions to develop their own models? It’s because they essentially own OpenAI’s models. However, if competition overtakes OpenAI, they could easily turn to Claude or Google instead, and that would be the end of OAI.Dominance over the market should always come first.
>>103279740>Why do you think Microsoft or Apple aren’t spending billions to develop their own models?>>103268360>It’s because they essentially own OpenAI’s models.Microsoft* essentially owns OpenAI's models. Apple had to rely on OpenAI because they had nothing of their own. They recognize this is a problem, and are planning to train their own by next year.Which is what everyone would do if OpenAI licensed their model weights to everyone with fees for corporate usage.They would just be giving their competition a stop-gap until they had their own models ready.
>>103276927heh
>>103279803>They recognize this is a problem, and are planning to train their own by next year.They already trained smaller models and they performed terribly. It will take a few years before they reach anything similar to the current level of OAI. They don't even have the infrastructure for it.At best, they will use upcoming llamas, and at worst continue using OAI for some time and then switch it.
>>103279535
There's a certain irony to the fact that his antics are likely in part what led to our current era of French and Chinese models and the west basically eating shit
Remember he didn't just close off the weights - he closed off the research after GPT-3 instruct too. There's a lot of shit we could have learned about much earlier than we did. Instead, he decided to burn everyone to try to get a slight lead in a race that was always going to be his to lose anyway
>>103279944I don't think he cares at this point, he won 40 billions by changing his company's structure kek