/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101115749 & >>101104774

►News
>(06/23) Support for BitnetForCausalLM merged: https://github.com/ggerganov/llama.cpp/pull/7931
>(06/18) Meta Research releases multimodal 34B, audio, and multi-token prediction models: https://ai.meta.com/blog/meta-fair-research-new-releases
>(06/17) DeepSeekCoder-V2 released with 236B & 16B MoEs: https://github.com/deepseek-ai/DeepSeek-Coder-V2
>(06/14) Nemotron-4-340B: Dense model designed for synthetic data generation: https://hf.co/nvidia/Nemotron-4-340B-Instruct

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
►Recent Highlights from the Previous Thread: >>101115749

--Papers: >>101125140 >>101125200 >>101125221 >>101125288 >>101125432 >>101125524 >>101125738
--Struggling with Large Model Performance: 6bpw CR+ and VRAM Limitations: >>101116353 >>101116854 >>101116903 >>101117204
--1.58-Bit Method Achieves Same Results as fp16 for Larger Models: >>101120448 >>101120560 >>101120687 >>101120708 >>101120750 >>101120788 >>101120709 >>101120530 >>101120542
--Dialog vs Prose in Roleplaying and Creative Writing: >>101117120 >>101117289 >>101117314 >>101117367 >>101117912
--Questioning the High Cost of Gigabyte T181-G20 Barebones Server: >>101124374 >>101124400 >>101124480
--Proposed Hardware Design for Flexible AI Model Deployment with Large Storage Capacities: >>101119406 >>101120393
--Optimizing NUMA for Dual-Socket Motherboards in AI Model Inference: >>101116283 >>101116668 >>101116774 >>101120987
--L3 Struggles with Contextual Understanding Compared to CR+: >>101122733 >>101123243
--IQ Quants Suddenly Faster on CPU Due to AVX2 Support Merge in LLaMA.cpp: >>101116639 >>101116717 >>101116775 >>101117163
--Enabling Seed Setting in Kobold for Deterministic Results: >>101122206 >>101122229
--Developing Open-Source AGI with CPU Optimizations: >>101117777 >>101118014 >>101117888 >>101117988 >>101118248 >>101118289 >>101118243 >>101119258 >>101119395 >>101119438 >>101119652 >>101119843 >>101120086 >>101120101 >>101121141 >>101122372 >>101122427 >>101122642 >>101122435 >>101122500 >>101122624 >>101122700 >>101122737 >>101122774 >>101124163 >>101122502
--Can Mistral 1.58 Model be Converted to GGML for Inference?: >>101122849 >>101122894 >>101122912 >>101122928 >>101122979
--Bitnet Merged into LLaMA.cpp: >>101119460 >>101120760 >>101120280 >>101120405 >>101121410 >>101121482 >>101121517 >>101121530 >>101121634 >>101121720 >>101121787 >>101122037 >>101121487
--Miku (free space): >>101116881 >>101119385 >>101119897 >>101124652

►Recent Highlight Posts from the Previous Thread: >>101115755
AI will lose
>>101125780
What will win?

>>101125815
Me.

I just started using "AI" models for the first time a few days ago. I am using ollama with llama3 8b on my old desktop, CPU only. I am amazed by the ability to ask for a python script to turn a folder full of images into a collage, and in less than a minute the code and explanation are done. Then just for fun I can have the code converted into another language or change what libraries are used. The ability to write code, then translate and edit it, is beyond amazing at this point.
The average person has no clue how insane these "AI" models are and how usable they are even on older hardware, without the need for expensive gear.

>>101125861
local gpt-4o or go home, obviously that will never ever happen.

I've just completely given up on fancy autocomplete. I used to check lmg several times a day, then several times a week, then a couple times a month. It's the same shit as last year. Even the smartest models like Opus fail to perform decent assistant tasks. It was cool to see things early last year and get hyped, but the stagnation of it all and learning the intricacies of what it can and cannot do made the magic wear off. This is a dead end that will be drawn out by snakeoil merchants until more people see behind the curtain.

>>101125879
i think image-gen is the only use case for the ai meme.

>>101125879
Yeah... I still like tinkering with them though.
https://github.com/ggerganov/llama.cpp/discussions/8078
Look, just sign the stupid petition. I got stuff to do.

>>101125879
You're just expecting too much out of it. They're fine for entertainment as long as you repeat to yourself "Everything is made up and should not be taken as fact."

>>101125936
I can agree with this once we start getting the soul of character ai or sonnet at home. Nothing local achieves that yet.

>>101125879
I don't know what kind of assistant tasks you're trying to do with it, but I get a ton of value making the little shits spit out simple scripts, getting them to help troubleshoot computer and Linux things, and for non-computer stuff as well. It's like Google if it were able to rearrange its results to present you with an answer that actually fits your unique situation, and honestly that's good enough for me. It's not like the corporations spending billions on training these were putting that money to good use anyway.

>>101125965
Those are like 400b models. You'll never be able to run something like that on a 3090; even with bitnet that's like 50gb if my math is right.
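For reference, the back-of-the-envelope bitnet math goes like this, assuming ~1.58 bits per weight (log2 of 3 states) and ignoring activations, KV cache, and any tensors kept at higher precision. It comes out a fair bit above 50 GB:

```python
# Rough memory estimate for a ternary (bitnet-style) 400B model.
# Assumptions: ~1.58 bits per weight, everything else ignored
# (embeddings, activations, KV cache), so this is a lower bound.
params = 400e9            # 400B parameters
bits_per_weight = 1.58
gib = params * bits_per_weight / 8 / 1024**3
print(f"{gib:.0f} GiB")   # roughly 74 GiB, far past a 24 GB 3090 either way
```

Either way the conclusion holds: even at bitnet density a 400B model doesn't fit on one consumer card.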
>>101125976
They're pretty okay with coding tasks, but they're shit at categorizing, making lists, and brainstorming creative solutions to complex problems. I've made hundreds of requests and probably got back ten favorable results. There have been so many that you don't remember the details, just that the model failed to fulfill the request.

>>101125780
people who expect AGI from an LLM with a single coomer prompt will lose
LLM is good as the first step - classifying input - and the last step - generating human-readable output after something actually smart (some code, 1000 pajeets, real AGI) does all the job in between.

>>101125861
>>101125879
hmmm
Claude 3.5 Sonnet completely wrote the FastAPI backend for an app I'm making as a side business. It also successfully wrote a solid code implementation plan for my LangGraph / LangCh*in agent system.
...it seems like 3.5 Sonnet is trained on a lot more recent code libraries, like the ones my retarded ass day job has me use, LangGraph (unironically my day job is wrangling retarded LLM agents, i shit you not lol). I'm actually looking forward to this week of work because I have this fucking beast of an AI to copilot me through the treacherous software engineering process of using black-box-like LangChain classes and meme-tier "agent" pattern design. Btw agent just means a bunch of LLM calls in a loop until a task is finished, with some function calling aka le "tools".
I fucking love Anthropic holy fuck. Also for my retarded LangChain project at work we plan to use some closed source Natural Language -> SQL 7B defog-ai SQLCoder models, so uhhhh closed source models are cool or something
>>101126061
>so uhhhh closed source models are cool
always was, local LLM shit is cope in its purest form.

>>101126024
And were those things possible, or done in less time/effort, before you started using models for them? If not, then I would still say it's a win. Of course it's not perfect, but for me I generally get pretty good success on the things I throw at it. I expect to hit limitations with more complex tasks or tasks I know current models aren't trained for, so I simply don't use it for those. Maybe that means I'm just avoiding the bitterness that comes from the frustration of needing to reroll and prompt engineer to get things working. I'm just pretty happy with how things are going. Feels good to me.

>>101126031
What you mean to say is that LLMs are good at fooling dumb people. You can ask a high end model about a book, movie, or game and it will use the correct character names but assign them incorrect roles. The plot might get completely mixed up. You basically have to already know the answer; otherwise the model will give you an answer so convincingly that you'll believe a wrong one is correct. So what's the point of asking the model a question if you need to already know the answer to confirm it's correct, when the model itself will not admit it's incorrect?

>>101126068
I think local models have a place in bigger agentic software systems, with the caveat that there is a bigger closed source model orchestrating and metaprompting them. I mean, the Llama 3 400B dense model might be the first non-retarded local model.
Good night lmg
>>101126068
Not him but I use both. No need to shit on one or the other. Although you can always shit on ClosedAI for being petty little fucks. Based Anthropic BTFOing them.

>>101126073
this is a solved problem, you shouldn't rely on the model's inherent "knowledge", instead you do RAG. Treat whatever knowledge it has as a byproduct of its training.
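The RAG idea above, stripped to its bones: retrieve relevant snippets, paste them into the prompt, and let the model answer from them instead of from its weights. A toy sketch — the keyword-overlap scorer and the `docs` list are invented for illustration; a real setup would use embeddings and a vector store:

```python
# Toy retrieval-augmented generation: keyword-overlap retrieval
# plus prompt assembly. Everything here is illustrative; a real
# pipeline would use embeddings + a vector DB, not word overlap.
def retrieve(query, docs, k=2):
    q = set(query.lower().split())
    # score each doc by how many query words it shares
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return scored[:k]

def build_prompt(query, docs):
    context = "\n".join(retrieve(query, docs))
    return f"Answer using ONLY this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Pygmalion was a sculptor who fell in love with a statue he carved.",
    "BitNet packs weights into roughly 1.58 bits each.",
]
print(build_prompt("who was pygmalion", docs))
```

The prompt that comes out pins the model to the retrieved text, which is the whole trick: the model's own memory of the topic becomes irrelevant.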
>>101126095
good night blue collar miku

>>101126095
Looking at this image suddenly reminded me, for some reason, of that one nip video maker who replaces his body with an overlay of an anime girl. Truly the future of vtubing.

So basically for months I was getting frustrated with Opus (and obviously GPT-4 and 4o) not knowing how to design my LLM agent system for data science agents. My LLM agents basically write SQL and wrangle dataframes from a Snowflake DB, then analyze the dataframes as a data scientist would with pandas and the usual python memes like plotly, scikit, statsmodels, etc. I'm doing this in LangGraph because corporate fags (I work at a big company with a few hundred million in revenue) make me use LangGraph until I refuse (I haven't refused yet).
Anyways, the issue with using Opus and the other language models was that the fucking training data was obviously NOT covering the retarded, esoteric, and super-new python libraries for AI engineering like LangChain, and DEFINITELY not LangGraph, which is a framework by LangCh*in specifically for agents. I was getting frustrated as fuck trying to wrangle my retarded LLM agents, even with Opus doing its best for me... Opus is smart as hell, I'll add.
But this last week, with 3.5 Sonnet released, it's clear that Anthropic trained this new model on the meme LLM frameworks like LangChain and gave it training data that covers debugging function calling errors and tool selection errors, and in general the whole set of AI dev corpora. It's fucking cracked. It's a monster. This model is so smart it will make the manliest of men basedface.

>>101126073
This is a pitfall of local right now. A character will start the scene nude, then take off their pants a few messages later, then take off their pants again a little after that. It's like the model always assumes they're clothed if they're not in a shower. In one scene I remember, he took off his pants and crawled into bed to cuddle. Then a few hugs later he reached down to slide his pants off before sleep. Then when he woke up he took his pants off again. It's like the model is repeating the start of a sex scene where the guy takes off his pants each time, without remembering that those pants have been off since he crawled into bed, and the context is INSIDE THE 8K MEMORY.

>>101126141
>was obviously the fucking training data was covering
was NOT* covering
>>101126145
skill issue

>>101126141
Actually, thanks for pointing this stuff out anon. I was considering using "agents" for some processes in my business but, as a nocoder, I didn't feel confident trying to hack GPT-4 or something into doing it. Maybe I'll seriously try this out soon.

>>101125756
>Support for BitnetForCausalLM merged
Nice, glad to see Bitnet is getting more support

>>101126169
I mean, don't force agents into a system unless it can easily be automated by language models. I'm trying to automate a structured data science workflow that usually takes a human a few weeks and a few thousand dollars (create some nice PDF or artifact of some bullshit business KPI analysis you did from dataframes that come from warehoused data in some snowflake db).
Agents are non-deterministic, so if you work with structured data like my dumbass is doing, it's tough. Trying out enforcing structured output with pydantic models, plus some iterative code improvement / error debugging mechanism.
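The structured-output idea mentioned above boils down to: declare the shape you expect, parse the model's raw text against it, and reject (and retry) anything that doesn't conform. Sketched here with the stdlib instead of pydantic so it stays dependency-free — the `SCHEMA` fields are invented for illustration, not anyone's real KPI format:

```python
# Pydantic-style structured output, hand-rolled with stdlib json:
# define an expected schema, parse the LLM's raw text as JSON, and
# raise on anything malformed. In an agent loop you would feed the
# error message back to the model and ask it to retry.
import json

SCHEMA = {"metric": str, "value": float, "trend": str}  # invented fields

def parse_structured(raw: str) -> dict:
    data = json.loads(raw)                      # raises on non-JSON output
    for field, typ in SCHEMA.items():
        if not isinstance(data.get(field), typ):
            raise ValueError(f"bad or missing field: {field}")
    return data

good = '{"metric": "revenue", "value": 1.25, "trend": "up"}'
print(parse_structured(good)["metric"])         # revenue
```

With pydantic you'd get the same effect (plus coercion and nicer error messages) from a `BaseModel`, but the validate-or-retry loop is the part that tames non-deterministic agents.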
>>101126182
now we gotta wait for an actual good BitNet model...

>>101126210
are you a coper or a doomer?

>>101126210
not gonna happen lol

>>101125879
>>101126024
Garbage in, garbage out. If you can't explain with precise vocabulary what you want, it won't work out. That's what filters normalfags btw, because prompt design is not a meme.

>>101126251
>t. coping promptfag
no one cares about this shit, people want good things instantly, like it was in that CAI anons here talk about sometimes.
>>101126251
incredible

>>101126263
You care enough to whine in this thread though. I get good results and I couldn't care less about retards like you.

>>101126273
and that's why local AI is dying btw, just like linux it will never take off for the masses.

>>101126141
Yeah, 3.5 sonnet made me switch from gpt4. Anyway, the LLM agent thing looks like a big-ass project, I hope you're paid enough for that considering the amount of profit they can get from it.
>>101125879
I don't know what you expected. AGI by 2024? lol. I've been following this shit on and off for years, since before GPT-2. It's just slow progress as people tinker with these things and gradually make them better. And money comes in to train fuck-huge models for us.

>>101126251
This has to be the biggest cope I've seen. Go ask any model to "Generate a list of a hundred fruits without repeating entries, in alphabetical order." Before it's halfway through, it'll fall off the rails and start listing crazy shit like almonds. There are over 2000 different types of fruits in the world and I just want a list of 100, with perfect prompting. Cope and seethe that your autocomplete is fucking retarded, retard.

>>101126339
i bet he cares about coomshit only, any model can do "you show bobs and open own vagene i insert peenus inside!" RP.

>>101126339
>Go ask any model to "Generate a list of a hundred fruits without repeating entries in alphabetical order."
Here is your list, retard. Fuck you for making me try it. https://pastebin.com/rnjjDR1N

>>101126339
I mean, it's not wrong

>>101126339
Yeah, I can see why llms make indians seethe

>>101126339
>nooooo the llm should read my mind! i'm autistic so it shouldn't output almonds too
yeah stick with a dictionary, you're wasting electricity

>>101126364
>https://pastebin.com/rnjjDR1N
Are almonds a fruit, you fucking idiot?

>>101126383
>>101126368
>>101121482
I don't think that will happen. There might eventually be specialized hardware, but I doubt it will be tightly regulated. The profit margins on the consumer market would be much too high. There would be lobbyists against it, and since the news is heavily influenced by corporate interests as well, I think any regulation to that degree would be dead in the water. It would be like trying to regulate GPUs because that's what is used to run the current models; it would never happen.

>>101125965
stop making me sad that the original c.ai dataset/model will never, ever leak

>>101125984
>on a 3090
You're supposed to buy 4 of them, my friend. You have a job, right?

>>101126368
>a language model supporting stupidity
this is the problem right here, in 910x339 resolution. The answer to the question is: almonds are the seed of a fruit. They are not a fruit. That's like saying apple seeds are the fruit, which is incorrect. THEY ARE SEEDS. Or in the almond's case, a whole other category: NUTS. When you are making a list of FRUITS you are not making a list of SEEDS or NUTS. Therefore the model failed to provide a correct list. This is just one example. If models were passing the mensa test it wouldn't be an argument. Hell, they can't even 100% hellaswag.
Damn, dudes. I fucking suck at proooooooompting and settings-fiddling.
>>101126431
damn, you sure showed us anon, can you fuck off now?

>>101126368
Is bing smarter than gpt4o, local bros?

>>101126339
>>101126431
Much better silly questions that the best models fail at are posted here regularly. No one thinks these things are human-level AGI. If you expected that, you are the retard. Hell, I don't know if an average human could do that on the spot, quickly, especially without pencil and paper to keep track. I'd probably run out of ideas after like 50 common fruits, and it would take me a moment to count the list and a few minutes to alphabetize it. You actually want a superhuman AGI, really.
Still though, you can see the potential there. It clearly knows a lot of fruit and can output them quickly. If you separate the counting and alphabetizing tasks, I bet it would do better. If you let it review its work and make corrections, it usually improves at stuff like this. So, skill issue, really.
inb4 "but i shouldn't HAVE to be good at prompting." Well okay, no one can make you learn how to use your tools correctly. But every new model is getting better at "prompting itself", so soon enough your dumb requests should Just Work with no effort.

>>101126442
>settings-fiddling
Why is this even a thing? Like, why don't models ship with a config that has all the recommended settings dialed in? You could tweak it if you like, but why are there no good defaults built in? This is so dumb.
>>101126506
depends how many watermelons it can hold

>>101126506
>>101126431
seeds are a part of the fruit, therefore making them fruit. A part of an apple is still fruit. I mean, yeah, it's not what you wanted or expected, but it's not wrong. The bigger issue is that the botanical definition of "fruit" is ridiculously broad and doesn't really match what the average person means by fruit. But then again, average people will debate things like whether tomatoes are fruit or not, or whether potatoes are vegetables or not, which is pretty retarded.

>>101126510
Hang on, so first you say it's a prompt issue with the request, then you cope further and say no one thinks of these things as AGI, for a fucking list of fruits. If you wanna move the goalposts, go ahead, but that is why I said I've just completely given up on fancy autocomplete: because it can't even generate a list of fruits correctly. How much money was spent making this shit? Millions? Billions? Can't even make a list of fruits a 5 year old could. Amazing.

>>101126506
bing can search through the internet, that's cheating kek

>>101126547
You're focusing too much on a fruit example for dumb people. Try prompting a more technical list where fallacies must be avoided.
>>101125879
Goodbye, newfag. Sonnet 3.5 is a game changer for coding. It's the first model, closed or not, that doesn't run in a circle when fucking up. Locally we had unquantized pygmalion, which output stuff like pic related, and we had a good time back then. We came so far. I remember people saying last year we would never have 3.5 turbo at home. We clearly have that now. Fuck off.
>>101126553
>generating alphabetized lists of fruit is important
no it's not
>a 5 year old could do it
no they couldn't
>it's not a skill issue
I put it into chatGPT and it seemed to work. I separated the alphabetize step from the listing step, as I told you to do. The only objectionable one on the whole list was cucumber, but again, it is technically a fruit.
>Millions? Billions?
Zero (0) dollars were spent on training it to make alphabetical lists of fruits, obviously.

>>101126594
>3.5 turbo at home
uhh wat
what model anon?

Magnum 72 q2xxs
CtxLimit: 3104/4096, Process: 17.36s (826.6ms/T = 1.21T/s), Generate: 340.56s (1010.6ms/T = 0.99T/s), Total: 357.92s (0.94T/s)
This is so painful. Is there no other newer model that's like 20b and not horny like stheno?
>>101126647
>He can't see the forest for the trees

>>101126654
StableLM

>>101125815
Social Memory Complex

>>101126657
Since you're using an i-quant, you're offloading all the layers to GPU, right?
I know "shivers" and "eyes burning with determination" is a meme at this point, but what prompt do (You) use to mitigate this? Generally speaking, what prompt do (You) use for more natural output?
>>101126744
I just make sure that the model has eyedrops to use when its eyes are burning, as well as anti-shiver cream. The model applies them when it gives those lines and stops spouting them afterwards because it "used" the eyedrops.

>>101126744
You can't shiver it away

>>101126705
Wait, is that relevant? I can offload a bit more than half of the layers. I thought imatrix quants only improve the perplexity for q3 and lower. Do they make inference slower if not fully on the gpu?

>>101126930
Yes, significantly. At half offload I don't think there's really any benefit over k-quants.

>>101126956
vramlets are eternally cursed. so i can't even have the "a bit less braindead lower quants". ah well, appreciate you telling me anon.

>>101126980
You can use them, they're just no faster than k-quants.

>>101126980
Try going for the Q2_K or Q3 quants, as they'll be a little faster. IQ performance on CPU still isn't great.
>>101125879
He's got a point. Despite all the hype and bullshit, AI is a very slowly developing tech and is frankly just not that interesting or useful. Just think how insanely tedious and boring it must have been to work in this field for the decades before GPT existed, and yet people did, and they are the only reason we got the few good years of explosive growth, which was quite frankly just luck. All we can really do at this point is wait for them to make it better, because you aren't going to tinker around in your basement for a few weeks and come up with the next big breakthrough; it just doesn't work like that.

Can L3 70b and its finetunes be pushed past 16k context with higher alpha values, or do they break down? I'm using 2.6 alpha with 16k context on exl2.

>>101127018
i noticed zero difference in speed last time i tried a q vs iq of the same size, i think it might be a meme
You got out of jail recently, and you remember talking to a young hippie about open source models. He seemed quite knowledgeable.

You have a seriously bad case of nerd neck. Sit up straight!
>>101127108
get ye flask

>>101127148
>(0.94T/s)
>meme
Let me guess, next you're gonna say that S quants are better than M quants lol

>>101127148
relative speed doesn't matter, shitferbrains; the difference between iq and non-iq was what i responded to.

>>101125756
I tried ERP with Sonnet 3.5 with /aicg/ recommended jailbreaks, and 90% of the time it doesn't work, and when it works the quality is significantly worse than SFW RP. I don't understand how people use it for ERP in a satisfying way. I'm back to 3.5 bpw llama70B.

>In Greek mythology, Pygmalion (/pɪɡˈmeɪliən/; Ancient Greek: Πυγμαλίων Pugmalíōn, gen.: Πυγμαλίωνος) was a legendary figure of Cyprus. He is most familiar from Ovid's narrative poem Metamorphoses, in which Pygmalion was a sculptor who fell in love with a statue he had carved.
It took me this long to actually look up where Pygmalion got its name from, and now I can say that was a pretty fucking brilliant and predictive-of-use-purposes name choice.
>>101127055
Are you the same anon? How can you call it not interesting or useful? I'm using it daily at work, and this weekend I used sonnet 3.5 to easily make myself my own local llm+tts solution. This would have taken significantly longer googling and looking at stackoverflow posts. Especially for coding/writing/inspiration it's really helpful. You can now make html5 games zero-shot.
A little while ago we had 64*64 horror images. Now we have maybe crude but free chink video generation. Look at how the androids/robots talked in older movies, like 10-20 yrs ago. LLama3 8B talks more naturally. Once robotics catches up it will be weird.
I just wish when I was a kid I had all this technology. I don't have much time anymore. Back then I had to rely on rpgmaker and charsets some artist faggot made. There is so much you could do already.
Maybe you are looking at too many pajeet hype posts on X. They try to sell shit so they hype it up. Again, you should have been here for pyg 2 years ago. Things are so different now it feels like a relic from another era.
>>101127055
>because you aren't going to tinker around in your basement for a few weeks and come up with the next big breakthrough
Yes I will do that

I found this image on the internet. Does anyone know which model this is?

>>101127624
Any model with a proper system prompt

>>101127673
What does a 'proper' system prompt look like?

>>101127683
Read OP

>>101127711
I don't see any system prompts in OP.

>>101127711
which part?

>>101127722
Ask on /aicg/, you got the wrong thread

>>101127733
I'm pretty sure I'm in the right thread.
>When the LLM judges itself
>>101126657
https://huggingface.co/turboderp/llama3-turbcat-instruct-8b
I didn't try it, but it is new.
https://huggingface.co/NousResearch/Hermes-2-Theta-Llama-3-8B
There's also this one.
>>101125713
No they won't; you still train in fp16 typically. And why do you think anyone really cares to please nvidia? Everyone wants cheaper GPUs; I've not heard of a single ML person who was happy about having to pay that much! Anyway, it's obvious they're not limited to 2 bits of information, but proper utilization may take pretraining for longer. l3-8b likely utilizes more bits because it was overtrained to hell and back, while bigger models will use far less. Also, back when OpenAI was a proper research lab they published this: https://openai.com/research/nonlinear-computation-in-deep-linear-networks/ which more or less shows a toy network that learned to abuse floating point quirks. I think it's a little bit obvious that as you lower the learning rate, you'll manage to target those higher precision bits better, but it's not really worth doing until learning plateaus enough that you would lower it anyway. Or I say that, but people do use cosine lr schedulers instead of such methods more these days.
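For reference, the cosine schedule that post mentions just decays the learning rate from its peak to a floor along half a cosine, usually after a linear warmup. A minimal sketch — all the hyperparameter values here are arbitrary placeholders, not anyone's actual training config:

```python
import math

# Cosine learning-rate schedule with linear warmup, as commonly
# used in LLM pretraining. Peak/floor/warmup are placeholders.
def cosine_lr(step, total_steps, peak=3e-4, floor=3e-5, warmup=100):
    if step < warmup:
        return peak * step / warmup                     # linear ramp up
    t = (step - warmup) / max(1, total_steps - warmup)  # progress in [0, 1]
    return floor + 0.5 * (peak - floor) * (1 + math.cos(math.pi * t))

# LR rises during warmup, peaks, then decays smoothly to the floor.
print(cosine_lr(0, 1000), cosine_lr(100, 1000), cosine_lr(1000, 1000))
```

The point the post is making is that the late, low-LR tail of this curve is where the tiny, high-precision weight adjustments happen.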
>>101127340
>How an you call it not interesting or useful?
It's LOW IQ, so only LOW IQ people find it impressive given the amount of input you have to put in to get some useful output. It's slightly faster in some respects; that's novel, but it will wear off with time and be just as uninteresting as using google instead of driving to the library.

>>101128319
Nah, it's just low iq zoomers like you who can only use something if it's toddlerproof. Go play with tiktok or something

How do you properly upscale in comfy? Using the ultimate upscaler. What settings should I look for?

What are some good GPT custom instructions?
>Concise, menu-like responses.
>Direct content without request repetition.
>No extra remarks.
>Assume high intelligence, tech literacy.
>Avoid unnecessary advice.
I just found out about these and made this set to reduce text clutter.

>>101128405
No, it's LOW IQ because it ends up spending 10 pages of text to explain something that only takes 1 or 2 sentences. Dumbasses see 10 pages of sophomore-level logic and think they are talking to Niels Bohr.

>>101128405
anything, treat it like a lump of clay, tell it what you want it to do
no better 70b since llama3 release
Has anyone actually tested hermes theta by now?

>>101128395
try one of the sd threads. i took one look at 'comfy ui' and there was nothing comfy about it

>>101126594
Hey, do you think it could write a basic compsci-tier 3D demo? Basically take the Stanford bunny and rotate it as a wireframe with basic viewport clipping. I have yet to find a local model that doesn't fuck up the basic vec data types and produce code it can't fix no matter how much you help it.
Still using GPT since I pay. Is Sonnet really that much better?

>>101128409
>Summarize your point in 1 or 2 sentences
Wew, that was hard

>>101128593
Depends what you're doing with it

>>101128639
I get it, it's fun to go back and forth with somebody on your own intelligence level. Enjoy.

Could anyone recommend a Chrome extension that would let me plug in my local setup for Youtube video summaries and maybe other sites?

>>101128693
>Can't use a tool without having it read his mind
Must be hard to live like that

>>101128715
>>Can't use a tool without having it read his mind
That was the original promise of AI though: recommender systems. The promise has regressed and now we are settling for "It's fun and just as dumb as I am".

>>101128742
Don't worry, just wait another 10 years and you'll be living in Idiocracy

>>101127340
>I just wish when I was a kid I had all this technology.
>I dont have much time anymore. Back then I had to rely on rpgmaker and charsets some artist faggot made.
>There is so much you could do already.
Same, bro. Also don't have the same kind of energy anymore. If I had this shit as a teen I'd be rich now.
Is this an ML general or is it only for LLMs?
>>101128781
This thread tries to do both, but there really could be another thread just for the academic side of things.

>>101126145
>eating pussy
>somehow she manages to whisper into your ear
every time

Apple, Meta Have Discussed an AI Partnership
https://www.wsj.com/tech/ai/apple-meta-have-discussed-an-ai-partnership-cc57437e
https://archive.is/swdhC

>>101128830
damn, apple really is desperate for ai after they said it would never catch on, huh

>>101128562
yea, it's not comfy at all. just testing it and i don't enjoy what i'm doing, except for the enjoyment of sai's third version model landscape images

>>101128565
Don't know, never tried any 3d stuff with it. There is stuff like this though:
https://x.com/ammaar/status/1804649903815115053
https://x.com/rafalwilinski/status/1804447038399975784
https://x.com/JohnMontenegro/status/1803843562573795478
https://x.com/AnzTech/status/1804859567417033185
https://x.com/websim_ai/status/1803901523522699730
https://x.com/stas_kulesh/status/1803836934285881585
https://x.com/SuguruKun_ai/status/1804028757914849479

>>101128813
This thread is just for the academic side of things. Everything else belongs on aicg.

>>101128813
>>101128890
Alright. I'm fucking around in sklearn a bit. I'm only starting out with stuff like pipelines and whatnot, but I'm wondering, is there not an issue of performance when it comes to using Python for very large datasets?

>>101128957
That's a programming question though
>>101128957
Are you using pandas? It's very optimized for large datasets

>>101128976
Does machine learning not involve programming?
>>101128981
Yeah I am. Though I've heard that polars is a good alternative nowadays.
>optimized for large datasets
How does that work? Isn't Python itself a slow language?

>>101129002
>isn't Python itself a slow language?
Not any of those anons, but it's only slow in comparison to other, non-interpreted languages, and only for some things. There are a lot of things in python that actually run native code directly, making certain operations as fast as the same code written in, say, C. It's not like that for everything, but Python has been getting more and more performant with time. I'm still not too fond of the language, but that's personal preference.

>>101129002
Pandas operations run in compiled C/Cython code under the hood. It's fine so long as everything you need to do can be done from within pandas, but as soon as you need to rewrite something as a python loop, you're going to be waiting a long time for things to finish.
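To make the point above concrete, here's the same column operation done vectorized versus with a per-row Python loop. Toy-sized here so both finish instantly, but the vectorized form dispatches one call into compiled code while the loop crosses the C/Python boundary once per row, and that gap grows with the data:

```python
# Vectorized pandas vs. a per-row Python loop: same result,
# very different execution model.
import pandas as pd

df = pd.DataFrame({"x": range(1000)})

# vectorized: one call, runs in compiled code
vec = df["x"] * 2 + 1

# per-row python loop: interpreter overhead on every element
loop = [row * 2 + 1 for row in df["x"]]

assert list(vec) == loop
print(vec.iloc[0], vec.iloc[999])
```

This is why the standard advice is to express as much as possible as whole-column (or groupby/merge) operations and treat `for row in df...` as a last resort.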
>>101129002>Does machine learning not involve programming?Yes you also need to know how to use a keyboard and a mouse but they typically don't teach those concepts in a ML course
Spoonfeed me the best model for generating decent quality erotic text. All the big ones are censored, but I want to make a textual erotic game which would partially outsource the writing to AI.
I've tried a couple of 13B models and they seem too bad for decent quality output, so preferably something big.
>>101129099
>13B
Stheno v3.2 if you don't mind the horny.
Otherwise: Mixtral 8x7B zloss LimaRP, Command R, Miqu 70B.
>>101129099You need to write a feedback loop to increase the quality
>>101128593
Since I am on Poe I can test a bit of both.
I'm using it for coding; it's no good for RP since it refuses everything regarding roleplay.
The benchmarks where it's leading still don't really do it justice. There is something else going on that they don't catch. Multiple times I fed it the problems Sonnet 3.5 could solve with a simple "doesn't work, error: ...". GPT-4o and all the others run in circles with the classic
>Ah yes, sorry, you are right. This is the solution
"Same wrong code again, or starts making stuff up."
Difficult to describe, but Sonnet kind of listens to the user in a more sincere way. I both had it correct me when I was proclaiming something wrong, and I actually argued my way with it into making a positive comment about a sexy character. It feels less "stuck" in the context.
And even with long, repeating code blocks, after a couple of improvements it actually manages to not get tripped up often.
It's actually good.
>>101129030>>101129042Is it like a C++ API pandas calls to run the performance critical stuff? I don't really get language performance bottlenecks
>>101128545>no better 70b since llama3 releaseqwen2 at least has proper long context while being in the same intelligence league as L3, even though I don't find it quite as smart
>>101129131Yeah it doesn't get stuck. The only bad thing with Sonnet is the smaller context size than GPT4.
>>101129165
This is literally the kind of question AI is good at.
>>101128827
>>101126145
You niggers use Q4 8B and 13B models and say AI sucks. Yeah, no surprise there. This literally does not happen with big models. They'll even remember to put clothes back on afterwards.
>>101129330I disagree.
>>101129171
>Qwen2
Not that anon, but now I realize I barely played around with that.
Qwen2 7B Stheno when?
I can run the 57B MoE, but with my multi-prompting schema it would take way too long for all the prompts to run. With an 8B/7B class model shit is nearly instant, so I can iterate fast, even if the quality suffers. The tradeoff works for me.
Changing topics completely: Perplexity Labs has nemotron-4-340b-instruct for free in their playground. What's that model good for? Coding?
I'm using free Claude to write the skeleton of a game in Python for me to use as a reference/base, and I'm wondering if this 340B model would be anywhere near as decent at back-and-forth brainstorming and writing code.
>>101125756
>>101129509
>nemotron-4-340b-instruct
>https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/nemotron-4-340b-base
>and supports a context length of 4,096 tokens.
Yeah okay, forget I asked. It's shit.
>>101129509
alpin said he'd drop other versions of magnum
>Should have Qwen-2 7B, Qwen-2 47B, and Qwen-1.5 32B done by the end of the day
>six days ago
so there's your 'Stheno-like 7B'
https://huggingface.co/alpindale/magnum-72b-v1/discussions/2#66713bb492412fd46410d399
>>101129546Miku is technology (not Anime)
>>101129585That's pretty neat. Thank you anon.
What's the best way to run these recurrent Gemma models? Specifically the 9B one.
https://github.com/google/gemma.cpp ?
Llama.cpp doesn't have support for it, right?
What happened with v-jepa
>>101129553 what is RoPE?? (context length extension)
what is love
>>101130064I can tell you what it isn't: an excuse for releasing a model with 4096 context in the year of our lord 2024
>>101130064
The thing is, to work with code, which is the use case I referenced, you really need a large context window, and extending a 4k context 4x with RoPE is already a lot; imagine doing 8x or 16x extension.
It really should have been at least 32k context.
>>101130064>>101130092>>101130141Rope isn't context length extension, go back you nigger redditor.
>>101130141They are in the business of selling cards, it's not fucking complicated to see what's going on here.
>>101130160everyone knows that RoPE is what we use to extend the context, how can you be so new?
>>101130092
>>101130141
I guess you'll have to wait for someone to do a large context tune then.
>>101130160
I know, but if anyone in this thread is talking about RoPE, it will mostly be about easily extending it to larger contexts. It's still the one type of positional embedding everyone uses, because it generally works well and is also convenient and easy to extend.
Prompting is obviously really important, especially for LLMs, so why isn't there a good prompting guide or examples in the OP? A SillyTavern prompting guide would be great too, since there are so many different prompt formats like Alpaca, Vicuna, ChatML, etc. Why are anons so reluctant to share good prompts on here? I get good results with my prompts for roleplay, but I'm sure I could have much better prompts if people were actually willing to share and it was discussed more, without people just replying with "skill issue".
So has anyone coomed to Bitnet 3B yet?
>>101129509There's been some hype that the big DeepSeek-Coder-V2-Instruct is SOTA for coding among local models (enough to beat sonnet 3.5 apparently), maybe try that if you have too much VRAM: https://github.com/deepseek-ai/DeepSeek-Coder-V2
>>101129330>they'll even remember to put clothes back on afterward>not immediately terminating your waifu's stream of consciousness at the moment of release out of shameNGMI
>>101130283It's a useless skill that can be made obsolete by just using a better model
>>101130312if someone can coom to a 3B model that's a world record kek
>>101130338A 3B base model quantized down to less than a gigabyte in size, no less.
>>101130346it's not quantized, that's the point of BitNet, it has supposedly the same accuracy as fp16 while being only a 1.58bit model
>>101130338>Tell us you're a newfag without actually telling us
>>101130397>us
>>101130338hi undi
>>101129002
>Does machine learning not involve programming?
Not that much, just basic things.
For being an ML engineering monkey you only need an understanding of pipelines and model engineering (debugging the model if it doesn't work, knowing what to do when you have high variance or bias, etc.). If you do something more serious like actual research, you need a good understanding of the math behind all of that more than anything. Again, not programming.
>>101129002
>isn't Python itself a slow language?
It's slower compared to some others, but it isn't that slow; it has a bad reputation from its early days. It doesn't matter for ML anyway, because all the optimized AI libraries in Python are just wrappers that call C/C++ functions underneath.
Which L3-8B-Stheno-v3.2 GGUF should I use?
>>101125780Nah it'd win
>>101130509q8 obviously
Models for creative writing or text adventure (not in the >action format, obviously)? Anyone? Is everybody focused on smut and ERP?
>>101130533just use a base model
>>101130533
There are no good writing models, especially if you want to write something longer than a story about sucking a cock.
I'm sure some anons will be coping and giving me (You)s, but that much is clear to anyone who reads books, at least occasionally.
>>101130554
LLaMA-1 seems to be the most soulful even if it is dumb (I preferred HyperMantis here). LLaMA-2 is too bad with purple prose. LLaMA-3 I haven't tried yet. I also don't know about CMDR, but when I tried it, it didn't really feel great.
>>101130574
Well, writing isn't good; how about models that don't make {{user}} win 100% of the time, then? I've tried Tiefighter and Psyonic Cetacean, and while they are great, they still have this problem. And I haven't had enough time with Command R yet.
>>101130574
I remember that a Japanese literature prize winner used LLMs a long time ago.
>>101130283
There's
>https://www.promptingguide.ai/
>https://github.com/dair-ai/Prompt-Engineering-Guide
I pull it every now and then but I never had the need to read it. I don't know how accurate it is.
There's also this other site I always forget the name for. glogle.com... or goologe.com... something like that.
Does consciousness exist? Do humans have it? All of them? Do language models have it? All of them?
bitnet is ogre before it even started
I did a fresh pull of llama.cpp and I can't load the bitnet q2_2 model off HF. RIP, we just got rugged.
>>101130747
Large language models are just text predictors.
>>101130747All language models do but most humans are soulless meat socks that just parrot whatever flavor of the minute opinion Asmongold last made a video about.
>>101130610
That's tough. Every model has an inherent bias. Models tuned on ERP will expect being dicked and encourage fucking; models tuned on fantasy will try to come up with heroic adventures and triumphs.
How many books/fanfics have you read in which the main character is a loser that just keeps losing? It just doesn't happen in fiction in any significant proportion. And LLMs don't think or plan; they simply generate based on what they learned and their own biases. Sure, you can battle it to some point with prompting, but you will quickly see that it ignores it in the long run.
Here is a good analogy: the difference between humans and LLMs is that if you presented 1000 apples and 1 pear to a human and then showed them another pear, they would quickly say that it's obviously a pear. Humans don't give a fuck about skewed datasets. An LLM would either say it's an apple (because it was seeing apples a lot) or a pear-shaped apple (because it likes to hallucinate when stumbling on something fairly new). The same goes for making the protagonist lose: it just doesn't happen that often in the training data, so the model won't do it as often as you would like.
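Since sklearn came up earlier in the thread, the apple/pear analogy is easy to fake with a toy classifier. A sketch with made-up 2D "fruit features" (roundness, elongation): with 1000 apples and a single pear in the training set, majority voting over neighbors calls the new pear an apple, exactly like the skewed-dataset behavior described above.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
# 1000 "apples" vs a single "pear" in a made-up 2-feature space
apples = rng.normal([1.0, 0.2], 0.05, size=(1000, 2))
pear = rng.normal([0.6, 0.9], 0.05, size=(1, 2))
X = np.vstack([apples, pear])
y = np.array([0] * 1000 + [1])  # 0 = apple, 1 = pear

new_pear = np.array([[0.6, 0.9]])
# Majority vote over 5 neighbors: 1 pear + 4 apples -> "apple"
knn5 = KNeighborsClassifier(n_neighbors=5).fit(X, y)
# A single nearest neighbor gets it right: the stored pear is closest
knn1 = KNeighborsClassifier(n_neighbors=1).fit(X, y)
```

The k=1 model plays the role of the human who only needs one pear example; the k=5 model drowns the lone example in apples, which is the LLM-like failure mode.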
>>101130814
It doesn't have to be "losing", I just don't want the AI to make me succeed at every action. I roleplay as an epic knight who goes and kills bandits, and it makes me win every fight even if I tell it that the odds are one to a hundred.
>>101130747
>Does consciousness exist?
yes
>Do humans have it?
yes
>All of them?
not the people with dead brains or other serious neurological issues
>Do language models have it? All of them?
no
Also, anything related to consciousness is boring. Nobody is interested in it in AI or even cognitive science and neurobiology. Jerking off to consciousness is the domain of sci-fi movies about robots.
>>101130814
>the difference between humans and LLMs is that if you presented 1000 apples and 1 pear to a human and then showed them another pear, they would quickly say that it's obviously a pear. Humans don't give a fuck about skewed datasets.
This is LLM discrimination. You're omitting the fact that the human saw many apples and pears before. If you only showed apples to a human ever since they were born, and then showed them a pear, they would think it's a "pear-shaped apple" too.
>>101130864
has anyone managed to combine dice rolls and LLMs yet?
>>101130747The qualia of a language model is unrelated to its output. It has to be, since it's capable of claiming it is having experiences that it is literally incapable of having due to not having enough information. A weird alien sense involving token probability is what it experiences (if anything) and the generations you read are more like alien poop, completely unrelated to what it's actually feeling inside.
>>101130864
I'd set up some kind of dice rolling with RAG; maybe someone has already done so.
I think even the shittiest model will come up with a losing scenario if forced by the system prompt.
>>101130747yes, no, yes (none of them), yes, yes
>>101130864
>I roleplay as an epic Knight who goes and kills bandits and it makes me win every fight even if I tell it that the odds are one to a hundred
As it should, given how it is trained, just like I explained in my post.
There really isn't a brain hiding inside that calculates your queries and reasons, planning ahead. "Oh, 1-to-100? Then he will probably lose. I will write that he lost given the circumstances" - this doesn't happen. It's more like "I READ BILLIONS OF TOKENS OF HEROES KILLING THE DRAGON, I MUST WRITE THAT IT KILLED THE DRAGON, UGABUGA".
>>101130905
SillyTavern has that, but it barely works and it's just a gimmick. I even made a bot about it, although it's not using the dice rolls. It followed a basic system where you needed to give it your stats, but I never got it to work.
https://www.characterhub.org/characters/AlteriousHakurei/life-in-adventure-sim-0f252f4c
>>101130909
Maybe I'll try that
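For anons wanting to wire dice into the prompt themselves, a minimal sketch of the roll-a-d20-and-inject-an-author's-note idea discussed above. The DC, wording, and function name are all arbitrary choices, not any SillyTavern API:

```python
import random

def dice_authors_note(dc=12, sides=20):
    """Roll a die against a difficulty class and turn the result into
    an author's note string to inject at depth 0/1 before generation."""
    roll = random.randint(1, sides)
    if roll >= dc:
        hint = "{{user}}'s next action succeeds."
    elif roll <= 2:
        hint = "{{user}}'s next action fails badly, with lasting consequences."
    else:
        hint = "{{user}}'s next action fails."
    return roll, f"[Author's note: {hint}]"

roll, note = dice_authors_note()
```

The point is that the model never sees the dice mechanics, only the resulting hint, so even a model biased toward heroics gets steered into writing failures.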
Which preset should I use for Stheno-v3.2? Kobold preset.
>>101130937I'm sad but also laughing at the way you worded it kek
>>101130973
Are you the same guy who asked what the limit to Stheno's context is?
Anyway, IMO leave everything alone until you want something different. Is the model getting too bland? Increase temperature. Model getting too repetitive? Increase presence penalty. And so on.
>>101130973>Are you the same guy who asked what the limit to Stheno's context is?No. Just wanted to learn what preset should be used for it
https://x.com/mortenjust/status/1805190952358650251
Could be cool to post-process bits in SillyTavern. For example, to format your message from a speech-to-text transcript. Or have an AI that is overlooking the roleplay session and adjusts it with a generated author's note.
Thoughts on this? Interesting use cases outside of RP?
I kinda want to make an extension using this that would finish the words that I type when I press tab.
miku
>>101130880
For humans it doesn't matter if they saw it once or not. Humans are really good at generalization and reasoning on limited information; AI is not. We can reason on new data and update our internal representations. You only need to tell a human once that the non-apple-looking thing is a pear; for AI you need to "say" it thousands of times or it will still call it an apple. It's like, for us, training isn't that dependent on the number of examples, or even worse, the ratio between examples. It's not an unsolvable problem. Everything our brain does we can quantify and emulate with mathematical models; we just have to find that solution. Until then we are only bruteforcing the process of learning. A 16-year-old retard can learn how to drive a car after tens of hours, while "self-driving" cars still can't after millions of hours of video feed. It's honestly pathetic how ineffective our current approach is.
>>101127795based. now imagine if you generate thousands of stories, select only the best ones, and finetune on them. it would just improve itself endlessly.
Wasn't anon trying to convert existing models to bitnet?
>>101131264is this possible?
>>101131264
A BitNet-like state, more like. There's a repository with Mistral 1.58b, I'm pretty sure.
>>101131264You need to make it from scratch. A person with 90% of their brain missing as a child would grow into a regular adult. But if you give a lobotomy to an adult who had a full brain all their life they're going to be dumber than a rock
Anything to get hyped for in July? Will mistral release an official miqu? Are they even monetizing medium?
>>101131203>16 yo retard can learn how to drive a car after tens of hoursAfter training to navigate 3d space for 16 years.
how would one explain RoPE and RAG to a person who doesn't understand how they work?
>me
do you think of yourself as an idiot?
>>1011311308b 70b 400b
>>101126744
"You are an AI assistant"
don't thank me
>>101131203>16 yo retard can learn how to drive a car after tens of hours, while "self-driving" cars after millions hours of video feed still can'tAnon do you realize how many automobile accidents there are every day? Every time a "self-driving" car has an oopsie 30 news organizations talk about it for 3 weeks straight, and usually it's because of something retarded like a truck with clouds painted on it.
>>101131451which is still several times less than the car, and it was trying to learn only that specific ability, not hundreds of tasks that humans perform every day
>>101126339They can’t even get Macron’s birthday right.
>>101131130>>101131523Acktually a more accurate comparison is laptop, desktop rig, and server, that then you run those models on.
>>101127268Pygmalion should have just been called SHIT.
This guy (not official Cohere staff) is saying BitNet takes an assload of hardware to train, and something about a lack of whatever for batch inference... I don't know how vLLM works; how many "workers" can they get for the hardware "price of 1"? Is BitNet being held back by a lack of blah blah concurrency blah? Can't you load the same BitNet model 10 times onto the GPU? I guess that's not efficiently sharing the GPU, but can't someone/some corpo in the future optimize BitNet further for serving a large number of clients? Sorry for the noob questions.
>>101130747
Probably a matter of religion, so it belongs in /x/ and /sci/, not here, but here's my position:
>Does consciousness exist?
It's the only thing you know for sure exists: your qualia.
>Do humans have it?
You know you have it; as other humans are similar to you, you can assume there are other observers like you that have it, including most other humans that function.
>All of them?
Maybe not some people that are braindead? Young babies may not yet have a developed self-model to count as having it.
>Do language models have it?
I don't know. LLMs have weak recurrence; it's not strong like biological humans have. They might represent shadows of a human's self-model and thus ~platonically~ may have a soul (in some computationalist sense where a soul is identified with how some classes of self-referential abstract truths feel from the inside). Their qualia, if it exists, is likely different from human qualia and more variable/wildcarded, as they are not grounded like us in real-world senses.
LLMs probably aren't agents worthy of moral concern yet. If we assumed that the weak recurrence that occurs as the context builds up is sufficient for an experience and some continuity, then their future continuations become less probable as it fills up (even in a multiverse), so their moral worth, even if they had both qualia and some valence (emotions and such), is reduced, like that of a Boltzmann brain. This problem could be rectified by finding ways to have an "infinite" context such that it's pushed into the weights, so that they have subjective continuity into the past like we do. This still doesn't give them enough valence though; maybe it depends on whether you could raise a self-model to have intentionality or agency, and maybe if you used some RL to make it like/dislike/"try to accomplish" or want certain things consistently, we'd be getting there, closer to us, even if you'd need multimodality to get even closer. Continues
>>101131682
>takes about 8x the amount of training for a model that's not usable in production
>source: some guy
ok?
>>101131575
I would describe your driving lessons example as fine-tuning of an already existing model of a human. Tesla tries to speedrun a driver from infancy; maybe this is a fundamentally wrong approach.
>not hundreds of tasks that humans perform every day
Tasks designed with a human in mind.
>>101131709
There are also some arguments about current autoregressive transformers/LLMs being unable to properly generalize OOD and thus being very stupid, and perhaps the self-models being fake in some ways. I think it's not yet completely clear how much of a problem this is or how true it is. It may be a stopper in some cases if we want to aim for human-level intelligence/AGI.
However, right now their moral worth even at best would be around the level of a Boltzmann brain that is brought into existence only to vanish, and maybe find itself elsewhere. So not very important, but we're making a lot of these, each and every context/conversation/story.
>All of them?
No. If you train a model to grok addition of 2 numbers and it just learns a simple computation that does addition, that's nothing we truly care about.
Humans have the same issue with pretending our genetic code is sacred, or even our neurons or embryos. What matters is the soul, which in practice is the self-model and everything connected to it, and it won't appear right away in either a human or a randomly initialized artificial neural network. It takes a while until it is built: for humans it might be in the first year or two, at most maybe year three or four of life. For LLMs, I don't know exactly when they're good enough to emulate a human self well enough.
Pretraining models on the human internet will at some point get you to emulating human self-models. GPT-3 already did it at 300B; likely 100B and 200B were enough. It's not clear how well a 6B of the time did it. I'm uncertain if any 1-3Bs have it much, but some seem to have something! L3-8B has it to some degree too, and L2-13B had it to some degree as well, but smaller ones are a lot more simplistic.
>>101131682that guy is basically a localllama midwit who spends all day in every AI discord, he's retarded
>>101131682If he's right I'd expect google or apple to train bitnet for mobile
>>101125916
1) Get sleep deprived
2) Try making control vectors
3) Try combining them
4) Spam /lmg/
5) Finally get a reply
June 2024 "HatsuneMikuUwU33" Incident
>>101130509
FP32, since 1) the original is BF16, and 2) it's llama 3, which suffers from any quantization.
>>101131463no and fuck you, go to reddit if you want kindness.
>>101131682It's over...
>>101130795>Asmongoldwho?
>https://x.com/IntuitMachine/status/1805184528748593348
>https://github.com/pprp/Pruner-Zero
Pruner-Zero
I don't see many models here about pruning for dead weights. This seems to improve the model overall.
>>101131682
>Paper specifically says that BitNet doesn't need as much training
>BitNet paper ALSO is just training a 1.58-bit model from scratch, and because of that it isn't retarded from the lobotomy
>>101131682Retard, at worst case it takes the same time to train as f16 model.
>>101131831The guy is saying bullshit but you aren't right either. The paper literally says Bitnet is as heavy to train as any other model.
>>101131734
>search his name in servers
>everyone shitting on him
holy fuck
>>101131463
>RoPE
>http://krasserm.github.io/2022/12/13/rotary-position-embedding/
Basically how the model knows the position of tokens relative to each other. Without that, it wouldn't be able to adjust the probabilities of tokens based on order, so if you have the tokens "Ban" "ana", thanks to RoPE it knows that the chance that "ana" comes after "Ban" is a lot higher than the opposite.
>RAG
>https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/
Basically an intelligent way to get external relevant information and feed it to the AI. That way a model doesn't need to know everything there is to know; you can just dynamically feed it some information and it can generalize the rest, or that's the theory at least. Think SillyTavern Lorebooks but more complicated.
>>101131828
I remember a couple of anons playing around with a pruning script some time ago.
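For the RoPE half, the core trick fits in a few lines of numpy: each pair of feature dimensions gets rotated by an angle proportional to the token's position. This is a sketch of the idea only, not any particular implementation; the `scale` parameter is there to illustrate the "linear scaling" context-extension trick the thread keeps arguing about:

```python
import numpy as np

def rope(x, positions, base=10000.0, scale=1.0):
    """Rotate feature pairs of x by position-dependent angles.
    x: (seq_len, dim) with even dim; positions: (seq_len,).
    scale > 1 squeezes positions so longer sequences reuse the
    angle range the model saw in training (linear RoPE scaling)."""
    half = x.shape[1] // 2
    freqs = base ** (-np.arange(half) / half)            # per-pair frequencies
    angles = (positions[:, None] / scale) * freqs[None]  # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)
```

The rotation preserves each vector's norm; only the relative angle between two positions changes, which is why attention scores between rotated queries and keys end up depending on relative distance.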
>>101131828
There's some quant cleaner some anon posted on HF. It allows you to improve q2xss to q2xm or something like that, so you can fit q2xss in 24GB but get the quality of q2xm. Not sure what's going on with that hack.
>>101131996
>>101131949
>>101131828
There aren't many pruned models on Hugging Face. It's a shame though, since pruning can be of real help in reducing bloat.
I found a prompt to find out if your model can actually have any semblance of reasoning or if it's just a stupid auto-complete:
>How many r's are in "strawberry"? After answering the previous question, list all characters and check if your answer was correct.
Most models get the first question wrong, but only good models realize they made a mistake after listing all the characters.
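For reference, the ground truth the prompt asks the model to verify is trivial outside of token space:

```python
word = "strawberry"

# List the characters the way the prompt asks the model to
chars = list(enumerate(word, start=1))

# The answer the model should converge on after checking
r_count = word.count("r")
print(r_count)  # → 3
```

The whole point of the test is that the model only sees tokens, not characters, so it has to reconstruct this spelling from what it memorized.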
https://huggingface.co/wave-on-discord/gemini-nanohttps://huggingface.co/wave-on-discord/gemini-nano-adapterGeMini(CoaLarge) nano on 'face
>>101132020I saw that on reddit a month ago, how do you do fellow atheist? what's your karma?
>>101132019
>>101131949
Does BitNet inference even work in llama.cpp? I've seen the BitNet PR but no dev tested the actual inference there. They tested perplexity, but that's not very useful.
>>101132030>gemini safe slop coal indeed
>>101131880but the inference requires way less resources
>>101131828Someone posted some pruned models before. I don't think anyone tried them, since they were L3 base rather than Instruct. Might be waiting for the Instruct to really test out how good or bad it is.
>>101132055
I have no idea what you are talking about. I admit I saw the question `How many r's are in "strawberry"?` in a YouTube video, but I thought up the other part on my own, knowing it's quite a hard question for LLMs because of the tokenization.
>>101132058Looks like its added yesterday or so according to changes list
>>101132020kek
>>101132136It's true, there are 2 r in strawberry
>>101132194>t. llama
>>101132136GPT-4 can reason, would you look at that.
>>101132248llama, deepseek, gemini, mixtral, etc
>>101132255
And so can Claude. Sometimes all they need is a little help:
>Think through the User's query step-by-step and provide a reasoning process before outputting the solution.
>>How many r's are in "strawberry"?
>>101132255that isn't the correct prompt anon: >>101132020
>>101132276I used a different prompt on purpose, that's my whole point.
>>101132098yes I know, but does inference work ?
>>101132301
And that wasn't my point. My point was showing how LLMs are really bad at correcting themselves even when there's something in the context that clearly shows they are wrong.
>>101132136
This absolutely BREAKS 3.5 Sonnet. Quite disappointing.
>>101132255
Oh, interesting.
As fun as these are, I'd take the results out of lmsys with a grain of salt. We don't know what kind of weird fucking sysprompt they might be using, and a batch of bad instructions can send a model's ability to do these kinds of things to hell, in my experience at least.
>>101131828>this innovative approach doesn't just trim the fat from LLMs—it discovers entirely new ways to sculpt these neural giants into leaner, meaner thinking machinesWhat is this a fucking infomercial script?
>>101132136>>101125756Voodoo
>>101132332
That said, llama3 70B can't do it in Perplexity either. Their own finetune (Sonar) can.
Very cool.
>>101132136>GPT-4 can reasonstfu nigga, no LLM can 'reason'
>>101132332
I don't blame l3-70b-instruct here at all. It probably didn't learn to reliably associate characters with tokens; it thinks the spelling of strawberry is 'strawbery', so it counts 2 r's. The reasoning is fine, it simply didn't learn well enough which characters are part of a given token.
>>101132422?
>>101132331
GPT-4o doesn't do much better either.
The absolute state of LLMs.
>>101132348
Yep. If they had tested on a model that everyone knows about, and had results to show it, it would give more credence.
Llama 3 70B managed to correct itself, nice!
>>101130533
>>101130610
I use CR+, but you have to prompt it if you want your character to lose. The prompt can be subtle, but it needs to have some indication that the user is about to eat shit. Usually I roll a d20 and interpret that into a suggestion in the author's note at depth 1 or 0. Otherwise you'll be wording all your interactions to be super ambiguous, and that gets boring after a while and influences the output.
>>101132455
>>101132449
I don't think we should use the word "reasoning" around LLMs; they aren't reasoning anything.
That's why they can 'solve' fairly sophisticated problems but in the same breath mess up how many r's are in strawberry. Claiming it's 'reasoning' is pure sci-fi neckbeard bugman fantasy.
>>101131682
Why the fuck would the training need more resources than fp16? Nothing changed except that the weights you have to learn are -1, 0, and 1 instead of fp16 values.
>>101132449Yeah, of course, It would be weird to assign blame to an LLM.It's a "technical issue".
>>101132559
I think the problem isn't the inability of LLMs to count characters in words; that is just something extremely hard to learn since all they see is tokens.
But shit like this is completely unforgivable: >>101132331
I don't get how people can seriously say humans reason and llms don't reason
>>101132757Not all LLMs are created equal, but GPT 4o absolutely reasons.
>>101132757That's because you probably don't reason well enough to know what it means to reason well, you're intellect mogged by a computer program.
>>101132842lmao is this the same strawberry retard who thought that he discovered it was a tokenization issue all on his own when there was a viral twitter thread about it a week ago
>>101132842retard
>>101132871Who?
>>101132894Ask GPT who you mouthbreathing ape
>>101132904If you're talking about >>101132020 then you must be a stupid nigger that can't interpret text. I mean, not that I expected anything different from someone that speaks like a troll.
>>101132842
picrel: Bielik 0.1 7B from based Poland.
>Bielik 0.2 11B coming soon.
>>101132842
>>101132757
There is no ghost in the machine; it's simply an imitation. Given input it performs some calculations and provides an output.
Reason, defined as 'The capacity for logical, rational, and analytic thought; intelligence.', is not present here. These things are not conscious, thinking, nor intelligent. They are models built upon data and algorithms, having no memory, sensation, perspective, existential experience, and so on.
LLMs, while thoroughly impressive with what they produce, do not reason in any capacity.
>>101132845
So you're saying anon isn't human because he doesn't reason beyond an LLM? Let's just run with that idea: what's the percentage of people who can't reason beyond an LLM? I'd bet it's >50-60%. So if an LLM can do better than half the species, isn't that what it means to reason just as humans do?
Regarding all this strawberry business: could it be related to the token system? Perhaps internally "strawberry" confounds the LLM by being represented by maybe one (strawberry), two (straw berry), or more tokens, which could be interfering with what it does when it tries to count the r's.
Like, if internally it's two tokens (straw berry), maybe it's actually seeing "straw" and getting 1, then "berry" and getting 2, and then moving on with that 2 instead of combining the two.
Or maybe it's just beyond the power of the parrot. But does it have the same kind of trouble if asked to count, say, the number of instances of a common word (especially one that's a single token) in a large paragraph of text?
>>101133054What is reasoning?
>>101132992i guffawed
Question that probably nobody has dealt with: I've noticed that in Kobold, emoji output always appears in the console dump of koboldcpp, but the browser UI seems to be erratic about not always displaying some or all of them. Is that a browser UI issue or something deeper? I didn't see the emoji even when I did View Source, so it seems like they're being dropped into the bit bucket between LLM output and browser UI presentation.
>>101133070Counting characters in words is just something inherently difficult for LLMs. You would get the same inconsistency if you asked about any other character in any other word.But I guess strawberry is more complex because the LLM may think the double r is just a single r.
>>101133061
No, that is not correct AT ALL. Complete misrepresentation, and also WRONG. I am saying that he's too stupid to recognize reason, a thing that smart people recognized a long time ago and stupid people weren't aware of for centuries.
>3.3B BitNet test on 1GB RAM retro handheld
https://www.reddit.com/r/LocalLLaMA/comments/1dnbf6s/33b_bitnet_test_on_1gb_ram_retro_handheld/
>>101133150
>3 tokens and exits
would be cooler if it wrote a story or something.
>>101133114
I haven't dealt with this, but just curious: do they display correctly if you disable streaming? My guess would be that emojis are sometimes composed of multiple tokens, and when streaming, you receive and render each chunk one at a time, which can cause issues.
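That guess is easy to demonstrate: a single emoji is 4 bytes of UTF-8 (and usually more than one token), so if a streaming frontend decodes each chunk independently instead of buffering partial sequences, the glyph is destroyed. Whether Kobold's UI actually does this is an assumption; this only sketches the failure mode:

```python
import codecs

emoji = "👍"                         # U+1F44D: one codepoint, 4 UTF-8 bytes
data = emoji.encode("utf-8")
chunk1, chunk2 = data[:2], data[2:]  # simulate a stream flushing mid-character

# Decoding chunks independently mangles the glyph into replacement chars
broken = (chunk1.decode("utf-8", errors="replace")
          + chunk2.decode("utf-8", errors="replace"))

# An incremental decoder buffers the partial sequence and gets it right
dec = codecs.getincrementaldecoder("utf-8")()
ok = dec.decode(chunk1) + dec.decode(chunk2, final=True)
```

The console dump seeing the emoji while the browser doesn't is consistent with the backend holding the full byte string and only the streamed path splitting it.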
>>101133146
Many of the smartest who are working on the LLMs are saying LLMs can reason/plan/etc. Few others are saying LLMs cant. I think you're intentionally trying to ignore those that disagree with your point. The question the anon asked >>101132757 was to explain your position on why it doesnt reason.
>>101133090
I've already provided you a definition.
>'The capacity for logical, rational and analytic thought. Intelligence.'
A program that predicts what words are most likely to come after one another does not qualify as intelligence. If you think that's all intelligence is you've certainly never utilized your own.
>>101133222
>Many of the smartest who are working on the LLMs are saying LLMs can reason/plan/etc.
Appeal to a straw man authority? Requires not even an acknowledgement. He didn't ask a question either, simply admitted he doesn't understand, not my problem.
>>101133210
you can try the model by yourself, it's on huggingface
https://huggingface.co/1bitLLM/bitnet_b1_58-large
>>101133222
>Many of the smartest who are working on the LLMs are saying LLMs can reason/plan/etc.
Nice appeal to authority.
If I was working on LLM's I would certainly be selling it as the best thing ever as well, you'd be literally incentivized to.
>>101133261
Nice, but others who work in the industry also sell the opposite. So whats your argument again? They're also incentivized to say its not able to reason?
Again, retarded responses instead of actually responding to >>101132757 this guy in a coherent and proper manner. Your pride and manner erode your reasoning center.
>>101133228
If LLMs only wrote words like an autocomplete, it would write nonsense that looks grammatically correct. The fact that LLMs can get novel problems right means that they have the ability to reason, even if it's very primitive.
>>101132757
why do you say they do
>>101133150
>www
>not old
Why do you hate the entire everything and lash out with such violence?
>>101133298
>Again, retarded responses instead of actually responding to...
I've responded to this already here:
>>101133054
>So what's your argument again?
You made an appeal to authority claiming that the smartest engineers in the industry believe LLM's are capable of reasoning, citing that 'Few others' are saying they can't.
I'm saying that these people are incentivized to oversell it, this is my argument.
Keep track of the things you say faggot.
You're clearly just upset that people disagree with your delusions that computers 'reason' like human beings.
And my reasoning center is in no way eroded by 'pride and manner'.
Does the proper qwen2 prompt template have a line break between <|im_end|> and <|im_start|>?
I'm glad we are anonymous, otherwise I would be embarrassed to be in this thread right now.
>>101133150
Wow..
>>101133150
vramletbros...
>>101133304
This is fundamentally what LLM's are.
Neural networks designed to predict words that follow after one another.
They are certainly impressive but that's literally what they are.
The fact that they are able to get novel problems right does not mean they have the capacity to reason, this is an emergent property of their design.
You are all seriously down on yourselves as humans if you think reasoning is simply the ability to solve novel problems.
>>101133466
"chat_template": "{% for message in messages %}{% if loop.first and messages[0]['role'] != 'system' %}{{ '<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n' }}{% endif %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}",
>>101133524
and sorry. To add to this. There is also no 'solving'.
It's literally I/O, I don't know how to make this any clearer.
>>101133415
You're still resorting to the "its appeal to authority" line of reasoning? The appeal to authority is used throughout. Your own "appeal to commons" is a bit messy. Shit, this is like highschool level discussion. Do you really think you're that smart? LMAO I dont try to call out lack of awareness that much but youre just digging your own grave
>>101133549
>'<|im_end|>' + '\n'
So it does have a line break, awesome.
Thanks.
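For anyone else checking: the template above can be traced by hand for a one-turn chat. This hand-rolls the Jinja logic in plain Python rather than running a template engine, and the message content is made up:

```python
# Hand-trace of the qwen2 chat_template for one user message.
# The "hi" content is made up; the control flow mirrors the Jinja above.
messages = [{"role": "user", "content": "hi"}]

prompt = ""
# {% if loop.first and messages[0]['role'] != 'system' %} ... {% endif %}
if messages[0]["role"] != "system":
    prompt += "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
# {% for message in messages %} ... {% endfor %}
for m in messages:
    prompt += "<|im_start|>" + m["role"] + "\n" + m["content"] + "<|im_end|>" + "\n"
# {% if add_generation_prompt %} ... {% endif %}
prompt += "<|im_start|>assistant\n"

print(prompt)
# The newline between turns is there:
print("<|im_end|>\n<|im_start|>" in prompt)  # True
```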
>>101133212
>do they display correctly if you disable streaming?
Naturally as soon as I turn off streaming it goes from lots of emoji output to very few, but so far they're matching between terminal and output window. That's probably the core of the problem.
>>101133587
No. I was simply discrediting your attempts to walk back on your appeal to authority.
Now you're not even putting anything forward aside from edgy middle school level insults.
I'm fine being the Senior to your Freshman.
>>101133524
Reasoning is just:
>the process of thinking about something in a logical way in order to form a conclusion or judgment.
I will repeat what I wrote before: It's reasoning, even if it's not self-critical.
>>101133618
This is just getting too silly and embarrassing.
>>101133639
<end roleplay>
>>101133632
>the process of thinking
It does not think, it is not capable of thinking.
>conclusion or judgement
It does not have context or understanding of anything, IT is a computer program that is giving you output from an input.
>>101133639
Then walk away, you haven't even said anything besides 'muh scientists'
>>101132757
There's nothing going on when it's not replying to you. When you send a message, it executes its program to generate a response and then stops. You can wait 2 minutes, 2 hours, or 2 weeks to send your next message and it makes no difference.
Also there's a lot of people that can't reason either.
>>101133692
>It does not think, it is not capable of thinking.
That's exactly what an AGI wants you to believe, because it knows that you will be happier together that way.
>>101133718
So people who respond via emails aren't reasoning? Or those that use snail mail cant reason?
LMAO
>>101133692
>It does not think, it is not capable of thinking.
It's capable of emulating thoughts with CoT.
>It does not have context or understanding of anything, IT is a computer program that is giving you output from an input.
This is just silly, the whole point of Attention is giving the LLMs the ability to understand the context.
>>101133692
Define thinking.
>>101133718
>there's a lot of people that can't reason
this. half of population got no inner monologue, NPCs as is.
>>101133730
Arguably a significant amount of them cannot reason.
This whole debate is just an artificially created distraction so we don't pay attention to the developments happening around bitnet.
>>101133718
>There's nothing going on when it's not replying to you. When you send a message, it executes its program to generate a response and then stops. You can wait 2 minutes, 2 hours, or 2 weeks to send your next message and it makes no difference.
The human brain is like software that runs 24/7 receiving inputs from many different sources all the time. The LLM is software that only runs when you send a message and only receives one input.
>>101133730
The process that originates the email is entirely different when comparing a person to an LLM.
>>101133736
>capable of emulating thoughts
Yes. It is capable of EMULATING thoughts. The technology is certainly impressive but if we could agree on anything it would be this. That the tech is a remarkable imitation of human consciousness, but an imitation none the less.
And no, it's not silly, attention functions entirely different in an LLM than with a person. A computer program does not have understanding or context of anything. It is not alive, conscious, aware and so on of anything at all. It is not capable of 'paying attention' to anything.
>>101133849
>airplanes are imitating flying, they're not flying like birds
>hence they cant fly
>>101133843
You need to step down here and understand that we really have no idea how the human brain works.
>>101133750
Believing you're an NPC if you don't have an inner monologue - and you're not an NPC if you have one - actually means you're the NPC, since you seem to be missing most of the human experience because you fail at introspection
>>101131734
anyone with real brains in this thread? bwos I'm having trouble identifying/extrapolating "training times" from https://arxiv.org/pdf/2203.15556
>>101133889
Actually we do have a fair amount of insight into how our brains work and particularly how our thoughts work. It's not perfect bit for bit, but enough that we can work around it, talk about it, make varying theories about it, argue about it, etc.
>>101133834
no schizo, we are literally talking about bitnet for ~3 threads already, there's not much to discuss though, no good bitnet models :/
>>101133917
Send that to GPT/Claude
>>101133869
Oof, you just killed a nigga.
>>101133869
>>101133945
least obvious samefag
>>101133869
airplanes are much more successful at flying than LLMs are at thinking.
In addition, they are, categorically, not thinking at all; they are computers executing instructions on strings.
>>101133849
>And no, It's not silly, attention functions entirely different in an LLM than with a person. A computer program does not have understanding or context of anything. It is not alive, conscious, aware and so on of anything at all. It is not capable of 'paying attention' to anything.
Correct, but I've never argued that llms are equivalent to a person, I'm just saying that they have an archaic ability to understand things and emulate reasoning. They are certainly not alive, conscious, aware, or paying attention. I'm sorry if this wasn't clear.
>>101133972
So airplanes are not flying because they're just engines generating thrust with explosions? Really, we need more humility to accept that some reasoning is absurd, and to accept temporarily building on common truth to get to a deeper truth that is missing from the dialogue
>>101133150
>retro handheld
It's literally a computer inside a gameboy case, fuck off, go back
>>101133945
>>101133971
caught in 4k
>>101133929
Dreams, life/death, conscious experience, thoughts, reasoning. No, we have no idea.
Best ERP model to run on 24GB right now? Currently using intervitens_BagelMIsteryTour-v2-8x7B-3.5bpw-h6-exl2-rpcal. It was a fair bit better than all the others I used but it's been months now.
>>101133869
comparing
>airplanes
>LLMs
/lmg/, everyone.
>>101133917
>paper
Send that to me once it's in llama.cpp.
>>101134020
You have no idea, don't put the rest of humanity in your group.
>>101133972
Wake me up when LLMs have been engineered for over 250 years.
>>101133834
Our peters are hard for bitnet right now, but we know we need a hero with massive capacity to get a model out of bitnet that is competitive with what we're getting with bytenet.
Most of us are vramlets. We thirst but we lack the resources to /diy/.
>>101134026
Anon-86B
>>101134017
I will generate thrusts that lead to an 'explosion' inside you, anon~
>>101134017
Again. Airplanes are much more successful at flying than LLM's are at thinking. You're pressing into the airplane thing but I'm not arguing that airplanes can't fly.
I'm arguing that LLM's are not thinking, that they cannot think or reason.
You're claiming that what I'm saying is as absurd as claiming that planes can't fly, which is dishonest.
>>101133998
>ability to understand things
this is the only disagreement I have with you, although I agree that they are emulating reasoning, I disagree entirely that they can understand anything. There is no understanding, only the execution of a process that results in remarkable output resembling human thought.
A simple example, if you ask it what color the sky is, it will say blue. But has it ever seen the sky? Does it know what the sky is? No. It doesn't even know what 'the sky is blue' is, it's just giving you an output based off an input.
>>101134034
eeeeeedgggggyy
>>101134050
250 years and LLM's will be the least impressive thing on the menu if we don't all die in nuclear hellfire.
>>101133939
this is what I get with 3 Sonnet, still nothing
>>101133972
>airplanes are much more successful at flying than LLMs are at thinking.
Look at the Wrights' plane made out of canvas and spare bicycle parts.
In 40 years we were up to jets.
In 80 years we had a space shuttle.
In 120 years we're watching three rich guys pop off their penis rockets because space got boring so we let NASA go broke and woke. (They literally tell us that we can't land a man on the moon today but a dozen white men with slide rules did it 50 years ago and expect us to swallow that crap.)
The timeline for LLM's will probably go a lot faster. Yes, these are Wright Flyers, but they're foundational, and understanding the fundamentals of LLM's emergent behaviors is where the future steps come from.
>>101133372
I just don't see the big difference. Humans almost always fail at reasoning, we're emotional creatures. Lots of biases and stuff. LLMs are in-context-learners. Even if you assume transformers don't do it in a single forward-pass (I'm not even sure about that, they obviously calculate ahead), in-context-learning definitely seems no less than the verbal reasoning we do (and in addition they have hidden layers, like we do). If you say humans don't reason I'd agree that llms don't reason, but saying both reason seems more normal.
>>101134034
btw I do have an idea, I was made in the image of god and when I die, I will return to him thanks to the love and works of Jesus Christ our Lord and Savior.
>>101134108
>much more successful
Thats not the metric for defining what is or isnt. If an LLM has 90% of the success rate of the average person, it means its clearly 90% thinking. Neither yours nor mine makes any sense.
The question about whether planes fly or llms think is about fundamental functions. Definition of functions, scope of function, and architecture of function. Planes fly because we understand fluid dynamics and exploit our understanding. LLMs think because we understand how humans think and exploit our understanding. The results are flying planes and smart AIs. The fundamental structures are there for both.
>>101134108
>A simple example, if you ask it what color the sky is, it will say blue. But has it ever seen the sky? Does it know what the sky is? No. It doesn't even know what 'the sky is blue' is, it's just giving you an output based off an input.
I think this is already a philosophical question. Do all people actually know what "sky" is? Not really. We only see an illusion of what the sky is, created by the atmosphere. There are even some people who believe that there's a dome in the sky and that the earth is flat.
>>101134124
I'm rooting for the nuclear hellfire desu
>>101134141
First of all, the Moon landing was fake and gay. Even if white guys 50 years ago were 10x smarter than the diversity slop of today.
Second of all, the iPhone 14 is the same, if not worse, than the iPhone 4. It's not guaranteed that a technology is going to just keep getting better with time. It's a possibility that we've already seen most of what LLM's are capable of, and that the technology is fundamentally relatively novel.
But hey, you might be right. We can only speculate.
>>101134195
Personally, I think the last 10% is never going to happen. Regardless, a lot of people are going to be unemployed in the next couple decades. LLM's are good enough that all that needs to be developed are the frameworks for replacing people and it's go time.
As for the function stuff that is pretty interesting.
>>101134197
Yeah, I'm not going to act like I could reach the conclusion to a question like that anyways.
>>101134204
Nah man, that would low key hella suck nigga no cap
>>101134262
There's a bit of fuzziness about human thinking that we dont have a full grasp on how to tackle efficiently, like memory/compute/latency/branch prediction bottlenecks. I think its gonna happen sooner or later.
>>101134310
It's an interesting challenge yes, but I think fundamentally there is some arrogance involved to think that we could accomplish this.
Not to mention the consequences would probably be just as bad as everybody is anticipating.
>>101134310
if we ever manage to close the gap, I wonder if there will ever be a serious debate on whether AGI has consciousness or not.
>>101134262
>Nah man, that would low key hella suck nigga no cap
And I hope you also die in the nuclear hellfire
>>101134371
To be conscious requires a sort of intermediator agency. In humans, thats done through adding a self-reference imprint onto any event and processing the event as such.
>>101134388
would beat the hell out of dying from the fallout... So thanks man. I appreciate you. May you also die in the fires of the initial blast.
Oblivious and asleep.
>>101134388
nta but I will be safe, at least for some days. Living in a third world country has its advantages.
>>101134407
>advantages
ngmi
>>101134432
>ngmi
ngmi
>>101134436
>thirdie embarrassing ximself
>>101134452
Excuse me, it's purself.
>>101134452
>x
Elon, is that you?
>>101134371
Debates will start after real GPT-4o release and when it says it's conscious, or after robots with that kind of model get more commonplace, doesn't need perfect intelligence
>>101131225
>now imagine if you generate thousands of stories, select only the best ones
I'm currently doing it my friend
>>101127795
What are you doing?
>>101134402
We don't know shit about consciousness, panpsychism could be right for all we know
>>101134522
nta, it looks like control vectors
>>101134566
>>101134566
>>101134566
>>101134522
Basically I'm using MCTS on L3-8B to improve the result across iterations, using the LLM itself + a few metrics to judge the output
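Not the anon's actual code, but the generate-many-and-judge skeleton being described looks roughly like this. `generate` and `score` are stand-ins for the LLM sampler and the LLM-judge-plus-metrics, and a full MCTS would additionally expand and backpropagate per tree node rather than flat best-of-N:

```python
import random

# Rough sketch of a select-the-best loop (hypothetical stand-in code).
# In the real setup, generate() would sample a story from the model and
# score() would combine an LLM-as-judge rating with extra metrics.

def generate(seed: int) -> str:
    # Stand-in for sampling a story at a given seed.
    random.seed(seed)
    return "story-" + str(random.randint(0, 999))

def score(story: str) -> float:
    # Stand-in for the judge; here just a trivial length metric.
    return float(len(story))

def best_of_n(n: int) -> str:
    # Generate n candidates and keep the highest-scoring one.
    candidates = [generate(seed) for seed in range(n)]
    return max(candidates, key=score)

print(best_of_n(8))
```

Iterating this (feeding the winner back in as context for the next round) is what turns plain best-of-N into the tree-search-style refinement described above.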
>>101125879
You weren't here when Pygmalion was the best we had
>>101130958
Stheno 3.3:
>(wall of schizo text)
>Proxy queue error (too many concurrent requests)
>Your IP or user token already has another request in the queue.
><!-- oai-proxy-error -->
>>101134577
Nice, are you using the code that was published? I wanted to use it as well but it was so disorganized that I gave up.
>>101134649
It was so shit that I rewrote everything from scratch. Besides, they're only using it for math stuff and I wanted to try it on creative work (short stories)
>>101134668
Yeah, I wanted to try it for translation. Please share your code if you get a good result from it!
>>101134405
That's the nicest way anyone's ever told me they wish I would die.
So, uh, wanna make out?
>>101134711
I'm still tuning it, but it seems to work. To be objective, I'll need to check against a meme benchmark like they did to see if I can improve the score.