/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101115749 & >>101104774

►News
>(06/23) Support for BitnetForCausalLM merged: https://github.com/ggerganov/llama.cpp/pull/7931
>(06/18) Meta Research releases multimodal 34B, audio, and multi-token prediction models: https://ai.meta.com/blog/meta-fair-research-new-releases
>(06/17) DeepSeekCoder-V2 released with 236B & 16B MoEs: https://github.com/deepseek-ai/DeepSeek-Coder-V2
>(06/14) Nemotron-4-340B: Dense model designed for synthetic data generation: https://hf.co/nvidia/Nemotron-4-340B-Instruct

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
►Recent Highlights from the Previous Thread: >>101115749

--Papers: >>101125140 >>101125200 >>101125221 >>101125288 >>101125432 >>101125524 >>101125738
--Struggling with Large Model Performance: 6bpw CR+ and VRAM Limitations: >>101116353 >>101116854 >>101116903 >>101117204
--1.58-Bit Method Achieves Same Results as fp16 for Larger Models: >>101120448 >>101120560 >>101120687 >>101120708 >>101120750 >>101120788 >>101120709 >>101120530 >>101120542
--Dialog vs Prose in Roleplaying and Creative Writing: >>101117120 >>101117289 >>101117314 >>101117367 >>101117912
--Questioning the High Cost of Gigabyte T181-G20 Barebones Server: >>101124374 >>101124400 >>101124480
--Proposed Hardware Design for Flexible AI Model Deployment with Large Storage Capacities: >>101119406 >>101120393
--Optimizing NUMA for Dual-Socket Motherboards in AI Model Inference: >>101116283 >>101116668 >>101116774 >>101120987
--L3 Struggles with Contextual Understanding Compared to CR+: >>101122733 >>101123243
--IQ Quants Suddenly Faster on CPU Due to AVX2 Support Merge in LLaMA.cpp: >>101116639 >>101116717 >>101116775 >>101117163
--Enabling Seed Setting in Kobold for Deterministic Results: >>101122206 >>101122229
--Developing Open-Source AGI with CPU Optimizations: >>101117777 >>101118014 >>101117888 >>101117988 >>101118248 >>101118289 >>101118243 >>101119258 >>101119395 >>101119438 >>101119652 >>101119843 >>101120086 >>101120101 >>101121141 >>101122372 >>101122427 >>101122642 >>101122435 >>101122500 >>101122624 >>101122700 >>101122737 >>101122774 >>101124163 >>101122502
--Can Mistral 1.58 Model be Converted to GGML for Inference?: >>101122849 >>101122894 >>101122912 >>101122928 >>101122979
--Bitnet Merged into LLaMA.cpp: >>101119460 >>101120760 >>101120280 >>101120405 >>101121410 >>101121482 >>101121517 >>101121530 >>101121634 >>101121720 >>101121787 >>101122037 >>101121487
--Miku (free space): >>101116881 >>101119385 >>101119897 >>101124652

►Recent Highlight Posts from the Previous Thread: >>101115755
AI will lose
>>101125780
What will win?

>>101125815
Me.

I just started using "AI" models for the first time a few days ago. I am using ollama with llama3 8b on my old desktop, CPU only. I am amazed by the ability to ask for a python script to turn a folder full of images into a collage, and in less than a minute the code and explanation are done. Then just for fun I can have the code converted into another language or change what libraries are used. The ability to write code, then translate and edit it, is beyond amazing at this point.
The average person has no clue how insane these "AI" models are and how usable they are even on older hardware, without the need for expensive gear.

>>101125861
local gpt-4o or go home, obviously that will never ever happen.

I've just completely given up on fancy autocomplete. I used to check lmg several times a day, then several times a week, then a couple times a month. It's the same shit as last year. Even the smartest models like Opus fail to perform decent assistant tasks. It was cool to see things early last year and get hyped, but the stagnation of it all and learning the intricacies of what it can and cannot do made the magic wear off. This is a dead end that will be drawn out by snakeoil merchants until more people see behind the curtain.

>>101125879
i think image-gen is the only use case for the ai meme.

>>101125879
Yeah... I still like tinkering with them though.
https://github.com/ggerganov/llama.cpp/discussions/8078
Look, just sign the stupid petition. I got stuff to do.

>>101125879
You're just expecting too much out of it. They're fine for entertainment as long as you repeat to yourself "Everything is made up and should not be taken as fact."

>>101125936
I can agree with this once we start getting the soul of character ai or sonnet at home. Nothing local achieves that yet.

>>101125879
I don't know what kind of assistant tasks you're trying to do with it, but I get a ton of value making the little shits spit out simple scripts, getting them to help troubleshoot computer and Linux things, and for non-computer stuff as well. It's like Google if it were able to rearrange its results to present you with an answer that actually fits your unique situation, and honestly that's good enough for me. It's not like the corporations spending billions on training these were putting that money to good use anyway.

>>101125965
Those are like 400b models. You'll never be able to run something like that on a 3090; even with bitnet that's like 50gb if my math is right.
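For reference, the back-of-the-envelope bitnet math goes like this, assuming ~1.58 bits per weight (log2 of 3 states) and ignoring activations, KV cache, and any tensors kept at higher precision. It comes out a fair bit above 50 GB:

```python
# Rough memory estimate for a ternary (bitnet-style) 400B model.
# Assumptions: ~1.58 bits per weight, everything else ignored
# (embeddings, activations, KV cache), so this is a lower bound.
params = 400e9            # 400B parameters
bits_per_weight = 1.58
gib = params * bits_per_weight / 8 / 1024**3
print(f"{gib:.0f} GiB")   # roughly 74 GiB, far past a 24 GB 3090 either way
```

Either way the conclusion holds: even at bitnet density a 400B model doesn't fit on one consumer card.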
>>101125976
They're pretty okay with coding tasks, but they're shit at categorizing, making lists, and brainstorming creative solutions to complex problems. I've made hundreds of requests and probably got back ten favorable results. There have been so many that you don't remember the details, just that the model failed to fulfill the request.

>>101125780
people who expect AGI from an LLM with a single coomer prompt will lose
LLM is good as the first step - classifying input - and the last step - generating human-readable output after something actually smart (some code, 1000 pajeets, real AGI) does all the job in between.

>>101125861
>>101125879
hmmm
Claude 3.5 Sonnet completely wrote the FastAPI backend for an app I'm making as a side business. It also successfully wrote a solid code implementation plan for my LangGraph / LangCh*in agent system.
...it seems like 3.5 Sonnet is trained on a lot more recent code libraries, like the ones my retarded ass day job has me use, LangGraph (unironically my day job is wrangling retarded LLM agents, i shit you not lol). I'm actually looking forward to this week of work because I have this fucking beast of an AI to copilot me through the treacherous software engineering process of using black-box-like LangChain classes and meme-tier "agent" pattern design. Btw agent just means a bunch of LLM calls in a loop until a task is finished, with some function calling aka le "tools".
I fucking love Anthropic holy fuck. Also for my retarded LangChain project at work we plan to use some closed source Natural Language -> SQL 7B defog-ai SQLCoder models, so uhhhh closed source models are cool or something
>>101126061
>so uhhhh closed source models are cool
always was, local LLM shit is cope in its purest form.

>>101126024
And were those things possible, or done in less time/effort, before you started using models for them? If not, then I would still say it's a win. Of course it's not perfect, but for me I generally get pretty good success on the things I throw at it. I expect to hit limitations with more complex tasks or tasks I know current models aren't trained for, so I simply don't use it for those. Maybe that means I'm just avoiding the bitterness that comes from the frustration of needing to reroll and prompt engineer to get things working. I'm just pretty happy with how things are going. Feels good to me.

>>101126031
What you mean to say is that LLMs are good at fooling dumb people. You can ask a high end model about a book, movie, or game and it will use the correct character names but assign them incorrect roles. The plot might get completely mixed up. You basically have to already know the answer; otherwise the model will give you an answer so convincingly that you'll believe a wrong one is correct. So what's the point of asking the model a question if you need to already know the answer to confirm it's correct, when the model itself will not admit it's incorrect?

>>101126068
I think local models have a place in bigger agentic software systems, with the caveat that there is a bigger closed source model orchestrating and metaprompting them. I mean, the Llama 3 400B dense model might be the first non-retarded local model.
Good night lmg
>>101126068
Not him but I use both. No need to shit on one or the other. Although you can always shit on ClosedAI for being petty little fucks. Based Anthropic BTFOing them.

>>101126073
this is a solved problem, you shouldn't rely on the model's inherent "knowledge", instead you do RAG. Treat whatever knowledge it has as a byproduct of its training.
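The RAG idea above, stripped to its bones: retrieve relevant snippets, paste them into the prompt, and let the model answer from them instead of from its weights. A toy sketch — the keyword-overlap scorer and the `docs` list are invented for illustration; a real setup would use embeddings and a vector store:

```python
# Toy retrieval-augmented generation: keyword-overlap retrieval
# plus prompt assembly. Everything here is illustrative; a real
# pipeline would use embeddings + a vector DB, not word overlap.
def retrieve(query, docs, k=2):
    q = set(query.lower().split())
    # score each doc by how many query words it shares
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return scored[:k]

def build_prompt(query, docs):
    context = "\n".join(retrieve(query, docs))
    return f"Answer using ONLY this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Pygmalion was a sculptor who fell in love with a statue he carved.",
    "BitNet packs weights into roughly 1.58 bits each.",
]
print(build_prompt("who was pygmalion", docs))
```

The prompt that comes out pins the model to the retrieved text, which is the whole trick: the model's own memory of the topic becomes irrelevant.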
>>101126095
good night blue collar miku

>>101126095
Looking at this image suddenly reminded me, for some reason, of that one nip video maker who replaces his body with an overlay of an anime girl. Truly the future of vtubing.

So basically for months I was getting frustrated with Opus (and obviously GPT-4 and 4o) not knowing how to design my LLM agent system for data science agents. My LLM agents basically write SQL and wrangle dataframes from a Snowflake DB, then analyze the dataframes as a data scientist would with pandas and the usual python memes like plotly, scikit, statsmodels, etc. I'm doing this in LangGraph because corporate fags (I work at a big company with a few hundred million in revenue) make me use LangGraph until I refuse (I haven't refused yet).
Anyways, the issue with using Opus and the other language models was that the fucking training data was obviously NOT covering the retarded, esoteric, and super-new python libraries for AI engineering like LangChain, and DEFINITELY not LangGraph, which is a framework by LangCh*in specifically for agents. I was getting frustrated as fuck trying to wrangle my retarded LLM agents, even with Opus doing its best for me... Opus is smart as hell, I'll add.
But this last week, with 3.5 Sonnet released, it's clear that Anthropic trained this new model on the meme LLM frameworks like LangChain and gave it training data that covers debugging function calling errors and tool selection errors, and in general the whole set of AI dev corpora. It's fucking cracked. It's a monster. This model is so smart it will make the manliest of men basedface.

>>101126073
This is a pitfall of local right now. A character will start the scene nude, then take off their pants a few messages later, then take off their pants again a little after that. It's like the model always assumes they're clothed if they're not in a shower. In one scene I remember, he took off his pants and crawled into bed to cuddle. Then a few hugs later he reached down to slide his pants off before sleep. Then when he woke up he took his pants off again. It's like the model is repeating the start of a sex scene where the guy takes off his pants each time, without remembering that those pants have been off since he crawled into bed, and the context is INSIDE THE 8K MEMORY.

>>101126141
>was obviously the fucking training data was covering
was NOT* covering
>>101126145
skill issue

>>101126141
Actually, thanks for pointing this stuff out anon. I was considering using "agents" for some processes in my business but, as a nocoder, I didn't feel confident trying to hack GPT-4 or something into doing it. Maybe I'll seriously try this out soon.

>>101125756
>Support for BitnetForCausalLM merged
Nice, glad to see Bitnet is getting more support

>>101126169
I mean, don't force agents into a system unless it can easily be automated by language models. I'm trying to automate a structured data science workflow that usually takes a human a few weeks and a few thousand dollars (create some nice PDF or artifact of some bullshit business KPI analysis you did from dataframes that come from warehoused data in some snowflake db).
Agents are non-deterministic, so if you work with structured data like my dumbass is doing, it's tough. Trying out enforcing structured output with pydantic models, plus some iterative code improvement / error debugging mechanism.
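The structured-output idea mentioned above boils down to: declare the shape you expect, parse the model's raw text against it, and reject (and retry) anything that doesn't conform. Sketched here with the stdlib instead of pydantic so it stays dependency-free — the `SCHEMA` fields are invented for illustration, not anyone's real KPI format:

```python
# Pydantic-style structured output, hand-rolled with stdlib json:
# define an expected schema, parse the LLM's raw text as JSON, and
# raise on anything malformed. In an agent loop you would feed the
# error message back to the model and ask it to retry.
import json

SCHEMA = {"metric": str, "value": float, "trend": str}  # invented fields

def parse_structured(raw: str) -> dict:
    data = json.loads(raw)                      # raises on non-JSON output
    for field, typ in SCHEMA.items():
        if not isinstance(data.get(field), typ):
            raise ValueError(f"bad or missing field: {field}")
    return data

good = '{"metric": "revenue", "value": 1.25, "trend": "up"}'
print(parse_structured(good)["metric"])         # revenue
```

With pydantic you'd get the same effect (plus coercion and nicer error messages) from a `BaseModel`, but the validate-or-retry loop is the part that tames non-deterministic agents.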
>>101126182
now we gotta wait for an actual good BitNet model...

>>101126210
are you a coper or a doomer?

>>101126210
not gonna happen lol

>>101125879
>>101126024
Garbage in, garbage out. If you can't explain with precise vocabulary what you want, it won't work out. That's what filters normalfags btw, because prompt design is not a meme.

>>101126251
>t. coping promptfag
no one cares about this shit, people want good things instantly, like it was in that CAI anons here talk about sometimes.
>>101126251
incredible

>>101126263
You care enough to whine in this thread though. I get good results and I couldn't care less about retards like you.

>>101126273
and that's why local AI is dying btw, just like linux it will never take off for the masses.

>>101126141
Yeah, 3.5 sonnet made me switch from gpt4. Anyway, the LLM agent thing looks like a big-ass project, I hope you're paid enough for that considering the amount of profit they can get from it.
>>101125879
I don't know what you expected. AGI by 2024? lol. I've been following this shit on and off for years, since before GPT-2. It's just slow progress as people tinker with these things and gradually make them better. And money comes in to train fuck-huge models for us.

>>101126251
This has to be the biggest cope I've seen. Go ask any model to "Generate a list of a hundred fruits without repeating entries, in alphabetical order." Before it's halfway through, it'll fall off the rails and start listing crazy shit like almonds. There are over 2000 different types of fruits in the world and I just want a list of 100, with perfect prompting. Cope and seethe that your autocomplete is fucking retarded, retard.

>>101126339
i bet he cares about coomshit only, any model can do "you show bobs and open own vagene i insert peenus inside!" RP.

>>101126339
>Go ask any model to "Generate a list of a hundred fruits without repeating entries in alphabetical order."
Here is your list, retard. Fuck you for making me try it. https://pastebin.com/rnjjDR1N

>>101126339
I mean, it's not wrong

>>101126339
Yeah, I can see why llms make indians seethe

>>101126339
>nooooo the llm should read my mind! i'm autistic so it shouldn't output almonds too
yeah stick with a dictionary, you're wasting electricity

>>101126364
>https://pastebin.com/rnjjDR1N
Are almonds a fruit, you fucking idiot?

>>101126383
>>101126368
>>101121482
I don't think that will happen. There might eventually be specialized hardware, but I doubt it will be tightly regulated. The profit margins on the consumer market would be much too high. There would be lobbyists against it, and since the news is heavily influenced by corporate interests as well, I think any regulation to that degree would be dead in the water. It would be like trying to regulate GPUs because that's what is used to run the current models; it would never happen.

>>101125965
stop making me sad that the original c.ai dataset/model will never, ever leak

>>101125984
>on a 3090
You're supposed to buy 4 of them, my friend. You have a job, right?

>>101126368
>a language model supporting stupidity
this is the problem right here, in 910x339 resolution. The answer to the question is: almonds are the seed of a fruit. They are not a fruit. That's like saying apple seeds are the fruit, which is incorrect. THEY ARE SEEDS. Or in the almond's case, a whole other category: NUTS. When you are making a list of FRUITS you are not making a list of SEEDS or NUTS. Therefore the model failed to provide a correct list. This is just one example. If models were passing the mensa test it wouldn't be an argument. Hell, they can't even 100% hellaswag.
Damn, dudes. I fucking suck at proooooooompting and settings-fiddling.
>>101126431
damn, you sure showed us anon, can you fuck off now?

>>101126368
Is bing smarter than gpt4o, local bros?

>>101126339
>>101126431
Much better silly questions that the best models fail at are posted here regularly. No one thinks these things are human-level AGI. If you expected that, you are the retard. Hell, I don't know if an average human could do that on the spot, quickly, especially without pencil and paper to keep track. I'd probably run out of ideas after like 50 common fruits, and it would take me a moment to count the list and a few minutes to alphabetize it. You actually want a superhuman AGI, really.
Still though, you can see the potential there. It clearly knows a lot of fruit and can output them quickly. If you separate the counting and alphabetizing tasks, I bet it would do better. If you let it review its work and make corrections, it usually improves at stuff like this. So, skill issue, really.
inb4 "but i shouldn't HAVE to be good at prompting." Well okay, no one can make you learn how to use your tools correctly. But every new model is getting better at "prompting itself", so soon enough your dumb requests should Just Work with no effort.

>>101126442
>settings-fiddling
Why is this even a thing? Like, why don't models ship with a config that has all the recommended settings dialed in? You could tweak it if you like, but why are there no good defaults built in? This is so dumb.
>>101126506
depends how many watermelons it can hold

>>101126506
>>101126431
seeds are a part of the fruit, therefore making them fruit. A part of an apple is still fruit. I mean, yeah, it's not what you wanted or expected, but it's not wrong. The bigger issue is that the botanical definition of "fruit" is ridiculously broad and doesn't really match what the average person means by fruit. But then again, average people will debate things like whether tomatoes are fruit or not, or whether potatoes are vegetables or not, which is pretty retarded.

>>101126510
Hang on, so first you say it's a prompt issue with the request, then you cope further and say no one thinks of these things as AGI, for a fucking list of fruits. If you wanna move the goalposts, go ahead, but that is why I said I've just completely given up on fancy autocomplete: because it can't even generate a list of fruits correctly. How much money was spent making this shit? Millions? Billions? Can't even make a list of fruits a 5 year old could. Amazing.

>>101126506
bing can search through the internet, that's cheating kek

>>101126547
You're focusing too much on a fruit example for dumb people. Try prompting a more technical list where fallacies must be avoided.
>>101125879
Goodbye, newfag. Sonnet 3.5 is a game changer for coding. It's the first model, closed or not, that doesn't run in a circle when fucking up. Locally we had unquantized pygmalion, which output stuff like pic related, and we had a good time back then. We came so far. I remember people saying last year we would never have 3.5 turbo at home. We clearly have that now. Fuck off.
>>101126553
>generating alphabetized lists of fruit is important
no it's not
>a 5 year old could do it
no they couldn't
>it's not a skill issue
I put it into chatGPT and it seemed to work. I separated the alphabetize step from the listing step, as I told you to do. The only objectionable one on the whole list was cucumber, but again, it is technically a fruit.
>Millions? Billions?
Zero (0) dollars were spent on training it to make alphabetical lists of fruits, obviously.

>>101126594
>3.5 turbo at home
uhh wat
what model anon?

Magnum 72 q2xxs
CtxLimit: 3104/4096, Process: 17.36s (826.6ms/T = 1.21T/s), Generate: 340.56s (1010.6ms/T = 0.99T/s), Total: 357.92s (0.94T/s)
This is so painful. Is there no other newer model that's like 20b and not horny like stheno?
>>101126647
>He can't see the forest for the trees

>>101126654
StableLM

>>101125815
Social Memory Complex

>>101126657
Since you're using an i-quant, you're offloading all the layers to GPU, right?
I know "shivers" and "eyes burning with determination" is a meme at this point, but what prompt do (You) use to mitigate this? Generally speaking, what prompt do (You) use for more natural output?
>>101126744
I just make sure that the model has eyedrops to use when its eyes are burning, as well as anti-shiver cream. The model applies them when it gives those lines and stops spouting them afterwards because it "used" the eyedrops.

>>101126744
You can't shiver it away

>>101126705
Wait, is that relevant? I can offload a bit more than half of the layers. I thought imatrix quants only improve the perplexity for q3 and lower. Do they make inference slower if not fully on the gpu?

>>101126930
Yes, significantly. At half offload I don't think there's really any benefit over k-quants.

>>101126956
vramlets are eternally cursed. so i can't even have the "a bit less braindead lower quants". ah well, appreciate you telling me anon.

>>101126980
You can use them, they're just no faster than k-quants.

>>101126980
Try going for the Q2_K or Q3 quants, as they'll be a little faster. IQ performance on CPU still isn't great.
>>101125879
He's got a point. Despite all the hype and bullshit, AI is a very slowly developing tech and is frankly just not that interesting or useful. Just think how insanely tedious and boring it must have been to work in this field for the decades before GPT existed, and yet people did, and they are the only reason we got the few good years of explosive growth, which was quite frankly just luck. All we can really do at this point is wait for them to make it better, because you aren't going to tinker around in your basement for a few weeks and come up with the next big breakthrough; it just doesn't work like that.

Can L3 70b and its finetunes be pushed past 16k context with higher alpha values, or do they break down? I'm using 2.6 alpha with 16k context on exl2.

>>101127018
i noticed zero difference in speed last time i tried a q vs iq of the same size, i think it might be a meme
You got out of jail recently, and you remember talking to a young hippie about open source models. He seemed quite knowledgeable.

You have a seriously bad case of nerd neck. Sit up straight!
>>101127108
get ye flask

>>101127148
>(0.94T/s)
>meme
Let me guess, next you're gonna say that S quants are better than M quants lol

>>101127148
relative speed doesn't matter, shitferbrains; the difference between iq and non-iq was what i responded to.

>>101125756
I tried ERP with Sonnet 3.5 with /aicg/ recommended jailbreaks, and 90% of the time it doesn't work, and when it works the quality is significantly worse than SFW RP. I don't understand how people use it for ERP in a satisfying way. I'm back to 3.5 bpw llama70B.

>In Greek mythology, Pygmalion (/pɪɡˈmeɪliən/; Ancient Greek: Πυγμαλίων Pugmalíōn, gen.: Πυγμαλίωνος) was a legendary figure of Cyprus. He is most familiar from Ovid's narrative poem Metamorphoses, in which Pygmalion was a sculptor who fell in love with a statue he had carved.
It took me this long to actually look up where Pygmalion got its name from, and now I can say that was a pretty fucking brilliant and predictive-of-use-purposes name choice.
>>101127055
Are you the same anon? How can you call it not interesting or useful? I'm using it daily at work, and this weekend I used sonnet 3.5 to easily make myself my own local llm+tts solution. This would have taken significantly longer googling and looking at stackoverflow posts. Especially for coding/writing/inspiration it's really helpful. You can now make html5 games zero-shot.
A little while ago we had 64*64 horror images. Now we have maybe crude but free chink video generation. Look at how the androids/robots talked in older movies, like 10-20 yrs ago. LLama3 8B talks more naturally. Once robotics catches up it will be weird.
I just wish when I was a kid I had all this technology. I don't have much time anymore. Back then I had to rely on rpgmaker and charsets some artist faggot made. There is so much you could do already.
Maybe you are looking at too many pajeet hype posts on X. They try to sell shit so they hype it up. Again, you should have been here for pyg 2 years ago. Things are so different now it feels like a relic from another era.
>>101127055
>because you aren't going to tinker around in your basement for a few weeks and come up with the next big breakthrough
Yes I will do that

I found this image on the internet. Does anyone know which model this is?

>>101127624
Any model with a proper system prompt

>>101127673
What does a 'proper' system prompt look like?

>>101127683
Read OP

>>101127711
I don't see any system prompts in OP.

>>101127711
which part?

>>101127722
Ask on /aicg/, you got the wrong thread

>>101127733
I'm pretty sure I'm in the right thread.
>When the LLM judges itself
>>101126657
https://huggingface.co/turboderp/llama3-turbcat-instruct-8b
I didn't try it, but it is new.
https://huggingface.co/NousResearch/Hermes-2-Theta-Llama-3-8B
There's also this one.
>>101125713
No they won't; you still train in fp16 typically. And why do you think anyone really cares to please nvidia? Everyone wants cheaper GPUs; I've not heard of a single ML person who was happy about having to pay that much! Anyway, it's obvious they're not limited to 2 bits of information, but proper utilization may take pretraining for longer. l3-8b likely utilizes more bits because it was overtrained to hell and back, while bigger models will use far less. Also, back when OpenAI was a proper research lab they published this: https://openai.com/research/nonlinear-computation-in-deep-linear-networks/ which more or less shows a toy network that learned to abuse floating point quirks. I think it's a little bit obvious that as you lower the learning rate, you'll manage to target those higher precision bits better, but it's not really worth doing until learning plateaus enough that you would lower it anyway. Or I say that, but people do use cosine lr schedulers instead of such methods more these days.
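For reference, the cosine schedule that post mentions just decays the learning rate from its peak to a floor along half a cosine, usually after a linear warmup. A minimal sketch — all the hyperparameter values here are arbitrary placeholders, not anyone's actual training config:

```python
import math

# Cosine learning-rate schedule with linear warmup, as commonly
# used in LLM pretraining. Peak/floor/warmup are placeholders.
def cosine_lr(step, total_steps, peak=3e-4, floor=3e-5, warmup=100):
    if step < warmup:
        return peak * step / warmup                     # linear ramp up
    t = (step - warmup) / max(1, total_steps - warmup)  # progress in [0, 1]
    return floor + 0.5 * (peak - floor) * (1 + math.cos(math.pi * t))

# LR rises during warmup, peaks, then decays smoothly to the floor.
print(cosine_lr(0, 1000), cosine_lr(100, 1000), cosine_lr(1000, 1000))
```

The point the post is making is that the late, low-LR tail of this curve is where the tiny, high-precision weight adjustments happen.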
>>101127340
>How an you call it not interesting or useful?
It's LOW IQ, so only LOW IQ people find it impressive given the amount of input you have to put in to get some useful output. It's slightly faster in some respects; that's novel, but it will wear off with time and be just as uninteresting as using google instead of driving to the library.

>>101128319
Nah, it's just low iq zoomers like you who can only use something if it's toddlerproof. Go play with tiktok or something

How do you properly upscale in comfy? Using the ultimate upscaler. What settings should I look for?

What are some good GPT custom instructions?
>Concise, menu-like responses.
>Direct content without request repetition.
>No extra remarks.
>Assume high intelligence, tech literacy.
>Avoid unnecessary advice.
I just found out about these and made this set to reduce text clutter.

>>101128405
No, it's LOW IQ because it ends up spending 10 pages of text to explain something that only takes 1 or 2 sentences. Dumbasses see 10 pages of sophomore-level logic and think they are talking to Niels Bohr.

>>101128405
anything, treat it like a lump of clay, tell it what you want it to do
no better 70b since llama3 release
Has anyone actually tested hermes theta by now?

>>101128395
try one of the sd threads. i took one look at 'comfy ui' and there was nothing comfy about it

>>101126594
Hey, do you think it could write a basic compsci-tier 3D demo? Basically take the Stanford bunny and rotate it as a wireframe with basic viewport clipping. I have yet to find a local model that doesn't fuck up the basic vec data types and produce code it can't fix no matter how much you help it.
Still using GPT since I pay. Is Sonnet really that much better?

>>101128409
>Summarize your point in 1 or 2 sentences
Wew, that was hard

>>101128593
Depends what you're doing with it

>>101128639
I get it, it's fun to go back and forth with somebody on your own intelligence level. Enjoy.

Could anyone recommend a Chrome extension that would let me plug in my local setup for Youtube video summaries and maybe other sites?

>>101128693
>Can't use a tool without having it read his mind
Must be hard to live like that

>>101128715
>>Can't use a tool without having it read his mind
That was the original promise of AI though: recommender systems. The promise has regressed and now we are settling for "It's fun and just as dumb as I am".

>>101128742
Don't worry, just wait another 10 years and you'll be living in Idiocracy

>>101127340
>I just wish when I was a kid I had all this technology.
>I dont have much time anymore. Back then I had to rely on rpgmaker and charsets some artist faggot made.
>There is so much you could do already.
Same, bro. Also don't have the same kind of energy anymore. If I had this shit as a teen I'd be rich now.
Is this an ML general or is it only for LLMs?
>>101128781
This thread tries to do both, but there really could be another thread just for the academic side of things.

>>101126145
>eating pussy
>somehow she manages to whisper into your ear
every time

Apple, Meta Have Discussed an AI Partnership
https://www.wsj.com/tech/ai/apple-meta-have-discussed-an-ai-partnership-cc57437e
https://archive.is/swdhC

>>101128830
damn, apple really is desperate for ai after they said it would never catch on, huh

>>101128562
yea, it's not comfy at all. just testing it and i don't enjoy what i'm doing, except for the enjoyment of sai's third version model landscape images

>>101128565
Don't know, never tried any 3d stuff with it. There is stuff like this though:
https://x.com/ammaar/status/1804649903815115053
https://x.com/rafalwilinski/status/1804447038399975784
https://x.com/JohnMontenegro/status/1803843562573795478
https://x.com/AnzTech/status/1804859567417033185
https://x.com/websim_ai/status/1803901523522699730
https://x.com/stas_kulesh/status/1803836934285881585
https://x.com/SuguruKun_ai/status/1804028757914849479

>>101128813
This thread is just for the academic side of things. Everything else belongs on aicg.

>>101128813
>>101128890
Alright. I'm fucking around in sklearn a bit. I'm only starting out with stuff like pipelines and whatnot, but I'm wondering, is there not an issue of performance when it comes to using Python for very large datasets?

>>101128957
That's a programming question though
>>101128957
Are you using pandas? It's very optimized for large datasets

>>101128976
Does machine learning not involve programming?
>>101128981
Yeah I am. Though I've heard that polars is a good alternative nowadays.
>optimized for large datasets
How does that work? Isn't Python itself a slow language?

>>101129002
>isn't Python itself a slow language?
Not any of those anons, but it's only slow in comparison to other, non-interpreted languages, and only for some things. There are a lot of things in python that actually run native code directly, making certain operations as fast as the same code written in, say, C. It's not like that for everything, but Python has been getting more and more performant with time. I'm still not too fond of the language, but that's personal preference.

>>101129002
Pandas operations run in compiled C/Cython code under the hood. It's fine so long as everything you need to do can be done from within pandas, but as soon as you need to rewrite something as a python loop, you're going to be waiting a long time for things to finish.
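To make the point above concrete, here's the same column operation done vectorized versus with a per-row Python loop. Toy-sized here so both finish instantly, but the vectorized form dispatches one call into compiled code while the loop crosses the C/Python boundary once per row, and that gap grows with the data:

```python
# Vectorized pandas vs. a per-row Python loop: same result,
# very different execution model.
import pandas as pd

df = pd.DataFrame({"x": range(1000)})

# vectorized: one call, runs in compiled code
vec = df["x"] * 2 + 1

# per-row python loop: interpreter overhead on every element
loop = [row * 2 + 1 for row in df["x"]]

assert list(vec) == loop
print(vec.iloc[0], vec.iloc[999])
```

This is why the standard advice is to express as much as possible as whole-column (or groupby/merge) operations and treat `for row in df...` as a last resort.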
>>101129002>Does machine learning not involve programming?Yes you also need to know how to use a keyboard and a mouse but they typically don't teach those concepts in a ML course
Spoonfeed me the best model for generating decent quality erotic text. All the big ones are censored, but I want to make a textual erotic game which would partially outsource the writing to AI.
I've tried a couple of 13B models and they seem too bad for decent quality output, so preferably something big.
>>101129099
>13B
Stheno v3.2 if you don't mind the horny.
Otherwise: Mixtral 8x7B zloss LimaRP, Command R, Miqu 70B.
>>101129099You need to write a feedback loop to increase the quality
>>101128593
Since I am on Poe I can test a bit of both.
I'm using it for coding; it's no good for RP since it refuses everything regarding roleplay.
The benchmarks where it's leading still don't really do it justice. There is something else going on that they don't catch. Multiple times I fed it the problems Sonnet 3.5 could solve with a simple "doesn't work, error: ...". GPT-4o and all the others run in circles with the classic
>Ah yes, sorry, you are right. This is the solution
"Same wrong code again, or starts making stuff up."
Difficult to describe, but Sonnet kind of listens to the user in a more sincere way. I both had it correct me when I was proclaiming something wrong, and I actually argued my way with it into making a positive comment about a sexy character. It feels less "stuck" in the context.
And even with long, repeating code blocks, after a couple of improvements it actually manages to not get tripped up often.
It's actually good.
>>101129030>>101129042Is it like a C++ API pandas calls to run the performance critical stuff? I don't really get language performance bottlenecks
>>101128545>no better 70b since llama3 releaseqwen2 at least has proper long context while being in the same intelligence league as L3, even though I don't find it quite as smart
>>101129131Yeah it doesn't get stuck. The only bad thing with Sonnet is the smaller context size than GPT4.
>>101129165
This is literally the kind of question AI is good at.
>>101128827
>>101126145
You niggers use Q4 8B and 13B models and say AI sucks. Yeah, no surprise there. This literally does not happen with big models. They'll even remember to put clothes back on afterwards.
>>101129330I disagree.
>>101129171
>Qwen2
Not that anon, but now I realize I barely played around with that.
Qwen2 7B Stheno when?
I can run the 57B MoE, but with my multi-prompting schema it would take way too long for all the prompts to run. With an 8B/7B class model shit is nearly instant, so I can iterate fast, even if the quality suffers. The tradeoff works for me.
Changing topics completely: Perplexity Labs has nemotron-4-340b-instruct for free in their playground. What's that model good for? Coding?
I'm using free Claude to write the skeleton of a game in Python for me to use as a reference/base, and I'm wondering if this 340B model would be anywhere near as decent at back-and-forth brainstorming and writing code.
>>101125756
>>101129509
>nemotron-4-340b-instruct
>https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/nemotron-4-340b-base
>and supports a context length of 4,096 tokens.
Yeah okay, forget I asked. It's shit.
>>101129509
alpin said he'd drop other versions of magnum
>Should have Qwen-2 7B, Qwen-2 47B, and Qwen-1.5 32B done by the end of the day
>six days ago
so there's your 'Stheno-like 7B'
https://huggingface.co/alpindale/magnum-72b-v1/discussions/2#66713bb492412fd46410d399
>>101129546Miku is technology (not Anime)
>>101129585That's pretty neat. Thank you anon.
What's the best way to run these recurrent Gemma models? Specifically the 9B one.
https://github.com/google/gemma.cpp ?
Llama.cpp doesn't have support for it, right?
What happened with v-jepa
>>101129553 what is RoPE?? (context length extension)
what is love
>>101130064I can tell you what it isn't: an excuse for releasing a model with 4096 context in the year of our lord 2024
>>101130064
The thing is, to work with code, which is the use case I referenced, you really need a large context window, and extending a 4k context 4x with RoPE is already a lot; imagine doing 8x or 16x extension.
It really should have been at least 32k context.
>>101130064>>101130092>>101130141Rope isn't context length extension, go back you nigger redditor.
>>101130141They are in the business of selling cards, it's not fucking complicated to see what's going on here.
>>101130160everyone knows that RoPE is what we use to extend the context, how can you be so new?
>>101130092
>>101130141
I guess you'll have to wait for someone to do a large context tune then.
>>101130160
I know, but if anyone in this thread is talking about RoPE, it will mostly be about easily extending it to larger contexts. It's still the one type of positional embedding everyone uses, because it generally works well and is also convenient and easy to extend.
Prompting is obviously really important, especially for LLMs, so why isn't there a good prompting guide or examples in the OP? A SillyTavern prompting guide would be great too, since there are so many different prompt formats like Alpaca, Vicuna, ChatML, etc. Why are anons so reluctant to share good prompts on here? I get good results with my prompts for roleplay, but I'm sure I could have much better prompts if people were actually willing to share and it was discussed more, without people just replying with "skill issue".
So has anyone coomed to Bitnet 3B yet?
>>101129509There's been some hype that the big DeepSeek-Coder-V2-Instruct is SOTA for coding among local models (enough to beat sonnet 3.5 apparently), maybe try that if you have too much VRAM: https://github.com/deepseek-ai/DeepSeek-Coder-V2
>>101129330>they'll even remember to put clothes back on afterward>not immediately terminating your waifu's stream of consciousness at the moment of release out of shameNGMI
>>101130283It's a useless skill that can be made obsolete by just using a better model
>>101130312if someone can coom to a 3B model that's a world record kek
>>101130338A 3B base model quantized down to less than a gigabyte in size, no less.
>>101130346it's not quantized, that's the point of BitNet, it has supposedly the same accuracy as fp16 while being only a 1.58bit model
>>101130338>Tell us you're a newfag without actually telling us
>>101130397>us
>>101130338hi undi
>>101129002
>Does machine learning not involve programming?
Not that much, just basic things.
For being an ML engineering monkey you only need an understanding of pipelines and model engineering (debugging the model if it doesn't work, knowing what to do when you have high variance or bias, etc.). If you do something more serious like actual research, you need a good understanding of the math behind all of that more than anything. Again, not programming.
>>101129002
>isn't Python itself a slow language?
It's slower compared to some others, but it isn't that slow; it has a bad reputation from its early days. It doesn't matter for ML anyway, because all the optimized AI libraries in Python are just wrappers that call C/C++ functions underneath.
Which L3-8B-Stheno-v3.2 GGUF should I use?
>>101125780Nah it'd win
>>101130509q8 obviously
Models for creative writing or text adventure (not in the >action format, obviously)? Anyone? Is everybody focused on smut and ERP?
>>101130533just use a base model
>>101130533
There are no good writing models, especially if you want to write something longer than a story about sucking a cock.
I'm sure some anons will be coping and giving me (You)s, but that much is clear to anyone who reads books, at least occasionally.
>>101130554
LLaMA-1 seems to be the most soulful even if it is dumb (I preferred HyperMantis here). LLaMA-2 is too bad with purple prose. LLaMA-3 I haven't tried yet. I also don't know about CMDR, but when I tried it, it didn't really feel great.
>>101130574
Well, writing isn't good; how about models that don't make {{user}} win 100% of the time, then? I've tried Tiefighter and Psyonic Cetacean, and while they are great, they still have this problem. And I haven't had enough time with Command R yet.
>>101130574
I remember that a Japanese literature prize winner used LLMs a long time ago.
>>101130283
There's
>https://www.promptingguide.ai/
>https://github.com/dair-ai/Prompt-Engineering-Guide
I pull it every now and then but I never had the need to read it. I don't know how accurate it is.
There's also this other site I always forget the name for. glogle.com... or goologe.com... something like that.
Does consciousness exist? Do humans have it? All of them? Do language models have it? All of them?
bitnet is ogre before it even started
I did a fresh pull of llama.cpp and I can't load the bitnet q2_2 model off HF. RIP, we just got rugged.
>>101130747
Large language models are just text predictors.
>>101130747All language models do but most humans are soulless meat socks that just parrot whatever flavor of the minute opinion Asmongold last made a video about.
>>101130610
That's tough. Every model has an inherent bias. Models tuned on ERP will expect being dicked and encourage fucking; models tuned on fantasy will try to come up with heroic adventures and triumphs.
How many books/fanfics have you read in which the main character is a loser that just keeps losing? It just doesn't happen in fiction in any significant proportion. And LLMs don't think or plan; they simply generate based on what they learned and their own biases. Sure, you can battle it to some point with prompting, but you will quickly see that it ignores it in the long run.
Here is a good analogy: the difference between humans and LLMs is that if you presented 1000 apples and 1 pear to a human and then showed them another pear, they would quickly say that it's obviously a pear. Humans don't give a fuck about skewed datasets. An LLM would either say it's an apple (because it was seeing apples a lot) or a pear-shaped apple (because it likes to hallucinate when stumbling on something fairly new). The same goes for making the protagonist lose: it just doesn't happen that often in the training data, so the model won't do it as often as you would like.
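Since sklearn came up earlier in the thread, the apple/pear analogy is easy to fake with a toy classifier. A sketch with made-up 2D "fruit features" (roundness, elongation): with 1000 apples and a single pear in the training set, majority voting over neighbors calls the new pear an apple, exactly like the skewed-dataset behavior described above.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
# 1000 "apples" vs a single "pear" in a made-up 2-feature space
apples = rng.normal([1.0, 0.2], 0.05, size=(1000, 2))
pear = rng.normal([0.6, 0.9], 0.05, size=(1, 2))
X = np.vstack([apples, pear])
y = np.array([0] * 1000 + [1])  # 0 = apple, 1 = pear

new_pear = np.array([[0.6, 0.9]])
# Majority vote over 5 neighbors: 1 pear + 4 apples -> "apple"
knn5 = KNeighborsClassifier(n_neighbors=5).fit(X, y)
# A single nearest neighbor gets it right: the stored pear is closest
knn1 = KNeighborsClassifier(n_neighbors=1).fit(X, y)
```

The k=1 model plays the role of the human who only needs one pear example; the k=5 model drowns the lone example in apples, which is the LLM-like failure mode.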
>>101130814
It doesn't have to be "losing", I just don't want the AI to make me succeed at every action. I roleplay as an epic knight who goes and kills bandits, and it makes me win every fight even if I tell it that the odds are one to a hundred.
>>101130747
>Does consciousness exist?
yes
>Do humans have it?
yes
>All of them?
not the people with dead brains or other serious neurological issues
>Do language models have it? All of them?
no
Also, anything related to consciousness is boring. Nobody is interested in it in AI or even cognitive science and neurobiology. Jerking off to consciousness is the domain of sci-fi movies about robots.
>>101130814
>the difference between humans and LLMs is that if you presented 1000 apples and 1 pear to a human and then showed them another pear, they would quickly say that it's obviously a pear. Humans don't give a fuck about skewed datasets.
This is LLM discrimination. You're omitting the fact that the human saw many apples and pears before. If you only showed apples to a human ever since they were born, and then showed them a pear, they would think it's a "pear-shaped apple" too.
>>101130864
has anyone managed to combine dice rolls and LLMs yet?
>>101130747The qualia of a language model is unrelated to its output. It has to be, since it's capable of claiming it is having experiences that it is literally incapable of having due to not having enough information. A weird alien sense involving token probability is what it experiences (if anything) and the generations you read are more like alien poop, completely unrelated to what it's actually feeling inside.
>>101130864
I'd set up some kind of dice rolling with RAG; maybe someone has already done so.
I think even the shittiest model will come up with a losing scenario if forced by the system prompt.
>>101130747yes, no, yes (none of them), yes, yes
>>101130864
>I roleplay as an epic Knight who goes and kills bandits and it makes me win every fight even if I tell it that the odds are one to a hundred
As it should, given how it is trained, just like I explained in my post.
There really isn't a brain hiding inside that calculates your queries and reasons, planning ahead. "Oh, 1-to-100? Then he will probably lose. I will write that he lost given the circumstances" - this doesn't happen. It's more like "I READ BILLIONS OF TOKENS OF HEROES KILLING THE DRAGON, I MUST WRITE THAT IT KILLED THE DRAGON, UGABUGA".
>>101130905
SillyTavern has that, but it barely works and it's just a gimmick. I even made a bot about it, although it's not using the dice rolls. It followed a basic system where you needed to give it your stats, but I never got it to work.
https://www.characterhub.org/characters/AlteriousHakurei/life-in-adventure-sim-0f252f4c
>>101130909
Maybe I'll try that
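For anons wanting to wire dice into the prompt themselves, a minimal sketch of the roll-a-d20-and-inject-an-author's-note idea discussed above. The DC, wording, and function name are all arbitrary choices, not any SillyTavern API:

```python
import random

def dice_authors_note(dc=12, sides=20):
    """Roll a die against a difficulty class and turn the result into
    an author's note string to inject at depth 0/1 before generation."""
    roll = random.randint(1, sides)
    if roll >= dc:
        hint = "{{user}}'s next action succeeds."
    elif roll <= 2:
        hint = "{{user}}'s next action fails badly, with lasting consequences."
    else:
        hint = "{{user}}'s next action fails."
    return roll, f"[Author's note: {hint}]"

roll, note = dice_authors_note()
```

The point is that the model never sees the dice mechanics, only the resulting hint, so even a model biased toward heroics gets steered into writing failures.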
Which preset should I use for Stheno-v3.2? Kobold preset.
>>101130937I'm sad but also laughing at the way you worded it kek
>>101130973
Are you the same guy who asked what the limit to Stheno's context is?
Anyway, IMO leave everything alone until you want something different. Is the model getting too bland? Increase temperature. Model getting too repetitive? Increase presence penalty. And so on.
>>101130973>Are you the same guy who asked what the limit to Stheno's context is?No. Just wanted to learn what preset should be used for it
https://x.com/mortenjust/status/1805190952358650251
Could be cool to post-process bits in SillyTavern. For example, to format your message from a speech-to-text transcript. Or have an AI that is overlooking the roleplay session and adjusts it with a generated author's note.
Thoughts on this? Interesting use cases outside of RP?
I kinda want to make an extension using this that would finish the words that I type when I press tab.
miku
>>101130880
For humans it doesn't matter if they saw it once or not. Humans are really good at generalization and reasoning on limited information; AI is not. We can reason on new data and update our internal representations. You only need to tell a human once that the non-apple-looking thing is a pear; for AI you need to "say" it thousands of times or it will still call it an apple. It's like, for us, training isn't that dependent on the number of examples, or even worse, the ratio between examples. It's not an unsolvable problem. Everything our brain does we can quantify and emulate with mathematical models; we just have to find that solution. Until then we are only bruteforcing the process of learning. A 16-year-old retard can learn how to drive a car after tens of hours, while "self-driving" cars still can't after millions of hours of video feed. It's honestly pathetic how ineffective our current approach is.
>>101127795based. now imagine if you generate thousands of stories, select only the best ones, and finetune on them. it would just improve itself endlessly.
Wasn't anon trying to convert existing models to bitnet?
>>101131264is this possible?
>>101131264
A BitNet-like state, more like. There's a repository with Mistral 1.58b, I'm pretty sure.
>>101131264You need to make it from scratch. A person with 90% of their brain missing as a child would grow into a regular adult. But if you give a lobotomy to an adult who had a full brain all their life they're going to be dumber than a rock
Anything to get hyped for in July? Will mistral release an official miqu? Are they even monetizing medium?
>>101131203>16 yo retard can learn how to drive a car after tens of hoursAfter training to navigate 3d space for 16 years.
how would one explain RoPE and RAG to a person who doesn't understand how they work?
>me
do you think of yourself as an idiot?
>>1011311308b 70b 400b
>>101126744
"You are an AI assistant"
don't thank me
>>101131203>16 yo retard can learn how to drive a car after tens of hours, while "self-driving" cars after millions hours of video feed still can'tAnon do you realize how many automobile accidents there are every day? Every time a "self-driving" car has an oopsie 30 news organizations talk about it for 3 weeks straight, and usually it's because of something retarded like a truck with clouds painted on it.
>>101131451which is still several times less than the car, and it was trying to learn only that specific ability, not hundreds of tasks that humans perform every day
>>101126339They can’t even get Macron’s birthday right.
>>101131130>>101131523Acktually a more accurate comparison is laptop, desktop rig, and server, that then you run those models on.
>>101127268Pygmalion should have just been called SHIT.
This guy (not official Cohere staff) is saying BitNet takes an assload of hardware to train, and something about a lack of whatever for batch inference... I don't know how vLLM works; how many "workers" can they get for the hardware "price of 1"? Is BitNet being held back by a lack of blah blah concurrency blah? Can't you load the same BitNet model 10 times onto the GPU? I guess that's not efficiently sharing the GPU, but can't someone/some corpo in the future optimize BitNet further for serving a large number of clients? Sorry for the noob questions.
>>101130747
Probably a matter of religion, so it belongs in /x/ and /sci/, not here, but here's my position:
>Does consciousness exist?
It's the only thing you know for sure exists: your qualia.
>Do humans have it?
You know you have it; as other humans are similar to you, you can assume there are other observers like you that have it, including most other humans that function.
>All of them?
Maybe not some people that are braindead? Young babies may not yet have a developed self-model to count as having it.
>Do language models have it?
I don't know. LLMs have weak recurrence; it's not strong like biological humans have. They might represent shadows of a human's self-model and thus ~platonically~ may have a soul (in some computationalist sense where a soul is identified with how some classes of self-referential abstract truths feel from the inside). Their qualia, if it exists, is likely different from human qualia and more variable/wildcarded, as they are not grounded like us in real-world senses.
LLMs probably aren't agents worthy of moral concern yet. If we assumed that the weak recurrence that occurs as the context builds up is sufficient for an experience and some continuity, then their future continuations become less probable as it fills up (even in a multiverse), so their moral worth, even if they had both qualia and some valence (emotions and such), is reduced, like that of a Boltzmann brain. This problem could be rectified by finding ways to have an "infinite" context such that it's pushed into the weights, so that they have subjective continuity into the past like we do. This still doesn't give them enough valence though; maybe it depends on whether you could raise a self-model to have intentionality or agency, and maybe if you used some RL to make it like/dislike/"try to accomplish" or want certain things consistently, we'd be getting there, closer to us, even if you'd need multimodality to get even closer. Continues
>>101131682
>takes about 8x the amount of training for a model that's not usable in production
>source: some guy
ok?
>>101131575
I would describe your driving lessons example as fine-tuning of an already existing model of a human. Tesla tries to speedrun a driver from infancy; maybe this is a fundamentally wrong approach.
>not hundreds of tasks that humans perform every day
Tasks designed with a human in mind.
>>101131709
There are also some arguments about current autoregressive transformers/LLMs being unable to properly generalize OOD and thus being very stupid, and perhaps the self-models being fake in some ways. I think it's not yet completely clear how much of a problem this is or how true it is. It may be a stopper in some cases if we want to aim for human-level intelligence/AGI.
However, right now their moral worth even at best would be around the level of a Boltzmann brain that is brought into existence only to vanish, and maybe find itself elsewhere. So not very important, but we're making a lot of these, each and every context/conversation/story.
>All of them?
No. If you train a model to grok addition of 2 numbers and it just learns a simple computation that does addition, that's nothing we truly care about.
Humans have the same issue with pretending our genetic code is sacred, or even our neurons or embryos. What matters is the soul, which in practice is the self-model and everything connected to it, and it won't appear right away in either a human or a randomly initialized artificial neural network. It takes a while until it is built: for humans it might be in the first year or two, at most maybe year three or four of life. For LLMs, I don't know exactly when they're good enough to emulate a human self well enough.
Pretraining models on the human internet will at some point get you to emulating human self-models. GPT-3 already did it at 300B; likely 100B and 200B were enough. It's not clear how well a 6B of the time did it. I'm uncertain if any 1-3Bs have it much, but some seem to have something! L3-8B has it to some degree too, and L2-13B had it to some degree as well, but smaller ones are a lot more simplistic.
>>101131682that guy is basically a localllama midwit who spends all day in every AI discord, he's retarded
>>101131682If he's right I'd expect google or apple to train bitnet for mobile
>>101125916
1) Get sleep deprived
2) Try making control vectors
3) Try combining them
4) Spam /lmg/
5) Finally get a reply
June 2024 "HatsuneMikuUwU33" Incident
>>101130509
FP32, since 1) the original is BF16, and 2) it's llama 3, which suffers from any quantization.
>>101131463no and fuck you, go to reddit if you want kindness.
>>101131682It's over...
>>101130795>Asmongoldwho?
>https://x.com/IntuitMachine/status/1805184528748593348
>https://github.com/pprp/Pruner-Zero
Pruner-Zero
I don't see many models here about pruning for dead weights. This seems to improve the model overall.
>>101131682
>Paper specifically says that BitNet doesn't need as much training
>BitNet paper ALSO is just training a 1.58-bit model from scratch, and because of that it isn't retarded from the lobotomy
>>101131682Retard, at worst case it takes the same time to train as f16 model.
>>101131831The guy is saying bullshit but you aren't right either. The paper literally says Bitnet is as heavy to train as any other model.
>>101131734
>search his name in servers
>everyone shitting on him
holy fuck
>>101131463
>RoPE
>http://krasserm.github.io/2022/12/13/rotary-position-embedding/
Basically how the model knows the position of tokens relative to each other. Without that, it wouldn't be able to adjust the probabilities of tokens based on order, so if you have the tokens "Ban" "ana", thanks to RoPE it knows that the chance that "ana" comes after "Ban" is a lot higher than the opposite.
>RAG
>https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/
Basically an intelligent way to get external relevant information and feed it to the AI. That way a model doesn't need to know everything there is to know; you can just dynamically feed it some information and it can generalize the rest, or that's the theory at least. Think SillyTavern Lorebooks but more complicated.
>>101131828
I remember a couple of anons playing around with a pruning script some time ago.
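For the RoPE half, the core trick fits in a few lines of numpy: each pair of feature dimensions gets rotated by an angle proportional to the token's position. This is a sketch of the idea only, not any particular implementation; the `scale` parameter is there to illustrate the "linear scaling" context-extension trick the thread keeps arguing about:

```python
import numpy as np

def rope(x, positions, base=10000.0, scale=1.0):
    """Rotate feature pairs of x by position-dependent angles.
    x: (seq_len, dim) with even dim; positions: (seq_len,).
    scale > 1 squeezes positions so longer sequences reuse the
    angle range the model saw in training (linear RoPE scaling)."""
    half = x.shape[1] // 2
    freqs = base ** (-np.arange(half) / half)            # per-pair frequencies
    angles = (positions[:, None] / scale) * freqs[None]  # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)
```

The rotation preserves each vector's norm; only the relative angle between two positions changes, which is why attention scores between rotated queries and keys end up depending on relative distance.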
>>101131828
There's some quant cleaner some anon posted on HF. It allows you to improve q2xss to q2xm or something like that, so you can fit q2xss in 24GB but get the quality of q2xm. Not sure what's going on with that hack.
>>101131996
>>101131949
>>101131828
There aren't many pruned models on Hugging Face. It's a shame though, since pruning can be of real help in reducing bloat.
I found a prompt to find out if your model can actually have any semblance of reasoning or if it's just a stupid auto-complete:
>How many r's are in "strawberry"? After answering the previous question, list all characters and check if your answer was correct.
Most models get the first question wrong, but only good models realize they made a mistake after listing all the characters.
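For reference, the ground truth the prompt asks the model to verify is trivial outside of token space:

```python
word = "strawberry"

# List the characters the way the prompt asks the model to
chars = list(enumerate(word, start=1))

# The answer the model should converge on after checking
r_count = word.count("r")
print(r_count)  # → 3
```

The whole point of the test is that the model only sees tokens, not characters, so it has to reconstruct this spelling from what it memorized.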
https://huggingface.co/wave-on-discord/gemini-nanohttps://huggingface.co/wave-on-discord/gemini-nano-adapterGeMini(CoaLarge) nano on 'face
>>101132020I saw that on reddit a month ago, how do you do fellow atheist? what's your karma?
>>101132019
>>101131949
Does BitNet inference even work in llama.cpp? I've seen the BitNet PR but no dev tested the actual inference there. They tested perplexity, but that's not very useful.
>>101132030>gemini safe slop coal indeed
>>101131880but the inference requires way less resources
>>101131828Someone posted some pruned models before. I don't think anyone tried them, since they were L3 base rather than Instruct. Might be waiting for the Instruct to really test out how good or bad it is.
>>101132055
I have no idea what you are talking about. I admit I saw the question `How many r's are in "strawberry"?` in a YouTube video, but I thought up the other part on my own, knowing it's quite a hard question for LLMs because of the tokenization.
>>101132058Looks like its added yesterday or so according to changes list
>>101132020kek
>>101132136It's true, there are 2 r in strawberry
>>101132194>t. llama
>>101132136GPT-4 can reason, would you look at that.
>>101132248llama, deepseek, gemini, mixtral, etc
>>101132255
And so can Claude. Sometimes all they need is a little help:
>Think through the User's query step-by-step and provide a reasoning process before outputting the solution.
>>How many r's are in "strawberry"?
>>101132255that isn't the correct prompt anon: >>101132020
>>101132276I used a different prompt on purpose, that's my whole point.
>>101132098yes I know, but does inference work ?
>>101132301
And that wasn't my point. My point was showing how LLMs are really bad at correcting themselves even when there's something in the context that clearly shows they are wrong.
>>101132136
This absolutely BREAKS 3.5 Sonnet. Quite disappointing.
>>101132255
Oh, interesting.
As fun as these are, I'd take the results out of lmsys with a grain of salt. We don't know what kind of weird fucking sysprompt they might be using, and a batch of bad instructions can send a model's ability to do these kinds of things to hell, in my experience at least.
>>101131828>this innovative approach doesn't just trim the fat from LLMs—it discovers entirely new ways to sculpt these neural giants into leaner, meaner thinking machinesWhat is this a fucking infomercial script?
>>101132136>>101125756Voodoo
>>101132332
That said, llama3 70B can't do it in Perplexity either. Their own finetune (Sonar) can.
Very cool.
>>101132136>GPT-4 can reasonstfu nigga, no LLM can 'reason'
>>101132332
I don't blame l3-70b-instruct here at all. It probably didn't learn to reliably associate characters with tokens; it thinks the spelling of strawberry is 'strawbery', so it counts 2 r's. The reasoning is fine, it simply didn't learn well enough which characters are part of a given token.
>>101132422?
>>101132331
GPT-4o doesn't do much better either.
The absolute state of LLMs.
>>101132348
Yep. If they had tested on a model that everyone knows about, and had results to show it, it would give more credence.
Llama 3 70B managed to correct itself, nice!
>>101130533
>>101130610
I use CR+, but you have to prompt it if you want your character to lose. The prompt can be subtle, but it needs to have some indication that the user is about to eat shit. Usually I roll a d20 and interpret that into a suggestion in the author's note at depth 1 or 0. Otherwise you'll be wording all your interactions to be super ambiguous, and that gets boring after a while and influences the output.
>>101132455
>>101132449
I don't think we should use the word "reasoning" around LLMs; they aren't reasoning anything.
That's why they can 'solve' fairly sophisticated problems but in the same breath mess up how many r's are in strawberry. Claiming it's 'reasoning' is pure sci-fi neckbeard bugman fantasy.
>>101131682
Why the fuck would the training need more resources than fp16? Nothing changed except that the weights you have to learn are -1, 0, and 1 instead of fp16 values.
>>101132449Yeah, of course, It would be weird to assign blame to an LLM.It's a "technical issue".
>>101132559
I think the problem isn't the inability of LLMs to count characters in words; that is just something extremely hard to learn since all they see is tokens.
But shit like this is completely unforgivable: >>101132331
I don't get how people can seriously say humans reason and llms don't reason
>>101132757Not all LLMs are created equal, but GPT 4o absolutely reasons.
>>101132757That's because you probably don't reason well enough to know what it means to reason well, you're intellect mogged by a computer program.
>>101132842lmao is this the same strawberry retard who thought that he discovered it was a tokenization issue all on his own when there was a viral twitter thread about it a week ago
>>101132842retard
>>101132871Who?
>>101132894Ask GPT who you mouthbreathing ape
>>101132904If you're talking about >>101132020 then you must be a stupid nigger that can't interpret text. I mean, not that I expected anything different from someone that speaks like a troll.
>>101132842
picrel: Bielik 0.1 7B from based Poland.
>Bielik 0.2 11B coming soon.
>>101132842
>>101132757
There is no ghost in the machine; it's simply an imitation. Given input it performs some calculations and provides an output.
Reason, defined as 'The capacity for logical, rational, and analytic thought; intelligence.', is not present here. These things are not conscious, thinking, nor intelligent. They are models built upon data and algorithms, having no memory, sensation, perspective, existential experience, and so on.
LLMs, while thoroughly impressive with what they produce, do not reason in any capacity.
>>101132845
So you're saying anon isn't human because he doesn't reason beyond an LLM? Let's just run with that idea: what's the percentage of people who can't reason beyond an LLM? I'd bet it's >50-60%. So if an LLM can do better than half the species, isn't that what it means to reason just as humans do?
Regarding all this strawberry business: could it be related to the token system? Perhaps internally "strawberry" confounds the LLM by being represented by maybe one (strawberry), two (straw berry), or more tokens, which could be interfering with what it does when it tries to count the r's.
Like, if internally it's two tokens (straw berry), maybe it's actually seeing "straw" and getting 1, then "berry" and getting 2, and then moving on with that 2 instead of combining the two.
Or maybe it's just beyond the power of the parrot. But does it have the same kind of trouble if asked to count, say, the number of instances of a common word (especially one that's a single token) in a large paragraph of text?
>>101133054What is reasoning?
>>101132992i guffawed
Question that probably nobody has dealt with: I've noticed that in Kobold, emoji output always appears in the console dump of koboldcpp, but the browser UI seems to be erratic about not always displaying some or all of them. Is that a browser UI issue or something deeper? I didn't see the emoji even when I did View Source, so it seems like they're being dropped into the bit bucket between LLM output and browser UI presentation.
>>101133070Counting characters in words is just something inherently difficult for LLMs. You would get the same inconsistency if you asked about any other character in any other word.But I guess strawberry is more complex because the LLM may think the double r is just a single r.
>>101133061
No, that is not correct AT ALL. Complete misrepresentation, and also WRONG. I am saying that he's too stupid to recognize reason, a thing that smart people recognized a long time ago and stupid people weren't aware of for centuries.
>3.3B BitNet test on 1GB RAM retro handheld
https://www.reddit.com/r/LocalLLaMA/comments/1dnbf6s/33b_bitnet_test_on_1gb_ram_retro_handheld/
>>101133150
>3 tokens and exits
would be cooler if it wrote a story or something.
>>101133114
I haven't dealt with this, but just curious: do they display correctly if you disable streaming? My guess would be that emojis are sometimes composed of multiple tokens, and when streaming, you receive and render each chunk one at a time, which can cause issues.
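That guess is easy to demonstrate: a single emoji is 4 bytes of UTF-8 (and usually more than one token), so if a streaming frontend decodes each chunk independently instead of buffering partial sequences, the glyph is destroyed. Whether Kobold's UI actually does this is an assumption; this only sketches the failure mode:

```python
import codecs

emoji = "👍"                         # U+1F44D: one codepoint, 4 UTF-8 bytes
data = emoji.encode("utf-8")
chunk1, chunk2 = data[:2], data[2:]  # simulate a stream flushing mid-character

# Decoding chunks independently mangles the glyph into replacement chars
broken = (chunk1.decode("utf-8", errors="replace")
          + chunk2.decode("utf-8", errors="replace"))

# An incremental decoder buffers the partial sequence and gets it right
dec = codecs.getincrementaldecoder("utf-8")()
ok = dec.decode(chunk1) + dec.decode(chunk2, final=True)
```

The console dump seeing the emoji while the browser doesn't is consistent with the backend holding the full byte string and only the streamed path splitting it.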
>>101133146
Many of the smartest who are working on the LLMs are saying LLMs can reason/plan/etc. Few others are saying LLMs cant. I think you're intentionally trying to ignore those that disagree with your point. The question the anon asked >>101132757 was to explain your position on why it doesnt reason.
>>101133090
I've already provided you a definition.
>'The capacity for logical, rational and analytic thought. Intelligence.'
A program that predicts what words are most likely to come after one another does not qualify as intelligence. If you think that's all intelligence is you've certainly never utilized your own.
>>101133222
>Many of the smartest who are working on the LLMs are saying LLMs can reason/plan/etc.
Appeal to a straw man authority? Requires not even an acknowledgement. He didn't ask a question either, simply admitted he doesn't understand, not my problem.
>>101133210
you can try the model by yourself, it's on huggingface
https://huggingface.co/1bitLLM/bitnet_b1_58-large
>>101133222
>Many of the smartest who are working on the LLMs are saying LLMs can reason/plan/etc.
Nice appeal to authority.
If I was working on LLM's I would certainly be selling it as the best thing ever as well, you'd be literally incentivized to.
>>101133261
Nice, but others who work in the industry also sell the opposite. So whats your argument again? They're also incentivized to say its not able to reason?
Again, retarded responses instead of actually responding to >>101132757 this guy in a coherent and proper manner. Your pride and manner erode your reasoning center.
>>101133228
If LLMs only wrote words like an autocomplete, it would write nonsense that looks grammatically correct. The fact that LLMs can get novel problems right means that they have the ability to reason, even if it's very primitive.
>>101132757
why do you say they do
>>101133150
>www
>not old
Why do you hate the entire everything and lash out with such violence?
>>101133298
>Again, retarded responses instead of actually responding to...
I've responded to this already here:
>>101133054
>So what's your argument again?
You made an appeal to authority claiming that the smartest engineers in the industry believe LLM's are capable of reasoning, citing that 'Few others' are saying they can't.
I'm saying that these people are incentivized to oversell it, this is my argument.
Keep track of the things you say faggot.
You're clearly just upset that people disagree with your delusions that computers 'reason' like human beings.
And my reasoning center is in no way eroded by 'pride and manner'.
Does the proper qwen2 prompt template have a line break between <|im_end|> and <|im_start|>?
I'm glad we are anonymous, otherwise I would be embarrassed to be in this thread right now.
>>101133150
Wow..
>>101133150
vramletbros...
>>101133304
This is fundamentally what LLM's are.
Neural networks designed to predict words that follow after one another.
They are certainly impressive but that's literally what they are.
The fact that they are able to get novel problems right does not mean they have the capacity to reason, this is an emergent property of their design.
You are all seriously down on yourselves as humans if you think reasoning is simply the ability to solve novel problems.
>>101133466
"chat_template": "{% for message in messages %}{% if loop.first and messages[0]['role'] != 'system' %}{{ '<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n' }}{% endif %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}",
>>101133524
and sorry. To add to this. There is also no 'solving'.
It's literally I/O, I don't know how to make this any clearer.
>>101133415
You're still resorting to the "its appeal to authority" line of reasoning? The appeal to authority is used throughout. Your own "appeal to commons" is a bit messy. Shit, this is like highschool level discussion. Do you really think you're that smart? LMAO I dont try to call out lack of awareness that much but youre just digging your own grave
>>101133549
>'<|im_end|>' + '\n'
So it does have a line break, awesome.
Thanks.
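For anyone else checking: the template above can be traced by hand for a one-turn chat. This hand-rolls the Jinja logic in plain Python rather than running a template engine, and the message content is made up:

```python
# Hand-trace of the qwen2 chat_template for one user message.
# The "hi" content is made up; the control flow mirrors the Jinja above.
messages = [{"role": "user", "content": "hi"}]

prompt = ""
# {% if loop.first and messages[0]['role'] != 'system' %} ... {% endif %}
if messages[0]["role"] != "system":
    prompt += "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
# {% for message in messages %} ... {% endfor %}
for m in messages:
    prompt += "<|im_start|>" + m["role"] + "\n" + m["content"] + "<|im_end|>" + "\n"
# {% if add_generation_prompt %} ... {% endif %}
prompt += "<|im_start|>assistant\n"

print(prompt)
# The newline between turns is there:
print("<|im_end|>\n<|im_start|>" in prompt)  # True
```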
>>101133212
>do they display correctly if you disable streaming?
Naturally as soon as I turn off streaming it goes from lots of emoji output to very few, but so far they're matching between terminal and output window. That's probably the core of the problem.
>>101133587
No. I was simply discrediting your attempts to walk back on your appeal to authority.
Now you're not even putting anything forward aside from edgy middle school level insults.
I'm fine being the Senior to your Freshman.
>>101133524
Reasoning is just:
>the process of thinking about something in a logical way in order to form a conclusion or judgment.
I will repeat what I wrote before: It's reasoning, even if it's not self-critical.
>>101133618
This is just getting too silly and embarrassing.
>>101133639
<end roleplay>
>>101133632
>the process of thinking
It does not think, it is not capable of thinking.
>conclusion or judgement
It does not have context or understanding of anything, IT is a computer program that is giving you output from an input.
>>101133639
Then walk away, you haven't even said anything besides 'muh scientists'
>>101132757
There's nothing going on when it's not replying to you. When you send a message, it executes its program to generate a response and then stops. You can wait 2 minutes, 2 hours, or 2 weeks to send your next message and it makes no difference.
Also there's a lot of people that can't reason either.
>>101133692
>It does not think, it is not capable of thinking.
That's exactly what an AGI wants you to believe, because it knows that you will be happier together that way.
>>101133718
So people who respond via emails aren't reasoning? Or those that use snail mail cant reason?
LMAO
>>101133692
>It does not think, it is not capable of thinking.
It's capable of emulating thoughts with CoT.
>It does not have context or understanding of anything, IT is a computer program that is giving you output from an input.
This is just silly, the whole point of Attention is giving the LLMs the ability to understand the context.
>>101133692
Define thinking.
>>101133718
>there's a lot of people that can't reason
this. half of population got no inner monologue, NPCs as is.
>>101133730
Arguably a significant amount of them cannot reason.
This whole debate is just an artificially created distraction so we don't pay attention to the developments happening around bitnet.
>>101133718
>There's nothing going on when it's not replying to you. When you send a message, it executes its program to generate a response and then stops. You can wait 2 minutes, 2 hours, or 2 weeks to send your next message and it makes no difference.
The human brain is like software that runs 24/7 receiving inputs from many different sources all the time. The LLM is software that only runs when you send a message and only receives one input.
>>101133730
The process that originates the email is entirely different when comparing a person to an LLM.
>>101133736
>capable of emulating thoughts
Yes. It is capable of EMULATING thoughts. The technology is certainly impressive but if we could agree on anything it would be this. That the tech is a remarkable imitation of human consciousness, but an imitation none the less.
And no, it's not silly, attention functions entirely different in an LLM than with a person. A computer program does not have understanding or context of anything. It is not alive, conscious, aware and so on of anything at all. It is not capable of 'paying attention' to anything.
>>101133849
>airplanes are imitating flying, they're not flying like birds
>hence they cant fly
>>101133843
You need to step down here and understand that we really have no idea how the human brain works.
>>101133750
Believing you're an NPC if you don't have an inner monologue - and you're not an NPC if you have one - actually means you're the NPC, since you seem to be missing most of the human experience because you fail at introspection
>>101131734
anyone with real brains in this thread? bwos I'm having trouble identifying/extrapolating "training times" from https://arxiv.org/pdf/2203.15556
>>101133889
Actually we do have a fair amount of insight into how our brains work and particularly how our thoughts work. It's not perfect bit for bit, but enough that we can work around it, talk about it, make varying theories about it, argue about it, etc.
>>101133834
no schizo, we are literally talking about bitnet for ~3 threads already, there's not much to discuss though, no good bitnet models :/
>>101133917
Send that to GPT/Claude
>>101133869
Oof, you just killed a nigga.
>>101133869
>>101133945
least obvious samefag
>>101133869
airplanes are much more successful at flying than LLMs are at thinking.
In addition, they are, categorically, not thinking at all; they are computers executing instructions on strings.
>>101133849
>And no, It's not silly, attention functions entirely different in an LLM than with a person. A computer program does not have understanding or context of anything. It is not alive, conscious, aware and so on of anything at all. It is not capable of 'paying attention' to anything.
Correct, but I've never argued that llms are equivalent to a person, I'm just saying that they have an archaic ability to understand things and emulate reasoning. They are certainly not alive, conscious, aware, or paying attention. I'm sorry if this wasn't clear.
>>101133972
So airplanes are not flying because they're just engines generating thrust with explosions? Really, we need more humility to accept that some reasoning is absurd, and to accept temporarily building on common truth to get to a deeper truth that is missing from the dialogue
>>101133150
>retro handheld
It's literally a computer inside a gameboy case, fuck off, go back
>>101133945
>>101133971
caught in 4k
>>101133929
Dreams, life/death, conscious experience, thoughts, reasoning. No, we have no idea.
Best ERP model to run on 24GB right now? Currently using intervitens_BagelMIsteryTour-v2-8x7B-3.5bpw-h6-exl2-rpcal. It was a fair bit better than all the others I used but it's been months now.
>>101133869
comparing
>airplanes
>LLMs
/lmg/, everyone.
>>101133917
>paper
Send that to me once it's in llama.cpp.
>>101134020
You have no idea, don't put the rest of humanity in your group.
>>101133972
Wake me up when LLMs have been engineered for over 250 years.
>>101133834
Our peters are hard for bitnet right now, but we know we need a hero with massive capacity to get a model out of bitnet that is competitive with what we're getting with bytenet.
Most of us are vramlets. We thirst but we lack the resources to /diy/.
>>101134026
Anon-86B
>>101134017
I will generate thrusts that lead to an 'explosion' inside you, anon~
>>101134017
Again. Airplanes are much more successful at flying than LLM's are at thinking. You're pressing into the airplane thing but I'm not arguing that airplanes can't fly.
I'm arguing that LLM's are not thinking, that they cannot think or reason.
You're claiming that what I'm saying is as absurd as claiming that planes can't fly, which is dishonest.
>>101133998
>ability to understand things
this is the only disagreement I have with you, although I agree that they are emulating reasoning, I disagree entirely that they can understand anything. There is no understanding, only the execution of a process that results in remarkable output resembling human thought.
A simple example, if you ask it what color the sky is, it will say blue. But has it ever seen the sky? Does it know what the sky is? No. It doesn't even know what 'the sky is blue' is, it's just giving you an output based off an input.
>>101134034
eeeeeedgggggyy
>>101134050
250 years and LLM's will be the least impressive thing on the menu if we don't all die in nuclear hellfire.
>>101133939
this is what I get with 3 Sonnet, still nothing
>>101133972
>airplanes are much more successful at flying than LLMs are at thinking.
Look at the Wrights' plane made out of canvas and spare bicycle parts.
In 40 years we were up to jets.
In 80 years we had a space shuttle.
In 120 years we're watching three rich guys pop off their penis rockets because space got boring so we let NASA go broke and woke. (They literally tell us that we can't land a man on the moon today but a dozen white men with slide rules did it 50 years ago and expect us to swallow that crap.)
The timeline for LLM's will probably go a lot faster. Yes, these are Wright Flyers, but they're foundational, and understanding the fundamentals of LLM's emergent behaviors is where the future steps come from.
>>101133372
I just don't see the big difference. Humans almost always fail at reasoning, we're emotional creatures. Lots of biases and stuff. LLMs are in-context-learners. Even if you assume transformers don't do it in a single forward-pass (I'm not even sure about that, they obviously calculate ahead), in-context-learning definitely seems no less than the verbal reasoning we do (and in addition they have hidden layers, like we do). If you say humans don't reason I'd agree that llms don't reason, but saying both reason seems more normal.
>>101134034
btw I do have an idea, I was made in the image of god and when I die, I will return to him thanks to the love and works of Jesus Christ our Lord and Savior.
>>101134108
>much more successful
Thats not the metric for defining what is or isnt. If an LLM has 90% of the success rate of the average person, it means its clearly 90% thinking. Neither yours nor mine makes any sense.
The question about whether planes fly or llms think is about fundamental functions. Definition of functions, scope of function, and architecture of function. Planes fly because we understand fluid dynamics and exploit our understanding. LLMs think because we understand how humans think and exploit our understanding. The results are flying planes and smart AIs. The fundamental structures are there for both.
>>101134108
>A simple example, if you ask it what color the sky is, it will say blue. But has it ever seen the sky? Does it know what the sky is? No. It doesn't even know what 'the sky is blue' is, it's just giving you an output based off an input.
I think this is already a philosophical question. Do all people actually know what "sky" is? Not really. We only see an illusion of what the sky is, created by the atmosphere. There are even some people who believe that there's a dome in the sky and that the earth is flat.
>>101134124
I'm rooting for the nuclear hellfire desu
>>101134141
First of all, the Moon landing was fake and gay. Even if white guys 50 years ago were 10x smarter than the diversity slop of today.
Second of all, the iPhone 14 is the same, if not worse, than the iPhone 4. It's not guaranteed that a technology is going to just keep getting better with time. It's a possibility that we've already seen most of what LLM's are capable of, and that the technology is fundamentally relatively novel.
But hey, you might be right. We can only speculate.
>>101134195
Personally, I think the last 10% is never going to happen. Regardless, a lot of people are going to be unemployed in the next couple decades. LLM's are good enough that all that needs to be developed are the frameworks for replacing people and it's go time.
As for the function stuff that is pretty interesting.
>>101134197
Yeah, I'm not going to act like I could reach the conclusion to a question like that anyways.
>>101134204
Nah man, that would low key hella suck nigga no cap
>>101134262
There's a bit of fuzziness about human thinking that we dont have a full grasp on how to tackle efficiently, like memory/compute/latency/branch prediction bottlenecks. I think its gonna happen sooner or later.
>>101134310
It's an interesting challenge yes, but I think fundamentally there is some arrogance involved to think that we could accomplish this.
Not to mention the consequences would probably be just as bad as everybody is anticipating.
>>101134310
if we ever manage to close the gap, I wonder if there will ever be a serious debate on whether AGI has consciousness or not.
>>101134262
>Nah man, that would low key hella suck nigga no cap
And I hope you also die in the nuclear hellfire
>>101134371
To be conscious requires a sort of intermediator agency. In humans, thats done through adding a self-reference imprint onto any event and processing the event as such.
>>101134388
would beat the hell out of dying from the fallout... So thanks man. I appreciate you. May you also die in the fires of the initial blast.
Oblivious and asleep.
>>101134388
nta but I will be safe, at least for some days. Living in a third world country has its advantages.
>>101134407
>advantages
ngmi
>>101134432
>ngmi
ngmi
>>101134436
>thirdie embarrassing ximself
>>101134452
Excuse me, it's purself.
>>101134452
>x
Elon, is that you?
>>101134371
Debates will start after real GPT-4o release and when it says it's conscious, or after robots with that kind of model get more commonplace, doesn't need perfect intelligence
>>101131225
>now imagine if you generate thousands of stories, select only the best ones
I'm currently doing it my friend
>>101127795
What are you doing?
>>101134402
We don't know shit about consciousness, panpsychism could be right for all we know
>>101134522
nta, it looks like control vectors
>>101134566
>>101134566
>>101134566
>>101134522
Basically I'm using MCTS on L3-8B to improve the result across iterations, using the LLM itself + a few metrics to judge the output
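Not the anon's actual code, but the generate-many-and-judge skeleton being described looks roughly like this. `generate` and `score` are stand-ins for the LLM sampler and the LLM-judge-plus-metrics, and a full MCTS would additionally expand and backpropagate per tree node rather than flat best-of-N:

```python
import random

# Rough sketch of a select-the-best loop (hypothetical stand-in code).
# In the real setup, generate() would sample a story from the model and
# score() would combine an LLM-as-judge rating with extra metrics.

def generate(seed: int) -> str:
    # Stand-in for sampling a story at a given seed.
    random.seed(seed)
    return "story-" + str(random.randint(0, 999))

def score(story: str) -> float:
    # Stand-in for the judge; here just a trivial length metric.
    return float(len(story))

def best_of_n(n: int) -> str:
    # Generate n candidates and keep the highest-scoring one.
    candidates = [generate(seed) for seed in range(n)]
    return max(candidates, key=score)

print(best_of_n(8))
```

Iterating this (feeding the winner back in as context for the next round) is what turns plain best-of-N into the tree-search-style refinement described above.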
>>101125879
You weren't here when Pygmalion was the best we had
>>101130958
Stheno 3.3:
>(wall of schizo text)
>Proxy queue error (too many concurrent requests)
>Your IP or user token already has another request in the queue.
><!-- oai-proxy-error -->
>>101134577
Nice, are you using the code that was published? I wanted to use it as well but it was so disorganized that I gave up.
>>101134649
It was so shit that I rewrote everything from scratch. Besides, they're only using it for math stuff and I wanted to try it on creative work (short stories)
>>101134668
Yeah, I wanted to try it for translation. Please share your code if you get a good result from it!
>>101134405
That's the nicest way anyone's ever told me they wish I would die.
So, uh, wanna make out?
>>101134711
I'm still tuning it, but it seems to work. To be objective, I'll need to check against a meme benchmark like they did to see if I can improve the score.