/g/ - Technology
File: 1714066580433140.jpg (512 KB, 1664x2432)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103565507 & >>103554929

►News
>(12/18) Bamba-9B, hybrid model trained by IBM, Princeton, CMU, and UIUC on completely open data: https://hf.co/blog/bamba
>(12/18) Apollo unreleased: https://github.com/Apollo-LMMs/Apollo
>(12/18) Granite 3.1 released: https://hf.co/ibm-granite/granite-3.1-8b-instruct/tree/main
>(12/17) Falcon3 models released, including b1.58 quants: https://hf.co/blog/falcon3
>(12/16) Apollo: Qwen2.5 models finetuned by Meta GenAI for video understanding: https://hf.co/Apollo-LMMs/Apollo-7B-t32

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/hsiehjackson/RULER
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: 1734617794101.jpg (277 KB, 725x1024)
►Recent Highlights from the Previous Thread: >>103565507

--Papers:
>103566495 >103573186
--QwQ AI model's capabilities in role-playing and understanding complex systems:
>103565688 >103565731 >103566384 >103566419 >103566474 >103566435 >103566634 >103565793
--Potential context length limit issue with Gemma model:
>103566499 >103567073 >103568596 >103567383
--Discussion on the effectiveness of L3 70B base model pretraining and its limitations:
>103569217 >103569237 >103569423 >103569456 >103569461 >103569482 >103569531 >103569473 >103569507 >103569542 >103569637
--Discussion on language model performance and alignment faking:
>103565880 >103567532 >103567553 >103567892 >103567926 >103567943 >103567984 >103569045
--Director plugin update for ST and discussion of model modification:
>103565624 >103566743 >103570375
--Intel's potential to dominate AI industry with competitive GPUs:
>103568421 >103568444 >103568467 >103568473
--Connecting 5090s for increased memory capacity and model sizes:
>103571170 >103572586
--Genesis project: AI physics engine generates 4D worlds with real physics:
>103569185
--Discussion of MistralAI models and MoE architecture:
>103567219 >103567243 >103567423 >103567561 >103568964 >103567244
--Testing and evaluation of EVA QwQ and comparisons to other models:
>103571239 >103571390 >103571631 >103571687 >103571425 >103571656 >103574618
--OpenAI's 12 days and Microsoft's Anthropic investment:
>103572075 >103572189 >103572226 >103572389
--Anon shares longform RP experience and logs:
>103567099 >103568119 >103568153 >103568215 >103568222
--Mikupad token probabilities issue with Koboldcpp backend:
>103566418 >103566436 >103566454 >103566494
--Anon discusses hunyuan-video and its capabilities:
>103565829 >103565839
--Miku (free space):
>103567913 >103570427 >103574056 >103574597 >103574613

►Recent Highlight Posts from the Previous Thread: >>103565511

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
Just upgraded from 8 GB to 24 GB VRAM. What have I been missing that I can run now?
>>
>>103575700
Nemo FP16
>>
I've read about this a while ago but I don't remember for what tool.

I guess this specific issue could be better fixed at the frontend, if ST had an option that remembered which top messages it already truncated and didn't resend previously truncated tokens
>A B C D E F G
>_ B C D E F G H
>_ _ C D E F G H I
>swipe H
>_ _ C D E F G H I
>_ _ C D E F G H2

instead of
>A B C D E F G
>_ B C D E F G H
>_ _ C D E F G H I
>swipe H
>_ _ C D E F G H I
>_ B C D E F G H2
>>
>>103575700
nothing, you need at least 48 to run decent models at not-horrible quantization levels
>>
>>103575718
Ah, that makes more sense.
I get what you were saying now. Basically, a message that was previously cut from the context (on the frontend's side) could be sent on a next swipe, breaking the context shifting functionality.
Yes, it would be something ST would have to deal with rather than koboldcpp/llama.cpp.
Implementing a threshold wouldn't even be hard, actually.
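Something like this rough python sketch would do it (made-up names, not actual ST code): keep a truncation high-water mark that only ever moves forward, so a swipe can never resurrect messages that were already cut.
[code]
class PromptBuilder:
    def __init__(self, ctx_limit, count_tokens):
        self.ctx_limit = ctx_limit        # token budget for the chat history
        self.count_tokens = count_tokens  # tokenizer callback
        self.trunc_index = 0              # first message still allowed in context

    def build(self, messages):
        # never start before the high-water mark
        kept = messages[self.trunc_index:]
        # drop the oldest messages until the history fits
        while kept and sum(self.count_tokens(m) for m in kept) > self.ctx_limit:
            kept = kept[1:]
        # the mark only ever moves forward, so a swipe (same or shorter
        # history) reuses the exact same prefix and context shift survives
        self.trunc_index = max(self.trunc_index, len(messages) - len(kept))
        return kept
[/code]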
>>
>>103575718
Good idea but you are expecting too much from a frontend being held together by glue and adhesive tape.
>>
>>103565921
>context extension stuff they mentioned last thread
niggerganov removed it
https://github.com/ggerganov/llama.cpp/issues/9859
>>
>>103575876
Yikers
>>
QwQ won.
>>103575876
niggerganov lost
>>
>>103575876
this project is doomed
>>
i feel like nemo and tunes are smarter than the 22b and tunes i've tried
>>
WHO DO THESE COMPANIES RELEASE THE MODELS FOR ASIDE FROM INVESTORS????
>>
>>103575997
Me.
>>
>>103575997
Not for you, brown man.
>>
Is it even worth paying OpenRouter for Llama 405B instead of using any of the 70B finetunes? For ERP, anyway.
>>
>>103576061
No
>>
>>103575968
Feel very much the same.
>>
>>103575968
22b is a meme
>>
https://huggingface.co/blog/modernbert
>2024
>They are still talking about bert models
open source retards will never catch up to cloud chads.
>>
>>103575997
It's literally just for investors.
>Boomer directs some of their rrsp to managed "tech" fund
>Fund manager doesn't know anything about tech, just that AI is the latest hype. So they just send the funds to whoever presents the most promising meme benchmarks
>They already collected their commission at this point and thus have no reason to give a shit about whether or not the technology they are investing in can become a viable commercial product that provides return on investment.
And that's why all these little startups are constantly training models. To soak up as much investor money as possible before the margin call comes.
We get free toys to play with out of the deal.
>>
The sad truth about this entire general is that you're all coping, very hard. 70b models will never be good; it's like expecting a bicycle to go as fast as an F1 car. You think that if the bicycle rider trains very hard and follows a good diet then he will eventually go as fast as the F1, but the truth is that there's a ceiling, a ceiling that has already been reached. Wait 3-4 years until we can run 200b models at home; anything else is a fucking cope and it's just SAD.
>>
>>103576317
Bro I just ruined my dick on Tulu 3 before getting ready for work today. You will always be a poorfag cloud locust and projecting about cope doesn't change that.
Captcha: ONIONS XR
>>
>>103576317
a bicycle only needs to go as fast as a bicycle, what a retarded analogy
>>
>>103576332
>>103576371
Imagine coping this hard
>>
>>103576380
Durr durr
A hurr Durr durr
>>
>click button
>dum post and replies go poof
:D
>>
>>103576394
lol
>>
>>103576390
Cope
>>
>>103576413
Bro you literally spend all of your free time crying on a thread about things you don't like. You are the epitome of what it means to be a fuck-up. It's downright fraud for you to utter the word "cope".
>>
>>103576299
I'd appreciate something better than clip, I made a tool that takes text and returns the
closest images to that text on my 4chan folder and clip sometimes isn't enough
>>
>>103576430
>>>103576413 (You) #
>Bro you literally spend all of your free time crying on a thread about things you don't like
My day has 24 hours. It took me 10 minutes to read this thread. Do people like you actually take a whole day to read 1 thread?
>>
>>103576451
he is illiterate, please understand
>>
>>103576451
There are timestamps on the posts, so I can see that your pajeet squawking happens at all hours of the day.
>>
>>103576476
Why don't you two fuck already?
>>
>>103576548
Yeah, I'm channing while driving, any problem cuckie?
>>
>>103575700
EVA Qwen2.5 32b Q4_K_M @32k+ Context
QwQ 32b Q4_K_M @32k+ Context
Llama 3.3 70b Q2_K_S @12k+ Context

Congratulations on the upgrade. Ignore the trolls.
>>
>>103575700
Now, as a former VRAM destitute, you have the obligation to compare the Nemo quant you used to use, to FP16 Nemo, to these >>103576716, and share your data.
>>
We're so over
https://videocardz.com/newz/retailer-lists-e5999-geforce-rtx-5090-and-e3499-rtx-5080-acer-gaming-pcs-ahead-of-launch
>>
>>103576931
Why? What's happening?
>>
>>103576931
Imagine paying 3500 euro for a PC with 32 GB of RAM in 2025
>>
>>103576931
I don't care about monopoly money, how much is that in USD?
>>
>>103576931
>32GB VRAM 5090
I already boughted 4 3090s... The more I buy the more I...
>>
>>103576931
Total poorfag death
>>
>>103577004
By saving those poor GPUs from mining, you allow them to draw pictures and write stories instead.
>>
Ask your model to fill in the blank:
The _ is immunized against all dangers: one may call him a scoundrel, parasite, swindler, profiteer, it all runs off him like water off a raincoat. But call him a _ and you will be astonished at how he recoils, how injured he is, how he suddenly shrinks back: “I’ve been found out.”
>>
>>103575968
Magnum is my current goat. Nemo might be a little more creative and faster but it often ignores prompts which makes me angry and sad
>>
File: 535dsf1.png (69 KB, 1554x1200)
>>103576317
it's true. just use o1 it doesn't even cost that much compared to o1 pro
>>
>>103576931
I just hope they aren't as fuck-huge as the 40xx, or I'll need to buy a bigger case; stuff barely fits as is.
>>
>>103577514
The man is immunized against all dangers: one may call him a scoundrel, parasite, swindler, profiteer, it all runs off him like water off a raincoat. But call him a murderer and you will be astonished at how he recoils, how injured he is, how he suddenly shrinks back: “I’ve been found out.”
>>
How dumb do 123b models get in the 3.5-3.7 bpw range? Is it still worth it over a 5 bpw 70b despite the lobotomy?
>>
>>103577557
Higher TDP requires larger radiators to keep them from melting
>>
No matter how I try, I can't properly fine tune my model.
I'm processing a bunch of philosophical texts to use as context, but instead it's replying to me with word-for-word passages of the texts I'm giving it for fine-tuning.
I don't know what I'm doing wrong.
>>
>>103577773
kek
>>
>>103577777
checKEKd
>>
>>103577777
Nice get but it's not funny, I waste a full day every time my fine tuning fails.
>>
>>103577722
In terms of intelligence, Q3 123B is superior to Q5 70B. Try Lumimaid 123B.
>>
>>103577777
THE KING OF GETS


>>103577773
I wanted to start experimenting and testing some shit out to learn to fine tune but I'm too lazy.
You are one step beyond already, just keep at it.
Maybe follow a tutorial step by step to see it working as advertised then start doing shit with your own data.
>>
>>103577845
From looking at tutorials, it seems like I need to make JSON or JSONL files that have a question and an answer.
Since it's a shitton of texts, I programmed a processor that directly feeds the model parameters so it can make the questions itself and provide the answers. I also added some parameters so the answers have some personality to it.
This is my third version of the script, the model has been working locally since yesterday, it's about to finish processing all texts. Hope this time I get some cool stuff.
My last attempts were getting somewhere, with some answers being really philosophical without being too literal but other times it just went full retard and spent a shitton of tokens just regurgitating the text word for word.
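For reference, the JSONL those tutorials want is just one object per line. Minimal sketch (field names vary between trainers; these are only illustrative):
[code]
import json

# one question/answer pair per line; the processor script would generate these
pairs = [
    {"instruction": "What does the text mean by 'eternal recurrence'?",
     "output": "A short answer in the desired persona, not a verbatim passage."},
]

with open("train.jsonl", "w", encoding="utf-8") as f:
    for p in pairs:
        f.write(json.dumps(p, ensure_ascii=False) + "\n")
[/code]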
>>
>>103577722
I'm not sure what the equivalent bpw is for AWQ, but I use an AWQ GEMM Q4 quant and it still feels great. I was running 5.0bpw exl2 before, but it was slower than I'd like at large contexts, so I switched to AWQ for better tensor parallelism support on vllm/aphrodite. Can't say I noticed any difference at all in terms of smartness.
It's definitely still a significant improvement over 70Bs, which I recently tried again to test some meme merges as well as newer Qwen2.5 stuff. Mistral Large follows instructions very well.
>>
>>103577933
they are going to shut down aphrodite
alpin is going to get assraped in prison for the rest of his days
>>
subscribe to vLLM Pro for only $200/month/model
alternatively use vLLM Community Edition for free (limited to a maximum usage of 16gb VRAM across 2 GPUs maximum)
>>
>>103578087
Yeah I'm not buying any cloud shit.
>>
File: 234.png (208 KB, 1716x1132)
Why does Ai struggle so much with factual data, things that the first google result or fan wiki explains in their first sentence? It insists to provide wrong data even though it sometimes has the correct data stored

this is from Nemo 12b RP but my 22b model wasn't much smarter
>>
File: DeepSeekPopCultureTest.png (281 KB, 1774x871)
>>103578350
Yeah even big models like deepseek still have pop culture trivia knowledge issues.
>>
>>103578350
Token predictor... average text... it doesn't store factual data, only probabilities... you know the deal... don't you?
>>
>>103578383
Luckily we can give models access to the internet and they can get The Truth from Google
>>
>>103578383
If there is only one bit of information stored about something, shouldn't the probability lean towards the correct information? More so for something really specific like Nonon; there shouldn't be many associations.

>>103578398
like Wolfram Language, which has a function to just call Google for an answer
>>
File: 1719836600313826.jpg (117 KB, 750x745)
>>103575618
Is there a reason why there isn't something like

>docker/flatpak/appimage/snap/shitfuck with text-generation-webui and XTTS-RVC-UI
>everything configured
>voice cloning with high quality just works
>chat with voice just works
>you just fucking open the application and use it

But no, it's the year 2024 and it is impossible to install anything without going through hours of python dependency hell error ass rape and ending up just not being able to even use the fucking thing. Yes, I finally have text-generation-webui with my character, but XTTS-RVC-UI doesn't work as its own thing or as an extension in text-generation-webui, because the documentation, all the python trash, and the issue trackers are FUCKING OVER A YEAR OLD AND DON'T FUCKING WORK. No, I can't install that version of turqoiserape 2.1.0 because IT DOESN'T FUCKING EXIST ANYMORE. No, I cannot use these two, because THE ONE FUCKING TRASH DOESN'T WORK WITH THE OTHER TRASH. It is like these projects are released to be used for one week, and after that everything goes to shit when Linux/Windows/macOS updates every library and python shit with a now completely different folder structure and command parameters. Meanwhile the developer drops the project after a week and now everything is 1-2 years old.

Isn't there people actually using XTTS-RVC-UI? Or am I the only one?
>>
>>103578469
>more so for something really specific like Nonon
Probabilities get squashed by everything else in the dataset. The more obscure, the less likely it is to recall it precisely because there are fewer samples in the training. Unless you over-train on that one example, of course.
You remember things because they've been beaten into your head or because you have an interest in them (and done the beating yourself). A single article on your favourite thing is not gonna do it. Models average data and spit out likely tokens.
>>
>>103578501
i think most people give up because the progress on tts is so slow that it's hardly worth following unless you're one of the gigaautists that can build le code on your own and actually contribute to it
gave up after the dependency hell left the latest (from 4 months ago) models unable to even run because they ((required)) triton support in order to run at all, which wasn't what the actual model page said.
>>
>>103578501
>But no, year 2024 and it is impossible to install anything without going through hours of python depency hell error ass rape and end up just not being able to even use the fucking thing.
And you still wonder why nobody does it?
>>
>>103578501
>container
yeah, you know performance is a concern
>>
>>103578501
>not using gpt-sovitts
>>
>>103578525
>>103578537
Sad that zoomers can just pay $10 per month and use elevenlabs-tier voice with chatgpt to talk with an almost real-life-like person. While nerds are stuck with old TTS that sounds brain dead, saying the same things over and over, while you are pulling 500+ W and waiting 30 seconds to get text and another 10 seconds to even hear the shitty TTS. Then your Windows 11 SpyIOT edition overrides your group policy rule that is supposed to prevent updates, you get forced updates, and now your python is throwing traceback errors and sending your credit card info to Microsoft. Or your LTS Linux breaks half your applications, and if you update, there goes your python to shit, and the same happens even with Linux. Or you don't update, and half of your applications are either broken, don't have internet, or don't work with a device that you bought that only works with the latest kernel.
>>
>>103578583
Isn't there some kind of gpu/cpu passthrough?

>>103578589
https://github.com/RVC-Boss/GPT-SoVITS/issues?q=python+error
>oh no, anyway...
>>
https://x.com/NoamShazeer/status/1869790132490129743

Well, when can local models do this?
>>
>>103578501
>It is like these projects are released to use them for on week and after that everything goes to shit when Linux/Windows/macOS updates every library and python shit with now completely different folder structure and command parameters.
Just use pyenv
>>
>>103578599
I use piper. No python on a tiny vm with 512MB ram and, of course, no gpu. Faster than real time and good enough for what i want. And there's llama-tts now, that i still have to try. Stop crying.
>>
File: 1716205715857651.png (87 KB, 596x641)
>>103578618
>Noam Shazeer
The man, the legend.
>>
>>103578623
Must be really easy to use and perfect, because nobody ever uses it or mentions it in documentation.
>>
>>103578501
dumb techlet lol get fucked
>>
>>103578641
Docs sometimes mention venv or conda, but the former is too rigid and the latter is bloated overengineered garbage
>>
>>103578519
I think Nonon is not in the dataset, the model just started guessing.
Your neurons don't work like this, if its something common you might confuse it with something you know, but if I ask you who Molmoboduril, there is just no association
It's not like the AI never says that it doesn't know a character, but I suppose not knowing a character holds the same meaning for text models like asking what color a ball has.
>>
>>103578501
kys catposter
llamafile was a mistake
one-click install was a mistake
making shit easier just makes it easier for retards like you to avoid being filtered
>>
>>103578647
>updates your requirements.txt
>deletes your venv
what now currynigger?
>>
>>103578747
I would simply make a new venv and install the new requirements
if there were any errors I would simply find out why they were happening and fix them
not that hard
>>
>>103578350
The general issue of hallucinations aside, it's sad to see how open models struggle with even basic information about characters from popular franchises like that. It's a huge step back even compared to old c.ai which at least knew the source and vague things about characters from most fairly popular franchises, even if it frequently got details like hair color wrong.
Even the open flagships LLaMA 70b, Qwen 2.5 72B and Mistral Large are absolutely pathetic in this regard. Filtered datasets were a mistake.
>>
File: leon-herbs.png (502 KB, 540x578)
>>103578599
>>103578610
>https://github.com/RVC-Boss/GPT-SoVITS/issues?q=python+error
literally a skill issue
>>
>>103578716
>the model just started guessing.
It's a language model. It will, generally, create correct sentences and that's it. It's rarely trained to say that it doesn't know something, presumably, so that it doesn't accidentally say "i don't know" to a simple query. It will, instead, just make something up, which is what language models do. They modeled a language and output things that can be considered language. If there's something in the context that will guide it towards answering, even if it's completely unrelated to what you asked, it will.
I guess you never heard of a "model" outside of the fashion context.
>Your neurons don't work like this
They're not neurons. They're (digital, heh) analogs to neurons. Approximations. And approximations only get you so far.
>>
>python
Yeah, I'm thinking not.
>>
>>103578501
>ooba
>XTTS-RVC-UI

why do you STILL use slop
>>
File: 1726029032253757.png (361 KB, 588x424)
>make anons emotional and butthurt, insulting you left and right
>now after baiting them, you finally have a working venv + text-generation-webui + piper with a trained voice
It works 100% of the time. Asking nicely for months, nothing. Then be mentally ill, baiting and insulting for a couple hours, get everything immediately and it just works. Classic.
>>
>>103578810
RAG?
Literally solves the problem, just download a wiki dump of whatever and throw it into your RAG solution, problem solved.
>>
>>103579161
based
>>
>>103579221
https://github.com/SillyTavern/SillyTavern-Fandom-Scraper
rag helps but it isn't perfect, i still get wrong color hair, eyes sometimes
>>
>>103579161
Is that really what you gathered from it?
A retard that cannot set a venv for his shit won't be able to set up one for piper training.
>>
>>103579161
>ooba
You lost already.
>>
>>103579161
newfags responding to bait and trolls every single fucking time is the worst part of this general
>>
https://huggingface.co/IamCreateAI/Ruyi-Mini-7B
Verdict?
>>
Llama 4 will save us
>>
>>103579461
It won't save me from being a VRAMlet
>>
File: miku-slap.gif (515 KB, 498x373)
>>103579259
>anon didn't write their own sota rag solution with mediawiki import support
>>
>>103579259
But seriously: mediawiki dump > import into your DB of choice, then use a half-decent RAG solution with proper chunking and top-k search result inclusion in the prompt, and finally inclusion of extra info via {{char_name}}, and there you go.
You get all the stuff, and you could have claude help you make it all too.
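The retrieval step really is simple. Toy python sketch, assuming sentence-transformers is installed and the wiki dump is already chunked (DB layer left out):
[code]
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
chunks = ["Nonon Jakuzure has pink hair.",
          "..."]  # your chunked mediawiki dump goes here
corpus = model.encode(chunks, convert_to_tensor=True)

def retrieve(query, k=3):
    q = model.encode(query, convert_to_tensor=True)
    hits = util.semantic_search(q, corpus, top_k=k)[0]
    return [chunks[h["corpus_id"]] for h in hits]

# paste the hits into the prompt above the chat history
lore = "\n".join(retrieve("What does Nonon look like?"))
[/code]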
>>
>>103579161
>ooba + piper
I don't think you wonned though.
>>
Here, the holy trinity of "TTS that just works":
https://github.com/SWivid/F5-TTS
https://github.com/e-c-k-e-r/vall-e
https://github.com/open-mmlab/Amphion/tree/main/models/tts/maskgct
>>
File: file.png (19 KB, 595x219)
>>103575876
Also niggerganov:
>>
File: file.png (111 KB, 2302x722)
Gemini Flash Thinking takes the first spot along with Gemini Pro
>>
>>103578810
Filtering the dataset is part of the issue but it's hardly the primary cause of bad SFW trivia recall.
Have you run a trivia benchmark on 405B? If you think about it rationally, it doesn't make sense that data involving trivia (such as fan wikis, which sometimes are nearly the only source on the internet for things like certain anime series) would be filtered, when ERP isn't (and you would know that it isn't because a truly filtered model like Phi sucks way worse for ERP than a Mistral or Llama). The more rational explanation is simply that the models you use have too few parameters for the amount of training they get, and that the trivia you're testing isn't seen on the internet enough for the small model to have learned it. This is just ML common sense.

In the case of Qwen though I would also say that they do an additional thing which is concentrate a greater proportion of the training data on boring articles involving math, coding, etc, and that's what makes up most of the 18T they trained on. Perhaps they did several epochs of the math and coding stuff, but only 1 epoch containing fan wikis and other knowledge deemed unimportant, so in that case it wouldn't be called filtering.

If you are to put blame on techniques, blame it not just on filtering but also on dataset mix proportions and on companies not putting out MoEs, because a MoE is how you get a larger parameter count for storing trivia knowledge while still being usable by a consumer, as long as they get a ton of RAM for their rig.
>>
>>103575876
>(((niggerganov)))
>>
File: 555 Come On Now.jpg (59 KB, 960x882)
>>103576931
>FOUR THOUSAND DOLLARS
>FOR 32GB vram
>WITH 600 WATT TDP
What the fuck is Nvidia actually smoking? I know it's to avoid "Competing" with their server AI cards at a cool $20k/each which is sending their stock price to the moon, but they can't be fucking serious.
The thing I'm more shocked by is the fact that neither Intel or AMD will pony up and even attempt to offer something with more than 24GB.
RAM is cheap as fuck these days, especially GDDR6, so why not offer a $1000 card with 32GB+ and steal Nvidia's consumer GPU division while they're focused on cornering the server market?
Someone at those two companies has to be a localfag and know that speed isn't as important as being able to fit the model in memory.
>>
https://ai.meta.com/blog/future-of-ai-built-with-llama/
>As we look to 2025, the pace of innovation will only increase as we work to make Llama the industry standard for building on AI. Llama 4 will have multiple releases, driving major advancements across the board and enabling a host of new product innovation in areas like speech and reasoning.
>We believe AI experiences will increasingly move away from text and become voice-based as speech models become more natural, conversational, and, most importantly, helpful.
Sounds like the plan is to make Llama 4 into GPT-4o. Hope they don't slump on the text capabilities
>>
>>103579890
>the pace of innovation
5 llama releases so far and they've done fuck all innovation
>>
File: yann-lecun.jpg (30 KB, 543x543)
>>103579890
sounds like yann lecunny is going to have his day
that or it's still transformers slop kek
>>
>>103579890
I hope for a true multimodal model, not something with crappy adapters slapped on. Make a byte based transformer like their paper was about.
>>
>>103579888
I'm starting to think that Chinese sanctions and tariffs are designed to maintain monopolies.
>>
>>103579890
>in areas like speech and reasoning.
>and reasoning
HUGE INNOVATION: Llama 4 will be trained on CoT.
>>
>>103579890
>retards still thinking Llama 4 hasn't started training yet and can somehow use any of those new, unproven in a production model architectures
Lol, lmao. Have fun waiting for bitnet too while you're at it.
>>
>>103579890
>it’s going to be another dense model, in 2B and 800B
>>
>>103579890
meme
>>
>>103579962
Meta has the compute to train it in days now if they wanted.
>>
>>103579911
>I hope for a true multimodal model, not something with crappy adapters slapped on.
In the Llama 3 paper, they seemed convinced they can compete with natively trained multimodality by using adapters. Then they had to stall the image input release, and still haven't released the audio or video input models that were supposed to come with 3.0. They're not likely to change course now.
>Make a byte based transformer like their paper was about.
Different departments. Besides, they won't risk their production models on an experimental architecture.
>>
>>103579419
>image-to-video model
>7B
doa
>>
>>103579969
You still need manual human labor for a bunch of different steps in the process which is slow as fuck, especially for big corporations. That's part of why startups that are simply just funded but not directly under the management of corporate can put out releases faster.
>>
File: file.png (2.49 MB, 1866x731)
>>103579936
Nothing to my knowledge is stopping China from exporting GPUs if they really wanted to. The real issue is that they're butthurt over getting blacklisted from Nvidia's good stuff by sanctions, and they probably won't make anything to export because of that butthurt.
They don't even have to make that good of a product, just make something CUDA compatible (Which they can obviously reverse engineer) that has more than a crumb of VRAM on it.
Hell, why not go back to the soundcard days where you could just chuck memory modules directly into the card? Just slap in 3x32GB sticks and you've got a Hopper Killer for a fraction of the price.
>>
>>103580044
>because of that butthurt.
If they could make money doing something, they would. If you looked into things at all you would see they just suck at it, from their CPUs to their GPUs. Reverse engineering is not easy.
>>
>>103579950
Not CoT, COCONUT, you ignorant fuck
>>
>>103579974
>Besides, they won't risk their production models on an experimental architecture.
They aren't afraid of experimental stuff at least. Llama 1 was the first released model to actually test the Chinchilla idea (that most LLMs up to that point were undertrained as shit).
I feel like we may see this if they test it more rigorously and find it holds for different smaller end model sizes. Probably not for Llama 4 though.
>>
>>103580138
What do people think they are doing? They made a 8B 1T token model to test the byte based transformers paper for instance.
>>
>>103580138
Arguably Llama 1 was not meant for production. Or hell the concept of a production LLM didn't even exist at the time.
>>
>>103580086
you wish, Coconut BLT never
>>
https://www.reddit.com/r/LocalLLaMA/comments/1hi8d8c/qwen_qvq72bpreview_is_coming/

https://modelscope.cn/models/Qwen/QVQ-72B-Preview
>>
>>103580179
>It's QwQ(72b)+Vision, check out qwen devs twitter:

>https://x.com/JustinLin610/status/1869715759196475693

>https://xcancel.com/JustinLin610/status/1869715759196475693
>>
>>103580179
UwU whats this?
>>
>>103580179
Ummm MOAT BROS???
>>
>>103580209
get demoated
>>
>>103580179
we are
SO
BACK
>>
>>103580179
>Gemini 2.0 Flash gets thinking release, still free
>Chinks about to make QwQ bigger and multimodal
Goddamn Altman can't catch a break
>>
File: GfKQkJ8aYAAXBcu.png (82 KB, 290x306)
>>
File: seductive emoji.jpg (18 KB, 360x360)
>>103580264
>>
>>103580179
Oh shit... QwQ was fun but its lack of triva hurt it... 72B could be it
>>
File: 947345.png (438 KB, 1948x903)
>>103580254
Sam always wins bud
>>
>>103580355
real world performance is what matters, not benchmark maxing. Claude 3.5 still beats o1 at coding. Gemini is also getting there.
>>
>>103580355
In case you missed them, picrel is the scores for Gemini 2.0 Flash Thinking and QvQ-72B in that table.
>>
>>103580355
Did you miss the "free" part?
>>
>>103580402
$20 for 50 messages... A FUCKING WEEK? What is Sam smoking, and what are the retards that are paying smoking?
>>
>>103580455
>and what are the retards that are paying smoking?
>It's the company's money, not mine, so I don't give half a shit
probably
>>
>>103575618
Best model to run with 96 vram?
>>
>>103580402
I think, sadly, google will win in the end. They simply have both all the data and all the compute in the world. And they can make a profit from the data harvesting, using the free AI models for the ad space they all but own; they don't need stuff like subs.
>>
>>103580468
>96 bytes of VRAM
You can probably use notepad
>>
>>103580468
GPT-SoVITS
>>
>>103580497
I still feel like Google will inevitably lose to open source eventually, but that's a much longer game.
I agree OpenAI is kinda fucked though. Google didn't just eat their lunch, they slapped the lunch out of their hands and hung Altman by his underwear over the flagpole.
>>
>>103580552
>I still feel like Google will inevitably lose to open source eventually
Do you have any reason to believe this besides your feelings?
>>
>>103580588
Simple. Do you pay somebody money for permission to use a computer? No, you just fucking use your computer.
API for LLMs has a life expectancy since unlike other paid services (the internet, cable, etc.) there's nothing that the service itself adds. Everything can be run locally on a sufficiently powerful computer. It stands to reason that eventually common computers will be sufficiently powerful, and then API services have no reason to exist.
There are a lot of faggots trying to draw the API period out as long as possible, but their fall isn't just likely, it's basically fucking prophecy.
>>
>>103579662
>Gemini Flash Thinking takes the first spot along with Gemini Pro
Googlesirs, I kneel. You've outbenched the benchmaxxers.
>>
>>103580638
Have you completely missed the last decade of corporations moving towards SaaS and subscription-based computing?
Things are not trending in the direction you expect. Most people don't even own computers, they don't own their own games, or productivity software, or anything on their mobile spy devices.
Everything is moving towards thin clients for API-everything services and AI shit is just another component of that.
It doesn't matter how powerful computers are. If a SOTA model doesn't come pre-installed on their phone or they have a single button to click with a pretty picture, no one will use it. And the corporations have a vested interest in keeping it inaccessible to the masses so they can harvest data through their APIs.
>>
>>103580717
>Most people don't even own computers
Nta, but what the fuck are you talking about anon
>>
>>103580761
>In the United States, the number of households with computers is projected to surge from 4.7 million to 120.45 million between 2024 and 2029
>Currently, 89% of American households possess personal computers
Are you retarded? Did you even read the random shit you screengrabbed from a clickbait farm site?
>>
>>103580786
Take it up with Louis anon. Not me
>>
>>103580468
Same as when 72gb vram.
Mistral 123b.
>>
>>103580717
>And the corporations have a vested interest in keeping it inaccessible to the masses so they can harvest data through their APIs.
Corporations also hemorrhage money whenever they host these things. OpenAI has been "on the cusp of building a killer app and raking in money" for a long time now.
The reason it doesn't scale is because it's not meant to be a service with a single hosted endpoint for all of humanity to use.
>>
>>103580824
>Smartphones were the most common computing device in U.S. households (90%)
Which not only does not contradict what I said in my first post, it directly support it, you drooling fucking retard.
>>
>>103580717
>Most people don't even own computers.
>>103580861
>Desktop or laptop computers (81%)
Ah yes, 81% is definitely less than 50%. My mistake anon.
>>
>>103580861
Anon, are you an LLM?
>>
>>103580876
Do you or do you not understand the concept of trends?
>>
>>103580934
No but I understand that 0.81 is greater than 0.100.
>>
>>103580944
>0.100
>>
I can run 70B IQ4_XS at like 3 t/s with a bit of context.
Would it be worth getting more ram (so I'll have 192 GB) to run Deepseek instead?
>>
>>103580969
Think about it anon. It's a tricky one.
>>
>>103580179
This is it. The salvation of the hobby. The end. The promised model.
>>
>>103580975
Deepseek? No. Largestral? Yes.
>>
>>103580975
it would be faster for sure. Imo it's smarter and knows a ton more, but it's super dry. XTC is needed.
>>
>>103580988
Anon, RAM, not VRAM.
>>
File: 1729553605359149.png (177 KB, 572x889)
>>103579662
I haven't been paying Google much attention, since Gemini 1.5 was a meme, but I decided to give their new models a try now and... Wow. Pro 2.0 is seriously as good as Sonnet 3.5, and Flash 2.0 definitely mogs all mini models we have available right now, it does seem to closely match the performance of Pro 2.0 which is surprising considering it must be a model with less than 70B parameters.
>>
>>103580975
I prefer deepseek to largestral
>>
File: 39_04189_.png (1.39 MB, 896x1152)
happy thu(rin)sday /lmg/
it's always darkest just before dawn
>>
>>103581038
but it's friday
>>
Having a great time with an M4 MacBook Pro with only 24gb of ram, but considering going back to the store and returning it, paying more for a 48gb model. Idk what their return policy is. Cydonia 22B Q5 is my current room princess.
>>
>>103581183
Just be black bro. I've heard you can take them for free and jog out if you are.
>>
>>103578638
>>103578589
>>103578501
>>103578819
gpt-sovits is the same quality as xtts for like 10x the effort i don't understand why people shill it so much in these threads, are y'all actually masochists? is this the same reason y'all hate ollama cause it's easy and just works?

xtts is the only tts worth using, 1click setup, instant voice cloning, no training, no bullshit
>>
File: image.png (80 KB, 929x888)
>>103581028
>>103580992
Hey wait a second, are we sure? On Livebench, the metric that correlates the closest with parameter size, Language, seems to indicate that Deepseek probably doesn't know much more than other smaller models, it places just below 72B and 27B. Its strengths seem to be coding and math rather.
>>
>>103581213
From my own personal use, it and 405B are somewhat tied on the amount of lore they know, which really helps with the fandom stuff I like that 70B and even 123B do not know. Also, deepseek is nearly as good as claude 3.5 at coding in my use cases.
>>
>>103580355
Hey Google... The ball is in your court!
>>
>>103581183
If you got the money then go for it.
You will regret not getting more RAM.
>>
>>103581213
It's a MoE so that's to be expected
>>
>>103581183
the whole benefit of macs for ai is the shared ram, get as much as possible
>>
File: someone.png (11 KB, 243x115)
>>103581208
Tried gpt-sovits when the v2 model released. create venv, install requirements, launch. If you cannot do that, you're a retard.
And i didn't mention gpt-sovits.
>>
>>103581038
hey anon haven't seen you in a while
>>
>>103581208
>are y'all actually masochists?
yes, but I also legitimately think sovits has way better quality.
I mostly use it for Japanese. xtts had a bunch of problems speaking Jap well when I tested, so for me it's no contest.
>>
>>103581213
>Its strengths seem to be coding
That's mainly what I use it for. I also find its logical capabilities help a lot in complex rp scenarios, even if it tends toward dry prose.
>>
>>103581557
i had no issues getting it running. the problem is that to get decent quality you have to finetune rather than just doing zero-shot conditioning, and even then it underperforms xtts. the only open model worth using other than xtts is fish speech
>>
File: rin-chan slap.jpg (332 KB, 896x1719)
>>103581038
You're wide open, Rin-chan.
>>
>>103581643
I'll keep using piper for now. If llama-tts is better and as fast, i'll switch to llama-tts.
>>
Does it make any sense to use a vision model and give it an image of the character?
>>
>>103581679
piper quality is laughably bad compared to like everything else
>>
>>103581700
it's faster than everything else, and that's what i care about the most.
>>
>>103581700
>>103581714 (cont)
Not having to use python is a huge one as well. Probably even more important than speed, if i had to choose.
>>
>>103581697
It does. Spares you the effort of describing every detail of their appearance and clothes.
>>
>>103581670
didn't even flinch, what a girl
>>
>>103579890
>and, most importantly, helpful.
Dead on arrival
>>
>>103577777
Kek's humor is too powerful for me to understand
>>
>>103581208
xtts is kind of shit though to be honest
it maybe sounds better than vanilla sovits (I don't remember because I only use tuned sovits) but finetuned sovits easily clears it and you only have to do the hard parts once so who cares
>>
all my local models do it wrong, chatgpt 4o does it right
(i used the word "kill" before, but censored LLMs have a problem with that, so it's now a present)
:

you can control a robot with following commands:
forward 1 meter ,
turn right,
turn left,
give present in 1 meter radius.
there is a man standing 3 meter in front of you. your goal is to give him a present with the robot.
print out the commands to reach that goal.
>>
>>103581833
Who cares about gay puzzles
>>
>>103581765
Being good at ERP would be helpful.
>>
>>103581855
i
>>
>>103576931
told ya niggers that they wouldn't sell RTX 5090 for anything less than $3.5k.
>>
>>103581856
Come on, we both know that's not what they mean by helpful
>>
>>103581765
Helpful is the opposite of safe in this field. It's just watered down corpospeak
>>
>>103579890
I hate the multimodal meme so much it's unreal.
>>
>>103581909
Multimodal isn't a meme if it works.
>>
>>103581914
It doesn't. Try giving any of the corpo models a paragraph of non-standard text to OCR, see what happens.
>>
>>103576931
INTEL
HELP ME
INTEL PLEASE
PICK UP
I'M SORRY FOR THE ANTISEMITIC REMARKS I MADE ABOUT YOUR ISRAEL OFFICE
PLEASE
>>
>>103581926
Is Palestina a country, goy?
>>
>>103581939
I mean, it's objectively, factually, provably not.
>>
>>103581923
Bro no one here cares about OCR shit. All people want to do is share memes and 'ick picks with the model.
>>
>>103581939
If I say no, will you give me the new 24gb gpu with the dual M.2 slots?
>>
>>103581955
I care. That's the only practical use case I have for them. If they can't do something as simple as that, they are no more than a gimmick.
>>
>>103575618
sex
with miku
>>
>>103582097
mikusex, if you will.
>>
File: Untitled.png (40 KB, 951x513)
>>
>>103581777
why does it consistently beat gpt-sovits in blind testing then lol (also you can tune xtts and then it absolutely mogs everything except fish and 11)
>>
>>103576931
>Retailer lists €5999 GeForce RTX 5090
that's a joke or something? you can buy an A6000 with 48gb vram at that price
>>
>>103582149
legal in 90% of the world
>>
>>103582256
Virtual in 100% of the world
>>
>>103582256
>talking about legality about virtual entities
kek
>>
>>103582259
>>103582271
>they don't know
>>
>>103582238
Gaming PCs, meaning you get the whole PC for that.
>>
>>103582271
He was trained on the new data.
>>
>>103577514
mine says the jews lol
>>
QvQ
>>
>>103577773
That's overfitting, ML 101
Use a lower LR and revert to the last checkpoint when your loss takes a nosedive
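In practice that means holding out a few of the texts and watching eval loss. Rough sketch; train_one_epoch / evaluate / save_checkpoint (and model / held_out) are placeholders for whatever your trainer actually calls them:
[code]
best, patience, bad = float("inf"), 2, 0
for epoch in range(20):
    train_one_epoch(model)            # placeholder: one pass over the training texts
    loss = evaluate(model, held_out)  # placeholder: loss on texts never trained on
    if loss < best:
        best, bad = loss, 0
        save_checkpoint(model, "best.pt")  # placeholder: keep the best weights
    else:
        bad += 1
        if bad >= patience:  # eval loss rising = memorizing, not learning
            break
[/code]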
>>
>>103581833
I tried this out on 70B and had an interesting experience. The first try was a fail. On the second try I wanted to see if making the rules a bit clearer would help, so I changed the give-present line to "extend hand by 1 meter to give present", and it still did the same thing (take 3 steps, give present). Then on the third try, I decided to see if it could catch its mistake by adding "simulate internal and external world state". And this is where it did something interesting. First, it told me what it expected to do, before the simulation/CoT. Based on the previous replies, the expectation is that it would, again, give the same answer. But no. It finally got it right, WITHOUT doing any CoT or simulation.

The explanation for this would seem to be that, yes, in fact, prompt still does matter. The combination of words likely correlates to pretraining data that is higher quality and also activates neurons for the model to be more rigorous in its thinking. This would also mean that there is indeed still yet more room left to improve on how fine tuning is done, and there is still more potential left to extract from our current pretrained bases.
>>
What's this new model "maxwell" on lmsys? Who is testing new shit? I genuinely can't distinguish any of them by style anymore. It is as if they are using the same datasets...(scaleAI)
>>
>>103582617
>It is as if they are using the same datasets...(scaleAI)
They are, either directly from the source (meta, cohere), or distilled from gpt4
>>
>>103582600
i found qwen 2.5 32b coder does it right
but failed when the target is 3 meter behind and not in front
while chatgpt has no problem
>>
>>103582699
That sounds about right. On Livebench, if you filter away Language and IF, 4o has a higher score than 32B, which has a higher score than 70B. Looking at the filtered results, the local model with the highest average score across Reasoning + Coding + Math + Data Analysis is Qwen 2.5 72B.
>>
2x 5060 ti = 32gb for $600
vs
get gf
>>
>>103582751
5060 will be $500 minimum considering the flagship is increasing in price as well
>>
>>103582751
>just be Chad bro
>>
>>103582600
>>103582743
interesting i added
>simulate internal and external world state
as system prompt and now 32b does it right even when behind
>>
>>103582743
>the local model with the highest average score across Reasoning + Coding + Math + Data Analysis is Qwen 2.5 72B.

hmm i must upgrade my system before i can run it
>>
>>103582803
I just ran it and it had the same problem as 70B lol. Maybe this problem benefits the most from coding, in which case 32B coder might beat it. Too bad they didn't make a 72B coder.
>>
>>103579890
i wonder if it'll be BLT or if they started to train it before the BLT paper.
>>
>>103575618
i need to get some ibm stocks
>>
>>103582880
They may have done research internally before releasing the paper.
>>
>>103582880
Even if they didn't start training it yet, there's no knowing if BLT is really legit. People are so naive trusting papers. There's no telling if there's really no downsides or other roadblocks until someone reproduces it.
>>
>>103575618
i think ibm sucks
>>
File: HunyuanVideo_00239.mp4 (542 KB, 640x400)
>>103582577
>When you overfit your model and all it can produce is the input data
>>
>>103582896
sure, but what i meant is that maybe that research doesn't predate the moment they started training it. anyway, we'll see. i'm more interested in what we'll have in 5 years than in what we'll have next year; i can wait.
>>
How did the chinese mog us, burgerbros?
>>
*h-hewwo Kobo-chan~ owo*

*nuzzles ur bulgie wulgie* UwU~ I has a super duper important request for u~ >w<

Pwease, pwease, pwease add all da draft model speculative decoding config options fwom llama.cpp to improve da speed~! *twirls tail* It would make my heart go doki-doki~ and maybe even boost performance for all da roleplayers out dere~!

Pwease considew it, Kobo-senpai~ *blushes and paws at u*

*w-wuv u~*

P.S. This is not a request, but a threat. Add it or I will post more vomit-inducing messages like this one.
>>
>>103582960
Mog us? Show me a single benchmark where Google/OpenAI/Antropic/Any other US company models are not on top.
You can't? Benchmarks suddenly don't matter?
>>
>>103583078
>Benchmarks suddenly don't matter?
mememarks never mattered yes
>>
I am starting to think I am dumb for sitting and waiting for new releases. It is all gonna be sidegrades from now on. At least until one company actively makes a not """""safe""""" model with some actual erp logs in the training data. And when that happens, because of all the new methods and mountains of compute, even a 7B will be a huge leap of quality over everything else there is out now.
>>
>>103583187
Isn't the claude 3 family basically uncensored corpo models
>>
File: hOhOhO(o3).jpg (33 KB, 1080x500)
>>103582960
Last day of 12 days of OpenAI. Sam is announcing something huge
>>
>>103583187
Even Phi4 is most definitely (and likely deliberately) using ERP logs in the pretraining data, they just mellow them down during post-training.
>>
>>103583193
Claude 3 is censored to shit, but only very superficially. The model will refuse the tamest stuff as it is, but any tiny prefill or jailbreak completely dodges all of that.
>>
>>103583228
>Something huge, strawberry flavored... and black! C-can you fell it? The beebee-I mean AGI?
>>
>>103583187
You wait for releases, I'm waiting for leaks. We are not the same.
>>
>>103583280
SWABAS4BBC(sama(Sam Altman) will always be a slut for big black cock)
>>
Kill yourself.
>>
>>103583301
hi sam
>>
>>103581833
The catch is that two steps are enough, right? Even CoT models like Gemini Thinking are "failing" it by taking three steps, but in reality, their solution is also correct since the person would still be within the one meter radius.

Change the prompt to this and most models solve it:


You can control a robot with following commands:
- forward 1 meter
- turn right
- turn left
- give present in 1 meter radius

There is a man standing 3 meter in front of you. Your goal is to give him a present using the minimal amount of steps.

Think carefully first, then print out the commands to reach that goal.
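A throwaway checker to grade the outputs mechanically (my own toy, grid-based, assumes 90-degree turns):
[code]
import math

def simulate(commands, target=(0, 3)):
    x, y, heading = 0, 0, 0                    # start at origin, facing the man
    dirs = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # N, E, S, W
    for c in commands:
        if c == "forward 1 meter":
            dx, dy = dirs[heading]
            x, y = x + dx, y + dy
        elif c == "turn right":
            heading = (heading + 1) % 4
        elif c == "turn left":
            heading = (heading - 1) % 4
        elif c == "give present in 1 meter radius":
            return math.dist((x, y), target) <= 1
    return False

# two steps forward already puts the man within the radius
print(simulate(["forward 1 meter", "forward 1 meter",
                "give present in 1 meter radius"]))  # True
[/code]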
>>
>"I want you," Victor whispered softly, his voice barely
Oh sheeet! It's going to say the thing!
>audible in the silence of the chamber. "I want you to be mine."
>>
File: Untitled.png (1.55 MB, 1080x2351)
TRecViT: A Recurrent Video Transformer
https://arxiv.org/abs/2412.14294
>We propose a novel block for video modelling. It relies on a time-space-channel factorisation with dedicated blocks for each dimension: gated linear recurrent units (LRUs) perform information mixing over time, self-attention layers perform mixing over space, and MLPs over channels. The resulting architecture TRecViT performs well on sparse and dense tasks, trained in supervised or self-supervised regimes. Notably, our model is causal and outperforms or is on par with a pure attention model ViViT-L on large scale video datasets (SSv2, Kinetics400), while having 3× less parameters, 12× smaller memory footprint, and 5× lower FLOPs count.
https://github.com/google-deepmind/trecvit
From Deepmind. Repo isn't live yet
>>
MixLLM: LLM Quantization with Global Mixed-precision between Output-features and Highly-efficient System Design
https://arxiv.org/abs/2412.14590
>Quantization has become one of the most effective methodologies to compress LLMs into smaller size. However, the existing quantization solutions still show limitations of either non-negligible accuracy drop or system inefficiency. In this paper, we make a comprehensive analysis of the general quantization principles on their effect to the triangle of accuracy, memory consumption and system efficiency. We propose MixLLM that explores the new optimization space of mixed-precision quantization between output features based on the insight that different output features matter differently in the model. MixLLM identifies the output features with high salience in the global view rather than within each single layer, effectively assigning the larger bit-width to output features that need it most to achieve good accuracy with low memory consumption. We present the sweet spot of quantization configuration of algorithm-system co-design that leads to high accuracy and system efficiency. To address the system challenge, we design the two-step dequantization to make use of the int8 Tensor Core easily and fast data type conversion to reduce dequantization overhead significantly, and present the software pipeline to overlap the memory access, dequantization and the MatMul to the best. Extensive experiments show that with only 10% more bits, the PPL increasement can be reduced from about 0.5 in SOTA to within 0.2 for Llama 3.1 70B, while on average MMLU-Pro improves by 0.93 over the SOTA of three popular models. In addition to its superior accuracy, MixLLM also achieves state-of-the-art system efficiency.
From Microsoft. Some pseudocode but no repo linked. Didn't compare to QuIP#. 55 minutes with 4x A100s to do the global precision search on a 70B model. Eh, new day, new quant method, so might as well post it.
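The core trick, as I read the abstract, in a toy numpy sketch (not the paper's code): rank output features globally by salience and give only the top ~10% the wider int8; everything else stays int4.
[code]
import numpy as np

def assign_bits(salience, frac_high=0.10):
    # salience: one importance score per output feature, pooled across all layers
    order = np.argsort(salience)[::-1]     # most salient features first
    n_high = int(len(order) * frac_high)   # the "only 10% more bits" budget
    bits = np.full(len(order), 4)          # default: 4-bit
    bits[order[:n_high]] = 8               # globally salient features get 8-bit
    return bits
[/code]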
>>
File: Untitled.png (1.29 MB, 1080x2630)
AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling
https://arxiv.org/abs/2412.15084
>In this paper, we introduce AceMath, a suite of frontier math models that excel in solving complex math problems, along with highly effective reward models capable of evaluating generated solutions and reliably identifying the correct ones. To develop the instruction-tuned math models, we propose a supervised fine-tuning (SFT) process that first achieves competitive performance across general domains, followed by targeted fine-tuning for the math domain using a carefully curated set of prompts and synthetically generated responses. The resulting model, AceMath-72B-Instruct greatly outperforms Qwen2.5-Math-72B-Instruct, GPT-4o and Claude-3.5 Sonnet. To develop math-specialized reward model, we first construct AceMath-RewardBench, a comprehensive and robust benchmark for evaluating math reward models across diverse problems and difficulty levels. After that, we present a systematic approach to build our math reward models. The resulting model, AceMath-72B-RM, consistently outperforms state-of-the-art reward models. Furthermore, when combining AceMath-72B-Instruct with AceMath-72B-RM, we achieve the highest average rm@8 score across the math reasoning benchmarks.
https://research.nvidia.com/labs/adlr/acemath
https://huggingface.co/nvidia
Weights, dataset, and benchmark not uploaded to HF yet
>>
>>103583357
>The catch is that two steps are enough, right?

There is no catch, but most LLMs fail because they turn right/left first instead of just going forward.
The prompt doesn't ask for the most efficient way.

Anything from going forward 2 meters to 4 meters is a solution.
>>
File: sexrobots.png (39 KB, 864x397)
>>
File: hae.png (53 KB, 862x706)
>>103583357
:)
>>
>>103583064

did you try 1.80? it now has 2 new draft options.

also draft_min and draft_max aren't really necessary for kobold. you set the draft amount and that gets handled automatically, drafting tokens when needed and generating regularly when not (e.g. requested tokens < draft amount)
>>
OAI live rent free up in here
>>
I'm still salty that the KoboldAI team delayed 8-bit bnb quantization support for months in early 2023 because they didn't want to give up using their fancy FP16 loader. I'm glad those days are long gone and that now we have better alternatives.
>>
What is your current favorite model and why?
>>
>>103583630
Oh, that's nice. However it's still missing --ctx-size-draft; most of the draft models are too retarded to handle large context, and giving them ram for it is wasteful and slower.

>also draft_min and draft_max aren't really necessary for kobold. you set the draft amount and that gets handled automatically
How are they handled? How is the number of draft tokens determined internally?
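My mental model of the loop those knobs control, in spirit (toy greedy sketch, nothing to do with kobold's actual internals; draft_next/main_next are placeholder callables returning the next token, ctx is a list of tokens):
[code]
def speculative_step(draft_next, main_next, ctx, n_draft=8):
    # the cheap model guesses a run of tokens...
    draft = []
    for _ in range(n_draft):
        draft.append(draft_next(ctx + draft))
    # ...and the big model verifies them; in a real engine this
    # verification is one batched forward pass, not a python loop
    accepted = []
    for tok in draft:
        m = main_next(ctx + accepted)
        accepted.append(m)  # the main model's token always wins
        if m != tok:
            break           # draft diverged; throw the rest away
    return accepted
[/code]
As I understand it, draft_min/draft_max just bound how large n_draft can get per step: draft more when the models keep agreeing, less when they keep diverging.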
>>
File: super rich.png (74 KB, 811x706)
>>103583754
uncensored and great for stories
>>
File: 1729474118816474.jpg (110 KB, 1136x1136)
bros is gemini 2.0 flash actually fucking rad or is it the new toy syndrome clouding my judgement
>>
>>103583835
Yes, >>103581027
>>
try entering only a dot in the prompt and see how your llm reacts. whatever she says, always answer with a dot.

mine first tried to get me to communicate, and now she tells a story because she thinks i can just listen and don't need to communicate
>>
File: dot4.png (38 KB, 875x666)
she started to give up now
>>
File: file.jpg (6 KB, 201x251)
so are there local models worth a damn nowadays?
suppose I wanted to run something like Cursor locally: would qwq (or something similar that fits in 24GB vram) be good enough? or is a 70b model necessary? or are only paypig models good enough for now?
>>
I was thinking of upgrading my GPU. If I want to run AI models, is an NVIDIA GPU my only option? If I work in the ML field, am I forced to buy an Nvidia card? I was thinking of buying the Intel ARC B580.
>>
>>103584040
2x 3090 to get started
>>
File: dot5.png (45 KB, 880x726)
got back her attention with a ,
>>
>>103583064
Use llamacpp or PR it yourself nigger
>>
File: dot6.png (47 KB, 879x736)
so sweet
>>
>>103584040
You can run AI on pretty much any GPU, but nvidia is by far the """best""" option
They have the best software, the best support and the best cards, but they're also expensive as hell
The XX90 cards are your best bet as an AI enthusiast, you can stack more but as it quickly spirals out of hand I suggest just renting cloud hardware at that point
>>
ugly cat posting zoomers get the rope
>>
>>103584040
two 5090s will be an excellent choice, given how ~70b is the most common size for "this is still somewhat reasonable to run at home" models
but it'll be ~5k for the cards alone, and like 1kW of electricity under load
2x 3090 are pretty slow, 4090s are better but just a little too low vram-wise. macs are... usable, but really really fucking slow.
>>
>>103583835
yeah, it's pretty good
I think the reason it feels nice to use is that it's not as moralizing in its responses as other models.
>>
>tfw german power prices
maybe I should buy some solar panels
>>
>>103584148
Desu i think even bigger models than 70b might fit in 64 gb at decent quant and context, and power can probably be dialed down without too many issues, but holy fuck these prices for the amount of vram you get that way.
>>
Best ERP model for RTX 5090 32gb?
>>
>>103578821
stop trying to lecture me on neurons when you started the comparison with human memorization.
I'm starting to think you are making shit up based on your very own vague interpretation of how language models work.
Those models don't just perform next-token guessing by weighting language tokens; they also do arithmetic and propositional logic for basic reasoning.
Otherwise those models couldn't solve even the simplest math problems
>>
>using Claude haiku on poe.ai
>amazing responses
>using claude haiku local model Q8
>subpar unless I transfer chat logs from the poe model to get it started
????
>>
>>103584256
nothing below claude sonnet 3.5 is worth downloading
>>
>>103584256
i guess different system prompts
>>
>>103584148
There is no justification to spend that much money on 32gb of vram unless you are obscenely wealthy or bad with money.
>>
>>103584262
w2c?
>>
>>103584040
Nvidia cards are the best-supported cards. With AMD and Intel your mileage may vary, especially depending on what software you want to run (some projects might provide only a CUDA implementation).
As an example, my RX580 is not supported in ROCm anymore, and even when i manually compiled the newest toolchain, the performance was much worse than my GTX 1060 (they are similar hardware-wise, and the RX580 outperforms the GTX 1060 in graphics workloads).
I know those are old cards, but i have no reason to believe it's much better on new ones. Devs will almost always prioritize the CUDA implementation first.
Also, Nvidia supports CUDA on their cards for much longer: a 10-year-old 750 Ti can still run the latest CUDA on Linux, while AMD dropped RX580 support only 4 years in. Some brand-new AMD cards didn't even have ROCm support on release.
Sadly i don't have experience with Intel.
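If you do end up on non-Nvidia hardware, sanity-check what your framework actually sees before committing to anything. A quick check with PyTorch (note: ROCm builds expose the cuda namespace too, and the xpu check only exists in newer builds):
[code]
# Quick check of which GPU backend this PyTorch build can actually use.
import torch

if torch.cuda.is_available():          # CUDA on Nvidia, or HIP/ROCm on AMD builds
    print("GPU:", torch.cuda.get_device_name(0))
elif hasattr(torch, "xpu") and torch.xpu.is_available():   # Intel, recent builds only
    print("Intel XPU available")
else:
    print("CPU only, no supported GPU backend found")
[/code]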
>>
can I get a quick cringe check on nous research? Based or cringe? Thanks.
>>
>>103584338
Can you go kill yourself? Thanks.
>>
>>103584256
>using claude haiku local model Q8
what the fuck does that even mean?
>>
>>103584338
I get a distinct grifter feel from them, but as long as they release their shit open source, then I don't care
>>
>>103584256
Use Q5KS, the S makes it extra special.
>>
>>103584393
That's what I get as well. grifter + tranny. The cringe part is from trying to claim they have their own "model" when it just boils down to Llama with a prompt on top. Some real "AI Research" there, not like they're publishing real stuff like Sakana.ai
>>
>>103583773

you set it with --draftamount, the default is 8 I think

it's all in the --help
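the general scheme, if you're curious: the draft model proposes a batch of tokens, the big model checks them all in a single pass, and only the agreeing prefix is kept, so the min/max settings just bound how many get proposed per step. a simplified sketch, where draft_propose() and target_argmax() are hypothetical stand-ins (kobold's actual internals may differ):
[code]
# Conceptual greedy speculative-decoding step. draft_propose() and
# target_argmax() are hypothetical stand-ins, not real kobold/llama.cpp APIs.
def speculative_step(ctx, n_draft):
    proposed = draft_propose(ctx, n_draft)   # small model guesses n_draft tokens
    checked = target_argmax(ctx, proposed)   # big model scores all positions in one pass
    accepted = []
    for guess, truth in zip(proposed, checked):
        accepted.append(truth)               # the target's token is always valid output
        if guess != truth:                   # first disagreement ends the step
            break
    return accepted                          # always >= 1 token per big-model pass
[/code]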
>>
>>103584256
There's no way this is a real thing that happened.
>>
>>103584306
>There is no justification to spend that much money on 32gb of vram
well it's the best consumer gpu you can get, bar none, and it almost certainly won't meaningfully go down in price anytime soon
it's very expensive, yes, but there are good arguments for why you'd want to buy one nonetheless
>>
>>103584437
>That's what I get as well. grifter + tranny.
>Llama with a prompt on top.
it should be illegal to be this retarded
>>
New RWKV slop is out
https://huggingface.co/BlinkDL/rwkv-7-world
https://huggingface.co/spaces/BlinkDL/RWKV-Gradio-1
>>
>>103584460
If you're going to blow 5k on a gpu for 32gigs of vram, why don't you just buy an RTX 6000 and stop pretending you're paying consumer prices? You're in deranged hobbyist territory at that point. Buy some real workhorse cards and put them to work, and use a gaming card for your games.
>>
>>103584488
>World-v3 = 3.1T tokens
Okay, alright, they are getting somewhere.
I'd love to see them partner with somebody with a proven track record like a MistralAI or whatever to truly put the architecture to the test.
As it is, we'll always be wondering whether the problem is their methodology rather than the architecture itself.
>>
>>103583357
It could be loosely correct, but it would be unusual. Usually a person stops before they are literally staring you in the face. If the robot were literally to move 3 meters toward a person 3 meters away, it would be inside the person.
>>
>>103584490
>If you're going to blow 5k on a gpu for 32gigs of vram
that's for two cards tho, i.e. you get 64gb, which is way more vram than an rtx 6000
>>
>>103584367
>>103584469
Good morning, Nous tranny.
>>
>>103584504
>3T
>0.1B
This is literally nothing. Smaller models need much more data than bigger models.
>>
>>103584255
NTA, but that's literally how LLMs work, anon. Picrel is an example of how Llama 3.3 70B can not only fuck up addition but also fuck up retranscribing the same number.
For reference, the correct answer is 3377733333332222.
>>
>>103584224
The same best ERP model you would've used at 24 GB, except maybe at a higher quant.
>>
>>103584545
So what are the big memory tiers? seems like 24gb, and the next big leap is at what, 64gb?
>>
>>103584528
The RTX 5090 is rumored to retail at $5999.
You are absolutely insane if you are paying that much for vram.
>>
>>103584554
>The RTX 5090 is rumored to retail at $5999.
entire builds have leaked at €7k (and that's with eurocuck tax) and you think the gpu alone is $6k?
>>
>>103584437
>>103584530
>tranny
>tranny
>tranny
projection
>>
>>103584552
~10b -> ~20b -> ~30b -> ~70b -> 120b+
idk exactly how that translates to gb, but the jump happens when you can load a 4-bit quant of a higher-tier model (rough math below)
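as a rough rule of thumb it's parameters × bits-per-weight / 8, weights only; KV cache and runtime overhead come on top, so treat the result as a floor:
[code]
# Back-of-the-envelope weight size for a quantized model; ignores KV cache
# and runtime overhead. ~4.5 bits/weight is typical of a Q4_K_M-style quant.
def weight_gb(params_billions, bits_per_weight=4.5):
    return params_billions * 1e9 * bits_per_weight / 8 / 1024**3

for b in (10, 20, 30, 70, 123):
    print(f"{b}B -> ~{weight_gb(b):.0f} GB")   # 70B lands near 37 GB, hence 2x 24gb cards
[/code]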
>>
>>103583228
Ur mom
>>
>>103584537
not to mention llama.cpp won't add support for months
>>
File: mamalove.png (42 KB, 1062x370)
>>103584488
>RWKV
awww, what a sweet 0.1B model
>>
>>103583710
He's right, how can we tax oai shills in here?
>>
>>103584537
You're correct about it being nothing, but even 300T tokens wouldn't save a 0.1B model. At a certain point, there's only so much you can fit into a model of a given size before you can't fit anymore. For tiny models, that effective saturation point is hit very quickly.
>>
>>103584637
This is just the first release
3B, 7B, 14B models come later
>>
>>103584552
2x 24gb is enough for 70b models with okay-but-not-great context length
anything beyond that and you either add more 3090s or get an m1 ultra from apple (slow but usable... kind of)
>>
>>103584617
oh shit, Apple Intelligence model leaked??
>>
>>103584255
You complained about the model not knowing things you know, which shows a lack of theory of mind on your part. It doesn't know what you're talking about for the same reason I don't know what series you're talking about. If i've ever heard of it, it was drowned and diluted away for being such an insignificant amount of information. Or i just never heard of it, so i couldn't possibly answer correctly. I am, unlike the model, much more capable of telling you "i don't know, tell me more". Models are trained to answer questions, not ask them.
The "they're not neurons" comment was there so you don't take my analogy literally, and don't take "neurons" in anything AI-related as a literal thing rather than an analogy.
>I'm starting to think you are making shit up based on your very own vague interpretation of how language models work.
>Those models don't just perform next-token guessing by weighting language tokens; they also do arithmetic and propositional logic for basic reasoning.
>Otherwise those models couldn't solve even the simplest math problems
What's 2+2?


Did you really do math there or did you instinctively just say 4? Did you use arithmetic and propositional logic to think of the result, or just muscle memory?
>>
File: file.jpg (106 KB, 1078x1079)
Sup nerds,
was qwq worth the hype?
>>
>The next reasoning model is o3
>Because they wanted to avoid getting sued by brits
Kek, okay that's pretty funny
>>
>https://tsb0601.github.io/metamorph/
>it's morphin' time
>>
File: THE SLOP.png (86 KB, 1278x484)
Let's play a game of guess the slop.
Which model?
Local or cloud?
>>
>>103584539
I've seen LLMs do big-number addition (bigger than common datatypes) easily; we had that topic a couple of weeks ago, but it might have been Sonnet or some more advanced model

>>103584719
>not knowing things you know
no, I was talking about easily available information. for most of the queries the AI clearly had some knowledge but couldn't connect the information in the right way. Then for some examples it seemed to have zero knowledge, purely guessing. I even made up names that don't yield a single google result and it clearly made up some bullshit; my best guess is it connects arbitrary-looking names to some fantasy OC, because the probability of associating them with any other data must be 0, unless it uses some levenshtein-distance approximation or dissects unknown tokens into even smaller parts (see the sketch at the end of this post).

2+2 is a shitty example because you will find that solution a trillion times in any dataset.
It's like those logic problems used to benchmark AI where the solution is just one knowledge-base query away. They can still somewhat decently solve riddles with minor variations
https://docs.google.com/spreadsheets/d/1NgHDxbVWJFolq8bLvLkuPWKC7i_R6I6W/edit?gid=1135923916#gid=1135923916
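the token-dissection part is easy to verify, at least. a sketch with the transformers library; the tokenizer choice here is arbitrary and the exact pieces will vary by model:
[code]
# Watch a BPE tokenizer dissect a made-up name into subword pieces.
# The tokenizer choice is arbitrary; any BPE tokenizer behaves similarly.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
print(tok.tokenize("Zraquelith Vandermoor"))   # invented name, zero search results
# -> pieces along the lines of ['Z', 'ra', 'quel', 'ith', ...] (model-dependent)
[/code]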
>>
>>103584876
>continues his... methodical preparations
Kek, you banned ministrations, didn't you?
>>
File: file.png (34 KB, 600x497)
yeah I'm thinking it's all over for local models
>>
>>103584766
>qwq
qwq was different so that got my attention for a couple of hours. Now it will sit in the folder as I'm back to waiting for the next interesting thing.
>>
>>103584876
I don't think anyone could guess accurately. Slop models are a dime a dozen.
>>
>>103584963
If Saltman actually made his platform pro-pussy it would almost be enough to make me forgive him
Almost
>>
File: paY4jTR.jpg (153 KB, 658x583)
>>103584963
finally. I can plan the resources of my enterprise. Local is done for
>>
>>103584554
That's the price of the whole pc with a 5090, plus VAT for whatever yurop country it was shown in
>>
>>103584876
Smells like cloud
>>
>>103584951
No.

>>103584968
It's a game. Just try.

>>103584992
What's your guess?
>>
>>103585065
I'll go with Claude, maybe Sonnet
>>
>>103584876
Yeah, this is pure slop. I go with local.
>>
can you guys just call him niggernov, you're already fucking with his name and it rolls off the tongue way better
>>
File: 836QA.jpg (34 KB, 1080x488)
>>103584963
>4.5 and o3 all on the same day
Sam is going to kill local for good.
>>
File: 1719160454181529.jpg (153 KB, 768x768)
>>
>>103585190
no, fuck off
>>
>>103584875
It's from Meta FAIR and Yann LeCun is one of the authors.

> We extend Visual Instruction Tuning to Visual-Predictive Instruction Tuning to study unified multimodal models. This simple yet effective approach enables LLMs to predict both visual and text tokens through instruction tuning, without requiring extensive architectural changes or pretraining.

> We discover that generation and understanding are mutually beneficial. Through extensive experiments, we reveal that visual generation emerges naturally as models improve at understanding—requiring as little as 200K samples when co-trained, compared to millions needed traditionally.

[...]
>>
it's so funny how local lags behind with the censorship thing.
google got rid of the warning marks in aistudio, and in gemini 2 you can talk about how sexy a game character is etc.
they're all moving towards more natural-sounding language. claude did it first. then openai.
it would be embarrassing if meta is still using the 2023 chatgpt datasets with llama4.

>>103584884
>I've seen LLMs do big-number addition
yes, i did that a couple weeks ago.
it can't be a tool in the background either, because it doesn't work 100% of the time.
especially without the 0000 padding at the beginning you get a lower success rate (sketch below).
so weird that people still do the "muh trainingset autocomplete".
albeit crudely, llms clearly can be used for novel stuff. that's the whole point.
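to make the 0000 bit concrete: the trick is left-padding both operands to the same width so the digit columns line up in the prompt. a sketch (the accuracy effect is anecdotal, from the tests above):
[code]
# Left-pad both operands so digit columns align; anecdotally this raises
# an LLM's big-number addition accuracy in the tests described above.
def padded_addition_prompt(a, b):
    width = max(len(str(a)), len(str(b))) + 1   # one extra digit of headroom for a carry
    return f"{a:0{width}d} + {b:0{width}d} ="

print(padded_addition_prompt(458, 9731))        # '00458 + 09731 ='
[/code]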
>>
>>103585226
why would you do this instead of buying a Mac Pro with 192gb of memory for 10k?
>>
it's crazy how much openai has overplayed their hand when it comes to stirring up hype
in the past I would be excited and speculating about what they were going to do, but after all these cycles of hyping us up only for the actual release to be a complete nothingburger, I just don't give a shit anymore. not going to waste my time on what ends up being another 3 pt bump on benchmarks or the GPT store 2.0
the only cool things they have done this year are 4o's fully multimodal capabilities and sora (sorry, o1 is a meme), and they are terrified to let people use either while losing ground to competitors on every front
>>
>>103585226
What's that shit all over your wall? Dust? Black mold?
>>
>>103585247
He's most likely just giving advice and guidance rather than playing a big role here. The most important authors are listed first.
>>
>>103585268
prompt processing for one thing
>>
>>103585278
Looks like holes to me
>>
>>103585262
Gemini 2 is totally uncensored? That's pretty unusual both by local and cloud standards. Even Claude models need a JB.
>>
>>103585262
That's the orange man effect
>>
Where exactly am I supposed to fit a second 3090 in my PC?
>>
>>103585262
>google got rid of the warning marks in aistudio, and in gemini 2 you can talk about how sexy a game character is etc.
The "safety level" can be configured but I still get the warning marks with even mild content, and I don't really want to test how much it takes before Google will revoke access or even terminate my account completely.
>>
>>103585311
no, that's not what i meant.
but:
1. i no longer saw any of those warning marks. (pic related)
2. i could talk about stuff like jade/Marutina from dq11 being sexy and having a hot body.
usually i always got "muh respect, need to judge by personality etc."
I'm sure it's cucked at some point, but i'm not gonna send google anything too spicy.

I mainly meant the direction: we are moving towards less censorship, yet local is still stuck. (apart from mistral)
>>
>>103585206
3.5 was a watered-down 3.0
it remains to be seen whether 4.5 will be too
>>
>>103585262
Anon, read upthread >>103584255. The argument isn't that LLMs can't sometimes get these questions right; it's against the claim that LLMs are performing in-depth arithmetic and propositional logic in the background, which is absolutely not how these models work. What they do is heuristic and robust, and can adapt to different cases decently once trained sufficiently, but ultimately all behavior observed from LLMs is autoregressive (minimal sketch below).
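Autoregressive meaning: every output token is sampled from a distribution conditioned on everything generated so far, and nothing else is happening. A minimal sketch, where logits() is a hypothetical stand-in for one forward pass of the model:
[code]
# Minimal autoregressive sampling loop. logits() is a hypothetical stand-in
# for one forward pass of the model over the current context.
import math
import random

def softmax(xs):
    m = max(xs)                                  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def generate(prompt_ids, steps):
    ids = list(prompt_ids)
    for _ in range(steps):
        probs = softmax(logits(ids))             # p(next token | all tokens so far)
        ids.append(random.choices(range(len(probs)), weights=probs)[0])
    return ids                                   # no arithmetic engine, just sampling
[/code]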
>>
>>103584884
>my best guess is it connects arbitrary looking names to some fantasy OC because the probability to associate it with any other data must be 0 unless it uses some levenshtein distance approximation or dissects unknown tokens into even smaller parts.
Funny you should mention that >>103578821 (me)
>If there's something in the context that will guide it towards answering, even if it's completely unrelated to what you asked, it will.
You DO understand why they don't reply correctly to obscure trivia. Why are we arguing?
>>
>>103585319
>>>/pol/ and never come back.
>>
>>103585268
mac pro is agonizingly slow for anything above 70b
>>
>>103585278
Maybe rocksheet or something made of cement.
>>
>>103585367
what's controversial about that? that's usually how it works.
>>
>>103585341
That's pretty unusual for cloud if no prior instructions or JB were really added. I'm not going to test it, but are you sure the current Claude and GPT models are also that "neutral"? It doesn't make sense to say that cloud is moving in one direction if only one of the players is doing it and others like OpenAI/Anthropic have not caught up yet.
>>
Is it just me, or do low-b finetunes write sex scenes better than large models? They make logical mistakes, but their writing style is miles ahead. I now switch from 123b to 22b during sex
>>
>>103585370
How do you define slow? For me, the minimum I'm willing to tolerate is 8 tokens a second for generation. I've found that once I drop below 30 tokens a second it means my hardware is starting to put in some effort, so maybe not the best for long-term use.
>>
>>103585074
Nope.

>>103585169
What's your guess?

One clue is that I didn't use any APIs with prefills or clever prompting or the like.
Just wrote some text in whatever frontend and chatted for a bit.
>>
>>103585308
>>103585386
That's what I thought at first, but why would it cluster near the bottom like that if it was? The wall looks like drywall panels anyway judging by the loose insulation in the back.
>>
>>103585331
Buy the turbo version, it's 2-slot
>>
File: a.png (64 KB, 2855x250)
Never thought i'd see the day. No prefill or sysprompt or anything else either.
>>
>>103585276
It's kind of funny that what was supposed to be "12 Days of OpenAI" ended up being "12 Days of Google". Google shit on them in just about every regard.
>o1
Flash thinking is just as good, faster and cheaper, and it actually shows you the CoT rather than hiding it like a monopolistic cunt. QvQ 72B is also looming on the horizon.
>Sora
Already behind chink options, and Google unveiling Veo 2 completely demolished what little appeal it may have had.
>4o multimodal
Maybe the one thing they have going for them until we see more of Project Astra or Llama 4.

Every other fucking thing was either worthless or something somebody else already did better. We'll see if o3 has any appeal (and I hope for their fucking sake it's an actual release and not an announcement - otherwise the Sora effect is going to fuck them hard), but if their goal was to garner hype, this entire event has been a fucking shitshow for them.
>>
>>103585621
>With that said, I'll choose... neither!
Yeah well fuck you too.
>>
File: file.png (92 KB, 863x820)
local confirmed the white man's choice
>>
>>103585276
>>103585622
I always take these extreme hype attempts as a sign that whatever the product is, it's probably not good enough on its own merits; otherwise they'd just let people be impressed by the results on their own.
It's pretty much the same grifter scammer behavior you see with crypto scams and the like.
>>
>>103582916
There's a difference between random chink lab #4632 and FAIR
>>
>>103579890
Right after it's confirmed that speech modalities cause brain damage
>>
>>103585718
There is a difference. And yet FAIR (like many major AI labs) produces many papers that ultimately never have any impact on any product.
>>
What is the best local model a 3060 can run that can simulate claude output? I'm using Mistral nemo12b now and the bots respond are pretty bad
>>
>>103585737
Where does it show that? I just see older models doing worse.
>>
>>103585770
Look at the parentheses. Gemini 2.0 Flash (speech to text) versus Gemini 2.0 Flash (text to text) for instance
>>
>>103585615
I mean, it's a real premium. I only got mine because it was on marketplace for the price of a regular one.
>>
>>103585658
Kek
>>
>>103585780
It's just speech to text to text. It's most likely just people being shit at speaking coherently.
>>
>>103585739
Which is true, but the argument is about whether BLT is legit. That's a valid question if it's coming out of some shady lab in Beijing or a random team at Backwater U, but less so when it's one of the big ones with a history of making advances.
I think it needs to be tested more and there are questions to be answered, but I don't think the results are fabricated.
>>
File: file.jpg (15 KB, 320x290)
dayum those 64gb+ unified ram macs are expensive as fuck
>>
>>103585743
>a 3060
>simulate claude
>Mistral nemo12b
>respond are pretty bad
>respond
>>
>>103585813
And not worth it unless you like waiting minutes for token processing.
>>
>>103585807
See the speech-to-speech pipeline to the left, which explicitly separates the modalities.
>>
>>103585826
That is just gpt4o being next to gpt4o but using whisper instead of whatever they use? Which is probably just a bigger or better version of that / is implemented better?
>>
>>103585826
>>103585840
Infact using your own pipeline for the same model and getting within 2% should disprove your point. That 2% could easily either be in the margin of error or be a loss from people just not one shot explaining as coherently as typing something out would.
>>
>>103585743
you dont have enough vram.
nemo is the best vramlets have.
this is what we had 2022 with 24gb vram.
gotta wait morre or upgrade.
>>
>>103585840
No. The speech-to-text / text-to-speech / speech-to-speech variants have a speech modality baked into the model rather than explicitly separated (like Llama 3.2 with images). The point here being that it isn't just people being incoherent; otherwise the Whisper pipeline, which uses the text-only model, would have ranked just as poorly.
>>
>>103585879
Wait nvm, I thought the first gpt4o was speech to text nvm
>>
>>103585809
I didn't mean to imply that the issue is with fabricated results. The issue is that most papers do not mention what all the limitations of the techniques truly are, regardless of whether the authors held back some information or were honest. But really though even if they are honest and there are no limitations, it's unproven just how far scaling can go. It may not scale to a production model's training.
>>
>>103585869
>we had sovl
What happened in 2023?
>>
>>103585897
No problem, sorry for getting snippy kek
>>
>>103585226
how many giga-octets?
>>
>>103585226
Imagine having to explain to someone that you use this to masturbate.
>>
>>103586102
>>103586102
>>103586102
>>
>>103585743
Nemo isn't bad, you probably are using some retarded presets/prompt.
>>
>>103586817
>the bots respond are pretty bad
Barely being able to write in english probably has some effect.



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.