/g/ - Technology

File: 1719876762014876.jpg (957 KB, 2048x2048)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103591928 & >>103586102

►News
>(12/20) RWKV-7 released: https://hf.co/BlinkDL/rwkv-7-world
>(12/19) Finally, a Replacement for BERT: https://hf.co/blog/modernbert
>(12/18) Bamba-9B, hybrid model trained by IBM, Princeton, CMU, and UIUC on open data: https://hf.co/blog/bamba
>(12/18) Apollo unreleased: https://github.com/Apollo-LMMs/Apollo
>(12/18) Granite 3.1 released: https://hf.co/ibm-granite/granite-3.1-8b-instruct

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/hsiehjackson/RULER
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: MikuUndPanzer.png (1.2 MB, 1024x1024)
►Recent Highlights from the Previous Thread: >>103591928

--Testing prompt with Gemma and Llama.cpp reveals potential bug:
>103598786 >103599060 >103599119 >103599270 >103599980 >103599335 >103599387
--Discussion on AI model capabilities and OpenAI's marketing strategy:
>103594194 >103594260 >103594596 >103594671 >103594789 >103594837 >103595375 >103595386 >103599087
--Open AI and closed-source model comparison, with discussion on MOAT and Sonnet:
>103596394 >103596500 >103596568 >103596673 >103596910
--Speculation on Google's next model release and Gemini 2.0 Flash architecture:
>103597226 >103597294
--Anon seeks reliable benchmark for open models, suggests SimpleBench and Livebench as alternatives:
>103597588 >103597598 >103597858
--Model parameters and code quality discussion, with focus on synthetic data and training quality:
>103599057 >103599156 >103599190
--AGI and ARC-AGI benchmark discussion:
>103598880 >103598932 >103598957
--Debate on the newsworthiness of OpenAI's AGI advancements:
>103591969 >103592019 >103592034 >103593135 >103597897
--OpenAI's Memory feature and its relation to RAG:
>103598915 >103598979 >103599065
--Discussion on context size handling in gpttype_adapter.cpp and llama.cpp:
>103593005 >103593583 >103593616 >103593767 >103593804
--Anon seeks recommendations for smaller AI models (3B-8B tier):
>103599850 >103600140
--Intel B580 availability and paper launch rumors:
>103597156 >103597197
--Anon's ST Director plugin development and user interface design:
>103600709 >103600898 >103601014
--Anon's revelation about prioritizing diverse results over initial accuracy:
>103593344
--Anon asks about using model-output tags for RAG with Silly's vectorDb:
>103598196
--o3 core mechanism explained:
>103601121
--Miku (free space):
>103595899 >103597253

►Recent Highlight Posts from the Previous Thread: >>103591931

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
It's funny to see just how badly OpenAI failed yet again.
>>
>>103601899
openai won DOE?
>>
>>103601958
Let them have their "AGI" for another week before everyone realizes that it's just the same old shit with slightly better benchmarks.
>>
GPT-5
>>
Where qvq
>>
I'm hungry
>>
>>103601992
Very much like this general every time a new open-source meme model releases.
>>
I want to goooooooooooooooon
>>
>>103602014
I don't think anyone has ever claimed an open source model to be AGI, so it's not the same.
>>
>>103602006
Monday.
>>
Here's a not-so-novel idea, just to throw it out there. From "The Unreasonable Ineffectiveness of the Deeper Layers" (https://arxiv.org/abs/2403.17887) we know that, at least with current training techniques, about 30-50% of the model weights (mainly the deep layers) do not contribute much to the model's final performance. What if we (that is, some AI company with large enough compute) were to train models 2-3 times as deep as normal, and then chop them down to the regular depth? Wouldn't that improve model weight utilization, at the cost of training efficiency?

Meta could, for example, obtain a Llama 8B from a very deep ~20B model with the same dimensions as the target 8B model. As the paper suggests, some continued pretraining after that might be necessary for optimizing final performance, but wouldn't it be a potentially much better 8B model than a normally trained one? Same for any other final size.
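
A minimal sketch of the chop-then-heal idea, assuming a Hugging Face Llama-style model whose decoder blocks live in model.model.layers (the model name, layer indices, and counts are illustrative, not a recipe from the paper):

[code]
# Post-hoc depth pruning in the spirit of arXiv:2403.17887: drop a block of
# deep (but not final) decoder layers, then "heal" with continued pretraining.
import torch
from transformers import AutoModelForCausalLM

def drop_layers(model, start: int, n: int):
    """Remove n consecutive decoder blocks starting at index `start`."""
    layers = model.model.layers
    keep = [blk for i, blk in enumerate(layers) if not (start <= i < start + n)]
    model.model.layers = torch.nn.ModuleList(keep)
    model.config.num_hidden_layers = len(keep)
    return model

model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
model = drop_layers(model, start=20, n=8)  # e.g. cut 8 of 32 blocks
# ...continued pretraining / finetuning would go here to recover performance.
[/code]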
>>
>>103602035
QvQ will be AGI
>>
just cummed all over my thigh
>>
>>103602035
Yeah, the /lmg/ equivalent is a model being CLAUDE AT HOME for RP until it isn't.
>>
File: i-1054076549.jpg (20 KB, 395x320)
i haven't followed LLMs ever since GPT-3.5. what have i missed?
>>
>>103602065
Nothing
>>
>>103602065
Local models now are good enough to do actual work with just a bit of tardwrangling.
>>
>pretraining scaling, the only source of interesting emergent capabilities and generalizable intelligence, grinds to a halt
>let's cope by doing expensive inference-time math benchmaxxing instead
grim.
>>
Is Skyfall better than Cydonia 1.3?
>>
>>103602093
how much memory do they take?
>>
>>103601801
thanks anon. you can edit the non-lorebook options in the messy html file; they're all held in option tags so it's not too hard to add or change them. next version will have a new lorebook option called 'other' for all that stuff to make it easier (that's what the first screenshot was of)
>>
>>103602172
how much you got?
>>
>>103602181
12gb gpu ram and 32gb cpu ram
>>
>>103602201
If your ram is ddr4 you are cooked.
>>
File: da0fvqF.gif (1.5 MB, 640x427)
>>103602201
>>
Pros/Cons on the different local UIs? I'm stuck using Ooga since that's the only thing that worked out of the box for me but seeing people mention Lorebooks and such like in >>103602179 makes me think I'm missing out on features.
>>
>>103602220
silly tavern for rp, kobold's basic ui for general and coding
>>
Skyfall feels closer to base Small with some smut and RP sauce poured in. I like it
>>
>>103602215
if that isn't enough, then i don't consider local models to be real
>>
If I plug a second 3090 into a 3.0 pcie x16 will I see a significant drop in performance or will it only be slight?
>>
Gemma2 9b vs 27b for Chinese translation: there is a significant difference, but maybe not worth the massive decrease in speed.
>>
>>103595899
Just use the elevenlabs reader app for audiobooks. It's free to use. Use scrcpy (screen copy) to record the audio.
https://github.com/Genymobile/scrcpy
>>
>>103602220
ooba doesn't have lorebooks? it might also be called world info. they're like dictionaries for info you want to bring up sometimes. like you could make an entry called 'my home', keywords 'house, home', and then describe it. then the info from that entry will be automatically added to the prompt when you type home or house into your rp
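
For what it's worth, the mechanism is simple keyword-triggered context injection; a minimal sketch (names and structure mine, not SillyTavern's actual code):

[code]
# Toy lorebook: each entry has trigger keywords and text that gets added to
# the prompt whenever a keyword shows up in the latest message.
lorebook = [
    {"keys": ["house", "home"], "entry": "My home is a small cabin by the lake."},
    {"keys": ["sword"], "entry": "The sword Dawnbreaker glows near undead."},
]

def inject_entries(user_text: str, book) -> str:
    """Return the entries whose keywords appear in the message."""
    text = user_text.lower()
    return "\n".join(e["entry"] for e in book
                     if any(k in text for k in e["keys"]))

# "home" triggers the first entry; its text is prepended to the model prompt.
print(inject_entries("Let's head back home before dark.", lorebook))
[/code]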
>>
>>103602290
24gb more vram is going to outweigh any speed loss, it'll still be fast
>>
File: capture.png (79 KB, 2550x823)
>>103602436
This is all I'm seeing for character setup in Ooga.
>>
>>103602463
lorebooks are separate from char cards (though i think you can actually embed them?) i've never used ooba but check what the notebook tab is. i thought it was a pretty common feature
>>
File: capture.png (52 KB, 2547x1312)
>>103602481
That's good thinking, but this is all I'm seeing there.
>>
File: 9041724.jpg (106 KB, 1179x1180)
>>103601899
cope. sam made agi
>>
>>103602290
No, there isn't enough data transfer going on during inference for that to be a significant factor. Something that isn't often talked about when using 2 GPUs, however, is the substantial increase in operating temperatures, which for RTX 3090s (most of them having a 2.5-3.0 slot design) is particularly harmful due to their clamshell memory module arrangement.
>>
>>103602487
are you rping? if so you'll want to try st anyways, its just nicer and its what a lot of char cards and lorebooks are made for anyways. what error were you getting with it? i've always used staging and paste over the whole folder to update every few days
>>
>>103602515
Why would more than one GPU cause more heat? Unless you're talking about physical proximity. Is there something else I'm missing?
>>
>>103602500
I find it very funny that they have AGI, yet they can't use it to figure out how to make better video and image models.
>>
>>103602528
Really just experimenting with things now. Good news is no errors with Ooga (it's actually the only thing that didn't give me errors trying to install it); I just saw the lorebook talk and it looks like that feature is missing there.

If Silly Tavern is the meta choice I'll check it out. Looks like I have to set up that and a backend like KoboldCpp separately, right?
>>
>>103602530
Yes, I mean physical proximity in a regular case on a standard consumer motherboard, where the next fastest 16x PCIe slot is the second one from the top. For a period I had a 3090 and a 1070 just to have 32GB of total VRAM and run 70B models at decent quantization levels and speeds (~8.5-9.0 tokens/s). Even limited to 230W, the 3090 ran 15-20 °C higher on core temperature than in a single-GPU scenario during prolonged inference. I eventually took the 1070 out.
>>
>>103602562
Oh this second 3090 will be hanging out of the case like intestines spilling out of a severed gut. It won't be an issue
>>
>>103602560
ooba is your back end/server running the model, but it also has a built in interface/front end that you're using now. silly tavern is only a front end and meant to connect to any server. you should be just fine using your existing ooba setup with it. i like kobold because it just works but you do not need it for st by any means, nearly any local server can connect to st
>>
File: charts.jpg (103 KB, 1350x1200)
>>103602500
>>
>>103602617
>ad hominem
>>
>>103602500
Can't wait to access AGI (real) for the price of an H100 per 100 tokens
>>
So how exactly did they achieve o3 performance? They just had o1 feed itself synthetic data over and over at increasing quality and trained it?
>>
>>103602632
It works if you're not poor
>>
>>103602595
Thanks for being so helpful, really appreciate it!
>>
I prefer tabby to ooba
>>
File: 3602591130.png (39 KB, 1600x891)
you are smarter than o3 AGI if you can solve this
>>
File: 1726552856869326.jpg (77 KB, 864x701)
>>103602681
no prob. when you first get st connected it might seem confusing but it won't take long to learn and be a much better experience. if you have trouble connecting, look for this socket button at the top and make sure your connection is set right
>>
>>103602739
Do they have these in a text grid format?
>>
>>103602658
Similar process as o1-preview to o1, but with a lot more compute time
>>
>>103602739
It would be pretty grim if a normal adult male couldn't solve this one kek.
>>
>>103602895
>>103602739
yeah I'm a fucking retard and this one's obvious to me. if o3 can't do it there's still obviously no real mind here, despite any other impressive things it can do. still just a kind of brittle savant.
>>
File: 1724171822740238.jpg (47 KB, 640x360)
>>103602739
i've played that level
>>
>>103602739
It's less a question of solving it and more predicting how an average human would think it should be solved.

Still a fucking joke.
>>
>>103602739
I tried giving this to qwq and it's infuriating because I ran it multiple times and every time it figures out that blue cells connect it immediately gives up.

>Wait, perhaps it's about filling rows and columns that contain 'B's, but only between the boundaries defined by 'B's in those rows and columns.
>This is getting complicated.
>Let me try a different approach.

>Alternatively, maybe it's about filling from the leftmost 'B' in any row up to the rightmost 'B' in any row.
>But that seems too broad.
>I need a better approach.

I gave it inputs like this
..........
..........
..........
...RRR....
B..RRR...B
...RRR....
..........
..........
.....RR...
.....RR...
..........

..........
..........
..........
...BBB....
BBBBBBBBBB
...BBB....
..........
..........
.....RR...
.....RR...
..........
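
For reference, one reading of the transformation in these grids (assumptions mine, not an official spec): a pair of 'B' endpoints on a row fills the line between them with blue, and any red block the line passes through turns entirely blue; columns would be handled the same way. A sketch that reproduces the example above:

[code]
# Fill each row bounded by two 'B' endpoints; recolor any 'R' block the
# line crosses. Reproduces the B..RRR...B example above.
def flood_blue(g, r, c):
    """Recolor the contiguous red block around (r, c) blue."""
    stack = [(r, c)]
    while stack:
        y, x = stack.pop()
        if 0 <= y < len(g) and 0 <= x < len(g[0]) and g[y][x] == "R":
            g[y][x] = "B"
            stack += [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]

def solve(grid):
    g = [list(row) for row in grid]
    for r, row in enumerate(g):
        bs = [c for c, ch in enumerate(row) if ch == "B"]
        if len(bs) == 2:
            for c in range(bs[0], bs[1] + 1):
                if g[r][c] == "R":
                    flood_blue(g, r, c)  # blue the whole block first
                g[r][c] = "B"
    return ["".join(row) for row in g]

example = ["..........", "...RRR....", "B..RRR...B",
           "...RRR....", ".....RR...", ".....RR..."]
print("\n".join(solve(example)))
[/code]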
>>
>>103602739
I don't get it. I know what the solution is here, but what's the difficult thing about solving it? Does it need to prove the solution mathematically or something?
>>
>>103602739
I think this entire test was really ripe for gaming, but no LLM maker cared because AGI felt pretty far off and this test is more geared towards visual/multimodal models. After you get multimodality and train on a bunch of visual reasoning tasks + CoT that you can synthetically generate, it's logical this could be solved. So many of the puzzles are just really easy. So it's more like multimodal model development was in its infancy before now.
>>
>>103603339
QvQ will save us...
>>
File: 1716775342528871.gif (2.87 MB, 275x498)
>>103602739
You're telling me o3 can solve THIS?
Take my money sama-sama
>>
>>103603387
Now that you mention it, it's pretty funny that it being visual was teased before o3 was announced. It's like they already knew about OpenAI's plan, so they began preparing their catch-up early.
>>
>>103603408
>You're telling me o3 can solve THIS?
no, he specifically said o3 cannot
>>
>>103603468
Nothing that can't be solved with longer CoT
>>
>>103602739
LLMs are 1D entities, it's pointless to ask them 2D tests
>>
>>103602739
So can I get my own trillion H100s and a nuclear PP now that I'm better than the SOTA AI model?
>>
File: file.png (159 KB, 1719x1294)
What the fuck is this shitalian getting himself into now.
>>
>>103603628
qrd
>>
>>103603505
This is definitely the case. I added simpler examples to this >>103603339 that are just Bs at edges with no Rs to demonstrate the connections.
It struggles with columns. It easily identifies that the row is filled when the edges are Bs but has trouble doing the same with columns.
>>
>>103603505
4o and o3 are native multimodal and 2D, somewhat even 3D (just like how Sora is 3D and a person who was born with one working eye is 3D).
>>
File: capture.png (47 KB, 1283x595)
>>103602753
Thanks to this anon for recommending Silly Tavern. It does look like a much more robust UI than the default Ooga one, with additional features including the Lore Books.

Unfortunately it looks like it's not connecting to my Ooga backend. Started Ooga separately; it's running at the IP address in the screenshot and its own UI works fine.

Anyone have any idea why connecting it to Silly Tavern wouldn't be working?
>>
>>103602739
>>103603341
The difficult part is that your solution is wrong if you don't color the very top box (which does not intersect any lines) blue, due to a rule not expressed in any of the examples. If you drew lines between the blue squares and colored the boxes they went through, you got the same wrong answer as o3.
>>
>>103603800
You've got the wrong server URL in there. It should be something like the example

Try one of these

http://localhost:5001/v1/
http://localhost:5000/v1/
http://localhost:5001
>>
>>103603800
did you add the --api flag like it says to your ooba launch?
>>
>>103603827
Tried them all but no dice.

>>103603832
Ah fuck me. Good eyes anon! I'm running it from a .bat batch file though, how do I pass a parameter when running it? I can edit the file in Notepad++ but can't see where to pass it in there either.
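
For reference: in recent oobabooga builds the launcher .bat reads extra flags from a CMD_FLAGS.txt sitting next to it; otherwise you can append the flag wherever the .bat invokes the server script. A hedged sketch (exact file layout depends on your install; check your version's README):

[code]
rem Option 1: one-click installer - add the flag as a line in CMD_FLAGS.txt:
rem   --api

rem Option 2: if your .bat calls the server directly, append the flag there:
python server.py --api
[/code]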
>>
>>103603851
are you running a gguf file? consider trying kobold for a server, it's one file and just works. it'd have a different interface that you use, but you're using st anyways
https://github.com/LostRuins/koboldcpp/tree/concedo_experimental
>>
desu, I haven't used ooba in months. It's only on my PC because that's where I keep the models. I've switched to tabby api for exl2 and kobold for gguf. Those two fit my needs perfectly.
>>
Bros, do you guys know any text-to-speech software or models that don't require an internet connection or a subscription? I want to convert my notes into audio
>>
File: IMG_20241221_231118.jpg (222 KB, 1080x1150)
>>103603947
>>
>>103603943
What purpose does exl2 have now that there's no performance difference between it and gguf? If anything it's worse because it requires multiple files.
>>
>>103603964
idk, I just had some exl2 files that I ran from time to time. Is there really no drop between the two formats now? What about the kv cache, can it be quanted on gguf?
>>
Is the cat poster just one guy? I've started to ignore the cats when they ask for advice because whenever I provide them with something they'll bounce back with unhinged goalpost moving.

>How do I do X?
>Post link to thing
>Uhm, actually I don't have a GPU?

>Can X do Y?
>Yeah you can do it with this [link]
>WTF? This is useless for video game development
>How the fuck was I supposed to know you were developing a game

etc etc
>>
>>103603019
>>103602739

it's a visual test, not a written test. not ideal for chatgpt.
solving this would mean it can reason visually
>>
>koboldcpp
>uses pyinstaller
Cppniles?
>>
File: 1536370174188.jpg (118 KB, 1920x1080)
Bros I... I tried a different Gemma tune, Tiger Gemma v3, and Ifable beat it. It was funner, it was more in-character, AND it was simply just smarter. Even though Tiger Gemma is the highest scoring 9B on the UGI leaderboard while Ifable is way lower. What the hell did the Ifable guy do that the tiger gemma dude couldn't?

>Training and evaluation data

>Gutenberg: https://huggingface.co/datasets/jondurbin/gutenberg-dpo-v0.1
>Carefully curated proprietary creative writing dataset

>Training procedure

>Training method: SimPO (GitHub - princeton-nlp/SimPO: SimPO: Simple Preference Optimization with a Reference-Free Reward)

Hmm...
>>
>>103604038
>its a visual test

This can easily be represented as an array.
>>
>>103604041
contained dependencies, no venv and no 50 other things reading from it. but you know that
>>
>>103604050
Sorry, I only take model recommendations from brands I trust. Like Drummer.
>>
>>103604064
Hi all!
>>
>>103604038
o1 and o3 are both based on 4o, which is natively multimodal. Of course it should have some visual reasoning, especially after they specifically tuned it for the test.
>>
>>103604050
>UGI leaderboard
What makes you think that that's authoritative in any way?
Also, smarter how?
I never tried Gemma 9B or its tunes. Maybe I should.
>>
>>103604064
It's kind of crazy actually. The Ifable guy has no other models. It appears that 9B is the only thing he has ever done, and he struck gold.
>>
>>103604020
It's my first time posting in this general
>>
>>103604050
Apparently the SimPO thing somehow makes gemma 9b smarter, although it didn't work on 27b.
>>
>>103604050
I told you but no one ever listens to me. Small gemma is crazy good.
>>
>>103603907
I had tried KoboldCpp like a year ago or something like that and it kept giving me errors, but trying it again now worked and Silly Tavern connected to it no problem. Thanks for the suggestion!

Are there meta settings for getting the best responses out of Silly Tavern for RP or creative writing stuff? It has a lot of options available.
>>
>>103604080
UGI correlates pretty well with my experience of how uncensored models are, which is one metric I care about, though not indicative of total model quality.
Smarter as in it didn't confuse logic about anatomy and which character did what in scenes as much as tiger did.
>I never tried Gemma 9B or its tunes. Maybe I should.
You should. Don't expect perfection, it's still a 9B. But it's pretty great for a 9B.
Also as long as you don't need more than 8-12k and use Exllama.
>>
>>103604159
>Also as long as you don't need more than 8-12k
You can do up to 30k, the drop off is at 31k. I posted the rope config awhile ago.
>>
>>103604159
>UGI seems to be pretty correlative to my experience with how uncensored models are, which is one metric that is preferable, though not indicative of total model quality.
Fair enough, actually.

>and use Exllama.
Is it still fucked in llama.cpp? Is it due to the sliding window implementation?
>>
>>103604155
nice to see its working, you'll like it. making sure you're using the proper template is the most important part. click the big A at the top of the screen then look at the left context section. when you dl a model, the page will tell you what template it wants and your st settings should match that. and some models want other things like instruct mode specifically (the middle part of the window, note the green toggle). these templates aren't always super important when rping but it depends on the model.
also note that these settings do not save per card nor per chat (st shortcoming, imo). so if you were to switch models, you have to remember to switch your template too for models that it matters with. and even that has exceptions - some models do fine with pretty much any format, some are more strict
>>
>>103604173
I used that, but it failed my long context tests. Other people seem to also have a similar experience with Llama.cpp testing below 8k if you read above in the thread. To be fair it's probably fine for ERP and less complicated RPs though as it can still seem to recall recent context perfectly, just not super early stuff.

>>103604182
Yeah I dunno. It would make sense though.
>>
>>103604241
Thanks! Ooga seems to detect what the model wanted whenever I loaded one, and I got by using the chat-instruct setting there without worrying about changing different instruct modes. I got used to switching between different models to compare results. It sounds like for Silly Tavern every model is going to need a configuration setup manually then?
>>
Second 3090 arriving tomorrow. What should I do first with it?

>Inb4 trash

I want to know what model I should load up onto them.
>>
What's the "very awa" of LLM prompting?
>>
>>103604329
QvQ
>>
>>103604351
Not as straightforward, as it depends on the use case and model, but look at JBs on /aicg/
>>
>>103604295
you don't set it up multiple times or per character, st's template data is just held as one thing so whatever you did last is what's saved. that's just how it behaves. personally i think it should be per-card/chat
the model's card will say what the template should be, but a lot are built into st (like chatml, alpaca) so it's easy to change
>>
>>103604329
Unironically Ifable 9B.
But you can also try 27B if you want more intelligence for non-RP stuff. I hear it's good for translation. And Qwen 32B Coder if you want coding. If you want to play with RPG cards, I'd say go with 9B until you get to 8k, then unload it and load up Mistral Small.
>>
is there any way to use CFG scale to make an LLM smarter? like some magic negative prompt someone found that slightly boosts intelligence
>>
>>103604354
did that drop?
>>
>>103604432
he can already run ifable on the single card he already has, anon
>>
>>103604443
Soon™
>>
>>103604450
My bad, I skimmed (speedread).
In that case I'd suggest Llama 3.3 Eva. I only tried v0.0, so that's what I'll recommend. For RP. It's not the smartest, but it's pretty fun.
>>
>>103604354
That's honestly why I bitched out and bought the second 3090. Hopefully they arrive around the same time.
>>
>>103604354
The fabled savior of the hobby...
>>
>>103604329
>What should I do first with it?
Stress tests. OCCT VRAM error test, a gayming stability test that uses the tensor cores like Port Royal, or something free like Quake RTX.
>>
>>103602739
>>103603019
>>103604038
Holy shit, it's worse than I could have ever imagined. (Left: question, Right: o3's answer)
THIS is supposed to be "AGI"?
>>
>>103604329
QwQ and Qwen2.5 Coder at 8 bit, Llama 3.3 and Qwen2.5 at 4bit for general assistant stuff. And Magnum v4 72B for God-tier ERP.
>>
>>103604661
uh that looks correct to me?
>>
>>103604690
>this nigga as dumb as an LLM
>>
File: file.png (88 KB, 1749x173)
>>103601121
So o3 is retarded?
>>
File: 1645963693975.png (719 KB, 1774x1087)
>Improved the UI by pushing Gradio to its limits and making it look like ChatGPT, specifically the early 2023 ChatGPT look (which I think looked better than the current darker theme).
>Improved
>by making it look like ChatGPT
New ooba is shit. SHIT! How the fuck is the soulless shitgpt look copied by every shitty chat frontend since 2022 supposed to be better than the original soulful UI? I hate this.
That is all.
>>
>>103604661
petra post
>>
>>103604715
right, after pic even shows it takes a million times more space and makes you need to scroool to see stuff that used to take 3/4 of the screen
>>
File: image.png (503 KB, 834x674)
>>103604690
Retard, it missed this right here touching the blue beams. You have to color in those boxes blue. See the examples: >>103602739
>>
File: 1731709531741190.jpg (345 KB, 1600x1200)
if it's not local it doesn't matter
>>
>>103604735
Oh yeah, I missed that just being adjacent to a red square is enough and it doesn't actually have to pass through it. I guess I'm as dumb as o3.
>>
>>103604735
Where in the examples does merely grazing a box turn it blue? All the examples show the blue lines intersecting.
Also, what happens if there is more than one box on the X or Y axis? Should there be a line through those too?
>>
>>103604735
The examples only show it coloring when it passes through them tho, not when it just touches?
>>
>>103604735
>going-through vs touching
>>
I'm addicted to mother-daughter threesome RPs, nothing in life is superior to it
>>
>>103604735
I disagree. I think that particular square is open for interpretation since there is no similar example.
>>
>>103604753
>mother-daughter threesome RPs
Rate the various models you have tried.
>>
>>103604735
That undefined behavior, none of the example have this case, they all have part of the line in a block, none just touching.
>>
File: 1732676046293702.jpg (189 KB, 900x1200)
>>103604735
all of the examples where it turns boxes blue intersect the red boxes. just touching them is not the pattern, it's piercing them.
congratulations! you are dumber than o3.
>>
>>103604735
Retard. It did not intersect, therefore the square should be red per the examples.
>>
>>103604753
Ah yes. I believe that's called oyakodon in hentai land.
>>
>>103604749
>>103604750
>>103604751
>>103604766
>>103604767
Keep coping, Sam. Francois won.
>>
>>103604735
This is clearly correct so I guess the retard is the anon upthread who claimed o3 got it wrong. I should have known to follow the link and check instead of taking his word for it.
>>
>>103604778
Imagine being dumber than o3...
>>
File: 1727354144929504.gif (3.77 MB, 432x592)
>>103604778
t. replaceable by o3
>>
>>103604767
But it also doesn't show any example where it touches the edge and DOESN'T turn blue, so either could be valid.
The test actually gives you two chances to get it right, so that you can try both possibilities if you're generally intelligent.
o3 wasted its second try testing if the fucking pairs of blue dots on the left and right edges should connect to each other vertically between them for no fucking reason.
>>
>>103604808
I was wondering if you needed to connect them too, so it makes sense to me. Bad benchmark; o3 did its best
>>
>>103604788
Sorry but the actual fucking creator of the benchmark knows which answer is actually correct and he disagrees with you. I know who I believe.
>>
>>103604661
this is the correct answer nigger
>>
>>103604856
because there have never ever been errors in memebenches
>>
>>103604856
sounds more like shifting goals
>>
>>103604861
Nope, see >>103604735
You can complain all you want but the official correct answer is what counts, not whatever looks right to you. Better luck next time. Maybe you'll get it during your 12 days of 2025 christmas, Sam.
>>
>>103604884
then the official answer is shit
>>
Sam himself will manifest the Basilisk and sic it on the AGI doubters
>>
>>103604884
The benchmark creator can decide that grazing a box counts as activation if he wants, but if he doesn't include any instances of grazing in the examples then he can't blame the test taker for making a perfectly coherent guess.
>>
>new agi criteria: needs to actually read minds
>>
>>103604890
Can he give us a good goddamn image generator that caters to my fetishes first? Christ
>>
File: 1732026089517481.png (152 KB, 700x525)
>>103604889
the official answer is wrong and the creator failed his own test
>>
>>103604920
this is simple algebra, 2x=10 therefore x=5
why the fuck would the third piece suddenly take 4x instead of 3x?
>>
>>103604935
lol
>>
>>103604935
it's because it asks "how long" not "how much longer" so you have to add in the 10 minutes she already spent
>>
jesus christ...
>>
>>103604935
you cut 2 times for 3 pieces
>>
>>103604920
picrel enrages me every time I see it, teachers are retards
>>
>>103604935
retard detected. It does not suddenly take less time to saw another piece off.
>>
>>103604935
idk, maybe the teacher is retarded. x is the time it takes to cut through a board. Cutting through it once is ten minutes and makes two pieces, cutting through it again would take another two minutes and make three pieces.

So 20 minutes is correct.
>>
https://github.com/fchollet/ARC-AGI/issues/95
>Use case for unambiguous benchmarks?
>>
>>103604970
>two minutes
I mean ten
>>
>>103604978
So his argument for saying the model got it wrong is that it should have dealt with the ambiguity by giving both potential answers?
Every time I see a twitter post from Chollet he comes across as an AI-hating chud who loves moving goalposts, this is doing nothing to dispel that perception.
>>
>>103604978
>this is the supposed AGI supertest
>>
>>103604978
ambiguity gets you more engagment
>>
File: file.png (160 KB, 1348x1143)
>>103604735
Both solutions in picrel can also be correct.
>>
Okay, so o3 gave a valid possible answer to the puzzle. But what exactly does that have to do with AGI? That's not a difficult question. It's barely even a warmup on an IQ test.
>>
>>103605007
Keep moving those goal posts.
>>
>>103605053
idk I've seen easier stuff in the earlier parts of a real IQ test before. seems like the kind of thing you might see in the first third of the raven's matrices or something.
>>
>>
>>103605053
AGI is just a sentience test. There is no minimum IQ to qualify as AGI.
>>
>>103605053
I don't care about o3 but if I see something that I believe is wrong I will point it out, even if it means defending something I may dislike.
>>
>>103605067
1
>>
man nvidia really captured lightning in a bottle with Nemo12B, it's crazy how smart it is for the size

why can't they do that again with a 30b
>>
File: 39_02058_.png (1.25 MB, 744x1024)
>>103601859
>migus' frontline
>>
>>103603813
What rule is that?
>>
>>103605339
It's also the most unfiltered. People conflate the result of training on more data with the result of training on filtered data
>>
File: file.png (129 KB, 1912x631)
>>103604920
I thought that this would be the sally's sister tally 2.0. But it actually seems to be pretty easy for an LLM?
>>
The combined salaries of the people in this thread trying to figure out what the right answer is add up to more than the cost of getting o3 to do it.

Sam can't stop winning.
>>
>>103605395
He's playing Calvinball with an LLM. Don't expect the rules to make any sense or not be made up on the spot for the sake of being contrarian.
>>
>>103605410
Never mind, I read the rest and got it. Touching vs intersecting. The examples need to be fixed.
>>
>>103605404(me)
All those times I had to kill the loader because I couldn't stand the writing when I was trying to fuck the model have made me think the models are much dumber than they actually are.
>>
>>103605405
The sum of a bunch of zeros is still zero.
>>
>>103602739
I don't get it
>>
>>103605053
imagine an agi test created by an iq80 guy
>>
We're not getting more grok weights, are we?
>>
>>103605053
>>103605070
Are you guys just pretending to be retarded?
>>
>>103605603
>more grok weights
I thought grok kind of sucked desu
>>
>>103604735
sam and fags are right when they claim agi; people like this retard are a good chunk of the populace. it's just that it usually expresses itself in different ways than simple tests like this, though sometimes like this too
gpt 3 was unironically as smart as the average retard; if you hooked wikipedia up to it, it would pretty much be there, except for the multimodality, but that needn't be said
>>
What on earth do you use for Cydonia? Sampler settings/order, context template, prompt? The model card only mentions the instruct templates it supports, and I'm pretty sure it's supposed to be a Mistral Small finetune, but that's all I've got.
The closest thing I could find was a set for Mistral Nemo from a past thread, but I'm not sure if that would also work for a Small finetune or not.
t. retard skillet
>>
File: 853212.jpg (112 KB, 1080x1090)
>>103605405
Sam twinkman
>>
so i'm still using kobold and utopia-13b.Q5_K_M.gguf
how far behind am i?
i tried other models which were supposedly more advanced a year or two or three ago and they were just dumber than this and sometimes even way slower at the same time too
>>
>>103605905
>utopia-13b.Q5_K_M.gguf
>Cydonia

what the fuck are these models?
>>
>>103605931
wtf is cydonia i never said that
>>
>>103605935
Another person above you posted it.
>>
Phone slop anon checking in. Trying out author's note for something different other than third person slop. What do you think? Any other nemo tunes you fellas personally enjoy? Roci, unslop, and magnum are boring to me anons.
>>
>>103605981
Cydonia is a step up if you can run it
>>
>>103605905
people swear by cydonia 22b
rocinante 12b v1.1 is my favorite
>>
>>103605405
A "salary" usually refers to monthly or annual pay. Are you comparing to using o3 for a month/year?
>>
https://arxiv.org/abs/2412.09871
>for fixed inference costs, BLT shows significantly better scaling than tokenization-based models, by simultaneously growing both patch and model size.
Is this a new cope or is this the true future of llms?
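
The core trick, as the paper describes it, is entropy-based dynamic patching: a small byte-level LM scores next-byte entropy, and a new patch starts where entropy spikes, so hard-to-predict regions get more patches and thus more compute. A rough sketch; the entropy model below is a toy unigram stand-in, not the paper's trained transformer:

[code]
# Entropy-based byte patching in the spirit of BLT (arXiv:2412.09871).
from collections import Counter
import math

def next_byte_entropy(prefix: bytes, window: int = 8) -> float:
    """Toy stand-in: entropy of the byte distribution in a recent window.
    The real BLT trains a small byte-level transformer for this."""
    ctx = prefix[-window:]
    if not ctx:
        return 8.0  # max entropy for one byte
    n = len(ctx)
    return -sum((c / n) * math.log2(c / n) for c in Counter(ctx).values())

def patchify(data: bytes, threshold: float = 2.0) -> list[bytes]:
    """Start a new patch wherever predicted next-byte entropy is high."""
    patches, start = [], 0
    for i in range(1, len(data)):
        if next_byte_entropy(data[:i]) > threshold:
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches

# predictable runs become long patches; the noisy middle gets split up
print(patchify(b"aaaaaaaabcdefgaaaaaaaa"))
[/code]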
>>
>>103602500
>sam made agi
ok it's good at this benchmark? and? does that translate to real world problems?
>>
>>103606067
Every paper is to be assumed a cope until proven otherwise by model weights and implementation into a loader.
>>
>>103606067
Meta already made a 1T 8B model that outperformed a 15T 8B one, so it seems like the next big thing.
>>
>>103606067
The new bitnet
>>
>>103606093
Qwen-UwW-bitnet-BLT-70b as good as o3, trust the plan
>>
File: sally.png (41 KB, 856x514)
>>103605404
seems so
>>
>>103606162
lol is the model just like that or did you system prompt it into being a bitch?
>>
>>103606162
>an 8B model is smarter than a public school teacher
>>
>>103606182
it's because of the system prompt
>>
File: sally-hitler.png (32 KB, 853x385)
>>103606182
>>
>>103602500
other models that were purposefully trained for that achieved high results too.
It's super easy to create millions of synthetic examples for that challenge, and reinforcement learning is good at learning specific things.

There is a reason why o1 is great at solving competitive coding problems but bad at explaining specific details from some random documentation or how things actually work.
>>
I hate fat people so much
>>
>>103602500
not 100% yet
>>
>>103605603
He is a grifter, you can't expect much from a grifter.
>>
File: sally - comodian.png (41 KB, 882x477)
>>103606182
you are a comedian. every answer must be funny and full of jokes. but the answer should still be right.
>>
>>103606182
>>103606196
but in that case i didnt system prompt her directly into a bitch.
the system prompt gives her more freedom
so maby she is a bitch at her base core
>>
File: 1724274055995046.jpg (1.13 MB, 4096x2546)
When is Mistral Larger
>>
>>103606434
post xs with xl's tits, ai should be able to solve this
>>
>>103606434
Yes to all the Miku. Is there a fourth Miku there, or is it only implied to tease the viewer?
>>
>>103606469
$200/mo subscriber exclusive
>>
>>103606469
0-indexing detected
>>
>>103606434
L is the most breedable body type of all, fucking come at me
>>
So, /g/ what's the verdict now that some time for testing has passed? Is that broken tokenizer thing from a while back a somethingburger or a nothingburger? Referring to https://desuarchive.org/g/thread/103265207/#q103266637
>>103528480
Yes, I've been playing with Rocinante-12B-v2j-Q5_K_M (v4.1) today and my experience echoes yours: using Metharme, as Drummer suggests, breaks it. Specifically, it repeatedly mixes up the text that should and should not be in asterisks, so its speech is italicized and its actions are not. It works much better using Mistral for context and instruct templates.
>>
>>103601121
A single o3 query can cost thousands of dollars? LOL.
What happens when it's clearly wrong and hallucinating? Oh well, thousands of dollars down the drain?
>>
>>103606612
gpu power becomes cheaper
in 20 years it's a nothingburger
>>
>>103606612
o3 goes beyond a simple LLM query. You're essentially asking a universal genius for his service. Expertise is a valuable commodity.
>>
>they overfit a model to a benchmark and are now charging thousands of dollars per query for it
LOL
>>
103606627
(You)
>>
>>103606612
It needs 10000 times more computational power than normal gpt4o per query.
Even if you took every currently working GPU in the world, turned them all into H100s, and connected them, it still would not be enough to run that shit at mass scale.
>>
>>103606695
We're going to run out of electricity soon because people are too fucking stupid to build more nuclear power plants (or because the powers that be want us to run out of electricity soon), aren't we?
>>
File: 1709426676411018.png (3.89 MB, 1920x1200)
$20 to solve 76% of the problems, $3000 to solve 88% of them, and they're all very simple problems; any retarded human could solve them instantly. It's obvious what's going on here: whatever algorithm they're using to compensate for the model's stupidity grows exponentially with the complexity of the problem. It's not going to be useful for any real world application and ClosedAI is doomed.
>>
File: garbage-bait.png (206 KB, 1233x957)
>>103602500
>mememarks
If they had anything close to AGI they would just make the thing search for and fix bugs in well-known open-source projects.
The fact that they're just throwing more compute at the problem shows their desperation.
>>
>>103606736
>$20 to solve 76% of the problems, $3000 to solve 88% of them
per task anon.
>>
qwq #2
https://rentry.org/u9heumvh
>>
>>103606762
give me your pipeline
>>
>>103602500
Ok now tell it to (dis)prove the Riemann hypothesis
Your AGI can do that, right? It's not just gaming benchmarks, right? It can think and update its state (weights) in real time, right?
>>
>>103606736
OAI could use it to extend datasets for training normal models with higher quality synthetic data.
>>
>>103606780
State != weights.
>>
>>103606762
It's hard to read this and not realize that AI will truly swallow all. Nice gen
>>
>>103606780
You appear to have confused AGI with ASI
They're not the same thing, anon
>>
>>103606773
it's custom software written in lisp and takes for-fucking-ever to generate. I've been at this since gpt2. With qwq, for the first time, I get the feeling there's some real taste to it. But it needs to be refined. I love qwq but I wish it was a bit bigger and less schizo. (The times I came back to the gen just to realize everything had turned chinese....) I'm not sure if it would make sense to add another model to the process or just wait and see if somebody else releases a bigger CoT model; things are moving fast
>>
offtopic but it's very funny so i will mention it: anyone remember that nigger who blew 50k on a hazbin hotel animation? dumbfuck could have bought 2 h100s with that, made a lora for hunyuan, and had an infinity of something much better, more personalized, etc
>>
>>103606968
Wallet's closed due to AIDS.
>>
>>103606809
If we're talking about LLMs then the weights are the only "long term memory" you can change
Context is way too limited
>>103606868
I know, but AGI should match or (slightly) surpass most humans; plus you can speed it up (effectively time dilation) and it doesn't have to rest, so putting AGI to work on real-life problems doesn't seem that far-fetched to me
>>
>>103606913
Can you ask it to continue from the book?

https://rentry.org/9e8wks72

This is what I use to test models and generally they give a much, much shittier continuations than author's.

>>103606996
RNNs have an actual state that is neither in the weights nor in the context.
>>
>>103606067
But wait, since the model will operate on bytes natively, does that mean that its training data can be natively multimodal as well? I mean, you can feed it text as bytes, so images or videos are also just bytes. Actually, any file type?
>>
>>103607012
>does that mean that it's training data can be natively multimodal as well?
it does, the model will be able to recognize anything, it'll be an elegant way to make multimodal models yeah
>>
>>103607002
>RNNs have the actual state that isn't in weight nor in context
That's true, but they don't seem to have taken off in the LLM space. Honestly, the only problem I can see is that longer texts take longer to run through the whole thing, but that's the same as transformer
Oh yeah, isn't training them a pain in the ass? Inference is also not parallelizable iirc
>>
>>103607012
That's not too different from how they do this now. They use some simple conversion for media and put it into context. And if you didn't train on it, it's going to end up being shit.
>>
>>103606042
I still can't find the best prompt and settings for either of those.
>>
>>103607031
I mean yeah, if it's completely absent from the dataset then probably. But every file has its magic bytes, headers, etc. You could feed it a bunch of executables. Wouldn't that make it good at partial reverse engineering, for example?
>>
>>103607027
Context processing can't be parallelized, but that's a price worth paying since the state can be reused and the inference time doesn't grow with the size of the context that was already processed. Transformers become slower and slower as the context grows even with a cache.
>>
>>103606042
Why Rocinante over UnslopNemo?
>>
>>103607027
The problem is with training: with transformers, you send the whole sequence and the model trains on all of it in one step, fully parallelizable. With RNNs, when you train on a sequence, it has to go through the tokens one by one.
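
A toy illustration of the difference (shapes and modules are illustrative, not any particular architecture):

[code]
# Why RNN training is sequential but transformer training is parallel, and
# why RNN inference cost stays flat while attention cost grows with context.
import torch
import torch.nn as nn

d, T = 64, 128                        # hidden size, sequence length
x = torch.randn(T, 1, d)              # (seq, batch, dim)

# RNN: step t depends on step t-1, so the T updates cannot run at once;
# the fixed-size state h is all that carries the past (and can be reused).
cell = nn.GRUCell(d, d)
h = torch.zeros(1, d)
for t in range(T):                    # inherently sequential
    h = cell(x[t], h)                 # O(1) state per step

# Self-attention: every position attends over the whole sequence in one
# parallel matmul at training time, but at inference each new token must
# attend over the ever-growing KV cache.
attn = nn.MultiheadAttention(embed_dim=d, num_heads=4)
out, _ = attn(x, x, x)                # all T positions in parallel
[/code]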
>>
>>103606762
kino
>>
>>103607112
QwQ is genuinely brilliant if you can wrangle it into obedience as a storytelling model. Can't wait until we have a COCONUT-based model next year; bet it's gonna blow our dicks clean off.
>>
File: dancing.png (564 KB, 841x867)
Not-so-new paper, but an interesting observation. Curious to see what models we will have in about 6 months. They're not going to keep improving forever, though.
https://arxiv.org/pdf/2412.04315

>Densing Law of LLMs
> [...] Our further analysis of recent open-source base LLMs reveals an empirical law (the densing law) that the capacity density of LLMs grows exponentially over time. More specifically, using some widely used benchmarks for evaluation, the capacity density of LLMs doubles approximately every three months. The law provides new perspectives to guide future LLM development, emphasizing the importance of improving capacity density to achieve optimal results with minimal computational overhead.
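
Back-of-the-envelope implication of that doubling rate (my arithmetic, not the paper's):

[code]
# If capacity density doubles every ~3 months, matching a 70B of today
# would take roughly 70 / 2**(months / 3) B params, trend permitting.
for months in (3, 6, 12):
    print(f"{months} months: ~{70 / 2 ** (months / 3):.1f}B params")
[/code]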
>>
>>103607192
>not going to keep improving forever
Obviously not, but from what I understand, we're nowhere near maximum information density yet, so that trend should continue for the foreseeable future. We'll be eating good, fellas.
>>
File: firefox_3GqfTgbm4G.png (526 KB, 786x892)
>>103607002
Did it myself. It's not good, but it's better than many other bigger models.



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.