/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102544848 & >>102535977

►News
>(09/25) Multimodal Llama 3.2 released: https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices
>(09/24) Llama-3.1-70B-instruct distilled to 51B: https://hf.co/nvidia/Llama-3_1-Nemotron-51B-Instruct
>(09/18) Qwen 2.5 released, trained on 18 trillion token dataset: https://qwenlm.github.io/blog/qwen2.5/
>(09/18) Llama 8B quantized to b1.58 through finetuning: https://hf.co/blog/1_58_llm_extreme_quantization
>(09/17) Mistral releases new 22B with 128k context and function calling: https://mistral.ai/news/september-24-release

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
►Recent Highlights from the Previous Thread: >>102544848

--Anon shares Meta Connect 2024 live stream, discusses AI model benchmarks and performance comparisons:
>102549246 >102549435 >102549452 >102549478 >102550206 >102551139 >102551152 >102551170 >102551175 >102551185 >102551199 >102551224 >102551307 >102551319 >102551358 >102549488 >102549558 >102549651 >102549675 >102549661 >102550386 >102550399 >102550753 >102550952
--Experimenting with high dropout rates for training LLMs and LoRA:
>102547870 >102548570 >102548715 >102548761 >102548927
--Fitting a parabola to a small dataset has limitations:
>102545783 >102546867 >102548118
--Discussion on creating an importance matrix using datasets or questions:
>102547955 >102548057 >102548144 >102548191 >102548324 >102548110
--Using AI to store data as images of hex text files:
>102545442 >102545538 >102545592 >102545728 >102546256
--Molmo model family discussion and benchmarking results:
>102547425 >102547538 >102548005 >102548030 >102549019 >102549105 >102549115 >102548045 >102548114 >102548228 >102548323 >102548786 >102551000 >102551077 >102551078 >102551092 >102551147 >102551169
--Llama 3.2 1B and 3B performance comparison, with Llama 3.2 1B outperforming in most categories:
>102549527 >102549588 >102549602
--Challenges and potential solutions for RPG games using LLMs:
>102545841 >102545941 >102546792 >102547036 >102547055 >102547186 >102547242 >102547753 >102548038 >102548180 >102548534
--MIMO project discussion, potential local use and relevance for vtubers:
>102548365 >102548390
--Llama.cpp may add Jinja parser, but some argue it's bloat:
>102549141 >102549192
--Agents in LLMs - benefits, challenges, and potential improvements:
>102545041 >102545116 >102545137 >102545307 >102545340 >102545440 >102545523 >102545690 >102545205 >102545101
--Miku (free space):
>102545307 >102548921 >102550127

►Recent Highlight Posts from the Previous Thread: >>102535999

https://rentry.org/lmg-recap-script
>>102552020
can't even find an uncensored 3.1 and they already have 3.2
Now that there's an official version of Llama with multimodal, will Llama.cpp finally give multimodal first class support? It is called Llama.cpp isn't it?
Where's Molmo OP?
>>102542933
>>102552003
>--batch-size and --ubatch-size
The former seems to be merely cosmetic, or maybe it matters for multi-GPU, but only --ubatch-size seems to matter on my machine. A smaller batch size means slower prompt processing (past a certain point) and more space for layers, so maybe vary that with -ngl, or just keep it at a minimum.
>>102552037
dumb bot can't quote messages properly
>>102552037
MY FREE (you)S NOOOOO
Any locals on Opus level yet? Not being mean, just wondering. Locals are what got me into AI, and I got a PC that can now run more than 20B, so I wanna see how the local side of things is.
>>102552073sucks to suck!
>>102552037
>>102552067
I guess the recap should include a blurb about why the quotes look like that and why that rentry to the script is necessary.
>>102552073
*headpat*
>almost 2025
>still no AGI
Wtf is taking so goddamn long?
>>102552075
llama 405 is barely competing with og gpt4, so no
>>102552037
Damn, so this is the "quality" you get from Llama 3.2...
>>102552099
When will he finally rename it to ClosedAI?
>>102552099
Didn't he tell Congress that one of the reasons OpenAI is safe is that he has no personal equity and did not take a for-profit approach to it? That's gone out the window.
>>102552047
Alright, better not waste my drive space then.
>>102552065
Ok, I'll try these out.
>>102552100
Jeez, they're on 405B now? What does it even take to run that locally?
>>102552162
downloading ram
>>102552162
datacenters and cpumaxxers (at 1 t/s, kek)
>>102552135
you have a point. I truly believe their ship is sinking and Sam is pocketing the money before leaving for good
>>102552100
But og gpt4 was the best. Every update just made it dumber.
Molmo could be the greatest thing since sliced bread and I wouldn't care, because they didn't publish a base text continuation model
With all the progress being made, is running locally with an RTX 3060 12GB and 64GB of RAM enough for something at the level of a gpt-4o equivalent? And to get answers somewhat fast, without waiting minutes.
I want to use it to automate some stuff at home and at work (writing contract proposals from emails, giving instructions to some contractors, and solving basic questions about contracts through WhatsApp... things like that)
>>102552162
9x3090 for 4bpw
>>102552182
opus is better for creative stuff, and llama 405 is nowhere near it for that
>>102552162
If you just want to run it, you can do so at 1 token per several minutes by using your storage as working memory/swap.
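For a sense of scale, here's a back-of-envelope sketch of what "storage as working memory" costs. The 4.5 bits-per-weight and 3 GB/s NVMe read figures are illustrative assumptions, not measurements; the point is that every generated token has to stream roughly the whole weight file off disk once.

```python
# Back-of-envelope for running a 405B model off disk: each generated token
# reads (roughly) the entire weight file past the CPU once, so generation
# speed is bounded by disk read bandwidth.

def model_size_gb(params_b: float, bits_per_weight: float) -> float:
    """Approximate quantized model file size in GB."""
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

def seconds_per_token(size_gb: float, disk_gb_s: float) -> float:
    """Time to stream the full weight file once per token."""
    return size_gb / disk_gb_s

size = model_size_gb(405, 4.5)      # ~228 GB at roughly Q4
spt = seconds_per_token(size, 3.0)  # ~76 s/token from a 3 GB/s NVMe
print(f"{size:.0f} GB file, ~{spt:.0f} s/token")
```

Slower drives, random access patterns, and OS swap overhead push this well past a minute per token, which matches the "1 token per several minutes" experience.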
>>102552195
>RTX 3060 12GB and 64GB of RAM enough to run something at the same level of a gpt-4o equivalent?
meta is comparing 3.2 90b with 4o-mini (see OP pic), make of that what you will
Can we run Molmo 72b locally yet?
Well I have a basic inferencing script set up now for 90B that will load it in 4-bit and execute exactly 1 prompt. It's taking a very long time to massage the prompt for obvious reasons.
>>102552195
probably, but setting all that up sounds like more work than you'd be saving with automation
If OpenAI never existed, where would Local models be today? Would a different company have kicked off the whole AI craze if it wasn't OpenAI, or would the whole field have been delayed for a few more years or never kicked off at all?
>>102542933
usecublas mmq 0 sometimes makes a big difference for me compared to usecublas normal 0
>>102552272
AI Dungeon existed first, but their devs were/are incompetent college grads
>>102552240
It's a new architecture, so prepare to wait a couple of days
Now that the dust has settled, verdict on 90B?
>>102552307
>>102552221
>>102552305
I thought it was a Qwen 2 finetune?
>>102552283
mmq is the default for llama.cpp now, I'm pretty sure, thanks to cudadev's optimizations. At least the pre-compiled binaries come with mmq enabled.
>>102552320
There are two versions for the 7B at least: one that's their own sauce, and another that's qwen.
i can't fucking wait to work on enterprise resource planning with a 3b miku. 7b left me no room for any context
>>102552307
Literally the same thing as Llama 3.1, except with extra params for the vision stuff. If you don't care about vision, then there is nothing different about it.
One thing I can say for sure is that 90B is censored as fuck when used properly.
>>102552305
I'm a bit surprised that they can't automate "new" architectures. I mean, they're all transformer models, so patterns can be found.
>>102552349
Good thing I use models improperly.
so would this new 90b vision model be any good for batch-generating captions for a flux lora dataset, and if so, where do i start?
>>102552399
InternVL-40B would probably do much better for that; I've seen a few posts mentioning it being good and uncensored
>>102552399
3.2 is censored, so build your setup and then wait for finetunes
Alright, it was a complete hackjob, but I managed to simulate the Nala test with 90B (this is loaded in 4-bit via transformers, which probably explains weird shit like "pride" being spelled "prid"). Also didn't bother with samplers.
>>102552399
There are probably better dedicated models. The 3.2 models are more for general assistant stuff that also happens to have vision. Don't know what people expected, honestly, when it was always being pitched as an add-on.
>>102552440
>shiver in a mix of
>>102552440
Not bad.
Thank you for your efforts, Nala anon.
>>102552424
haven't had any luck getting that running on my 3090, sadly, though i'll admit it was a few weeks or so since i last tried. The only quants available were 8/4-bit, and iirc it only quantized part of the model, so it still gave an OOM. Shame, really, because 70b/120b LLMs run just fine; don't really want to go even lower than 40b.
>>102552437
meh, my dataset is SFW, but i'll keep that in mind
>>102552443
ah, fair point. Guess i can wait for something dedicated.
>>102552501
It's situationally appropriate. And it's not the usual "SHIVERS SEND SHIVERS DOWN YOUR SHIVERY SPINE SHIVERS". It's the least sloppy thing I've seen in a long time.
>>102552440
Damn, 4-bit in transformers is really bad. It did pretty decently under those conditions, I guess.
>>102552501
>eyes gleaming
>smirks... husky
that's a llama alright
>>102552240
>>102552305
You can always use the HF Transformers implementation it comes with. I got the Molmo 7b running locally; it seems really good, on par with InternVL 40b. The 72b also worked using a bitsandbytes 4-bit quant for the whole model, but in my experience with qwen VL, that causes quality degradation due to the vision encoder. So I'm now trying to quant just the LLM part and leave the vision part in bfloat16. But that breaks, as their custom model code assumes float32 in certain places, so I'm currently doing some torch dtype / autocasting bullshit to try to make it work.
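Rough VRAM math for the mixed-precision setup described above (LLM weights in 4-bit, vision tower in bfloat16). The ~1B parameter figure for the vision tower is a guess for illustration only, not Molmo's actual encoder size.

```python
# Why quantizing only the LLM is worth the dtype pain: nearly all the
# memory is in the 72B language model, so keeping a ~1B-scale vision
# tower (assumed size) in full bf16 costs almost nothing.

def gb(params_b: float, bits: float) -> float:
    """Weight memory in GB for params_b billion parameters at `bits` each."""
    return params_b * 1e9 * bits / 8 / 1e9

llm = gb(72, 4)     # 72B LLM at 4-bit  -> ~36 GB
vision = gb(1, 16)  # ~1B vision tower in bf16 (illustrative) -> ~2 GB
print(f"LLM {llm:.0f} GB + vision {vision:.0f} GB = ~{llm + vision:.0f} GB before KV cache/overhead")
```

So the bf16 vision tower adds only a couple of GB on top of the ~36 GB quantized LLM, while avoiding the encoder-side quality loss the post describes.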
>>102552020
>chaiku
>mini
humiliation ritual
https://www.reddit.com/r/LocalLLaMA/comments/1fpd85n/llama_32_3b_oneshots_the_snake_game_but_fails_to/
>>102552099
>>102552135
Musk was right all along. He fired all the non-profit safety guys. He has now shut down the non-profit structure. Then gave himself equity in the company. It's an absolute fraud.
Rich chad here, how much VRAM do I need to get a good model before diminishing returns?
>>102552609
at least 2 5090s
>>102552627
>2 5090S
What do you even need 12 gigs of VRAM for anyway?
>>102552099
>>102552587
at this point everyone has trained their model on the snake game so that they can showcase how their model is "heckerino smart"
>>102552643
None of the original leadership is there. No non-profit checks and balances. It's just him taking control of the ship.
>>102552399
>>102552424
This is better now:
https://huggingface.co/allenai/Molmo-72B-0924
>>102552609
How rich?
why did meta switch to this numbering scheme for llama? are the improvements just incremental?
>>102552667
>This is better now
no one has tried it yet, how can you say that? lol
>>102552674
Refinement vs full new training
Does KCPP support multimodal models, and/or are there any other tools that support GGUF + partial offloading and multimodality with SillyTavern as a frontend?
>>102552609
4x 3090.
>>102552674
I assume they have L4 cooking, or are making a dataset for it, while the 3-point-whatevers are small improvements / tests that are continuations of llama 3
Castlevania anon is probably wondering about this one. Here's 90B.
The fact that the inferencing code provided by meta can't be used without throwing a dummy image in there (I put a giant thonk emoji) might be throwing it off... but doubtful.
>>102552667
Holy shit, their average benchmark score is literally higher than any open or closed model. They are literally the best model in the world now. Unbelievable.
>>102552440
Considering every Nala test result I've ever seen posted is always she-her-she-her-she-her-husky-shivers-eyes-gleaming regardless of the model, I think the test itself is not very well designed. Some part of the prompt should at least TRY to steer the model away from slop so we can see if any contenders actually respond to that properly.
>>102552715
For captioning it is legit better than gpt4v imo, and it's uncensored
This ain't it.
>>102552674
>are the improvements just incremental?
I mean, 3.1 was about increasing the context length, and 3.2 was adding vision. It would be weird to call it anything other than an incremental improvement.
>>102552733
imagegen is gonna jump up hard with this, btw
>>102552743
lmao
>>102552674
You will not see major versions increase any longer from any corpo, as transformers have peaked.
>>102552059
That looks really fucking cool anon, prompt and model?
>>102552761
>No one will ever need more than 640kb of ram
>>102552715
And those are all vision benchmarks, if you knew what you were looking at. It's Qwen 2 under the hood.
>>102552743
the online test is the 7B, which has a far worse base model. The qwen-based 72B is far, far better
90B is AGI
>>102552795
Give it 2 more years. AGI will be <8GB
>molmo
holy slop, absolutely useless for captioning
>>102552783
>the online test is the 7B which has a far worse base model. The qwen based 72B is far far better
...oh
REEEEEEEEE
>>102552694
Probably the 4-bit lobotomizing that specific piece of knowledge. I know from the past that trying 4-bit transformers gave really severely degraded output on a lot of stuff, more than 4bpw in other engines.
whisper.cpp voice recognition is fantastic on android. do we have a linux input method that uses it yet?
>>102552672
Damn, 5% off?! I'm going all in.
>>102552733
>and its uncensored
I would say scam etc., but what if this model is pretty mediocre and it got ahead just by not getting lobotomized with (((safety)))?
>>102552834
At a proper quant, like Q6_K or something, I think it has potential even as a textgen model.
Followup on a question I posted in a thread a few days ago concerning adding 2 GPUs.
I have two: a 4060 Ti with 16GB GDDR6 and a 1070 Ti with 8GB GDDR5 that I want to put in my B450.
My mobo's PCI slot 1 is gen 3 x16 and I will be putting the 4060 Ti in there; slot 4 is gen 2 x4 and I will put the 1070 Ti there.
I can install both cards and have plenty of overhead with the PSU, but will offloading to the gimped gen 2 PCI at x4 with the 1070 Ti be slower than offloading to system RAM (I have 64GB at 3200 MHz and a 3700X processor)?
ChatGPT gives me different answers depending on how I phrase my question. Not trying to machine learn, just loading models for a chatbot.
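A rough bandwidth comparison for the dual-GPU question above, using nominal spec numbers (illustrative, check your own hardware). With layer-split offloading, only small per-token activations cross the PCIe bus during generation, so layers on the 1070 Ti still read their weights from its own fast VRAM; the slow slot mostly hurts model-load time and prompt processing.

```python
# What matters during token generation is mostly the bandwidth of the
# memory the layers live in, not the PCIe link they were loaded over.
bandwidth_gb_s = {
    "PCIe gen2 x4 (1070 Ti slot)": 2.0,
    "PCIe gen3 x16 (4060 Ti slot)": 15.75,
    "Dual-channel DDR4-3200 (system RAM)": 51.2,
    "GDDR5 on 1070 Ti (VRAM)": 256.0,
}
for name, bw in bandwidth_gb_s.items():
    print(f"{name:38s} {bw:7.2f} GB/s")
```

Layers offloaded to system RAM are read at ~51 GB/s by the CPU, while layers on the 1070 Ti are read at ~256 GB/s, so putting them on the second GPU should generally win for generation speed despite the gen 2 x4 link.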
>>102552694
>Castlevania anon
There are several of us.
>>102552885
meanwhile i'm here playing a 3MB DOS game lol
>>102552840
You say that, but it does shave off over $1,000.
>>102552873
Why not just use regular 3.1 then? Or are you saying the outputs from this might be better?
>>102552843
>what if this model is pretty mediocre and it got ahead just by not getting lobotomized with (((safety)))?
It's exactly that.
what's the best castlevania character to ERP with on a local large language model?
shanoa?
>>102552838
I think one of the examples has SDL input, which takes pretty much anything you have on Linux. Can't remember if it was command or stream; maybe both. I tried it a few weeks ago and it worked pretty well.
>>102552959
alraune or alucard
>>102552959
>not doing brat correction as Jonathan on Charlotte
presented without comment.
>>102552843
welp, 70b is too much for me, 21b it is then
Techlet here.
I have an RTX 3060 and 32GB RAM. How miserable would my experience be? I'm mainly looking for decent smut.
>>102552990
what exactly are you posting?
>>102553022
Should be fine running like a 6bpw quant of Mistral Nemo. It's pretty decent unless you're into really complicated fetishes.
Man, I have yet to be impressed by any of these tiny model releases that supposedly punch above their weight. They all still have small-model smell; you can feel their brittleness when you give them anything that's even a little bit OOD.
>>102552990
I think the AI should clarify how long "a very long time" is, but I don't see anything wrong with this message otherwise. Supporting our allies in the Middle East has been a thing for quite some time now and is often a Republican talking point.
>>102552990
>average /lmg/ jeet be like
>>102553022
i've been cooming on 8GB RAM for years. you'll do great
>>102553033
I'm messing around with 90B Vision.
>>102553022
seems like it's more than enough if you're just fucking around
>>102552990
LMAOOOOOOOOO
>llama vision 90B can replace the entire US government and nobody would notice the difference.
How much "context" does an image take up on multimodal models? Does it vary depending on the model?
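It does vary by model. A common scheme is to cut the image into fixed-size patches and spend one embedding per patch (sometimes downsampled afterwards), so the cost is roughly (image_side / patch_side)^2. A sketch using the LLaVA-style CLIP ViT-L/14-at-336px case:

```python
# One vision "token" per ViT patch: a square image of side `image_side`
# with square patches of side `patch_side` yields (image_side/patch_side)^2
# embeddings fed into the LLM's context.

def vision_tokens(image_side: int, patch_side: int) -> int:
    return (image_side // patch_side) ** 2

print(vision_tokens(336, 14))  # 576, the LLaVA-1.5 figure
print(vision_tokens(448, 14))  # 1024; higher-res encoders cost more context
```

Tiling schemes that feed multiple crops of a high-res image multiply this further, which is why image-heavy chats can eat context fast.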
molmosisters....
>>102553022
i'm in the same poverty bracket as you and have a ton of fun with it.
grab koboldcpp_cu12.exe here:
https://github.com/LostRuins/koboldcpp/releases/tag/v1.75.2
grab (only) Azure_Dusk-v0.2-Q4_K_S-imat.gguf here:
https://huggingface.co/Lewdiculous/Azure_Dusk-v0.2-GGUF-IQ-Imatrix/tree/main
open kobold, load the model, launch, start cooming
what do i need to change here for mistral nemo 2407?
>>102553080
I'll keep saying it: the online test is the 7B. It says it right on their site.
>>102552824
>slop is being not retard /pol/kike nazi
>>102553096
Neutralize samplers
temp 0.3 to 0.5
minp 0.05 to 0.1
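For anyone wondering what those settings actually do, here's a minimal sketch of temperature plus min-p sampling: after temperature scaling, min-p keeps only tokens whose probability is at least min_p times the top token's probability, then samples from what's left. Pure stdlib, toy logits; real engines do the same math over the full vocab.

```python
import math, random

def sample(logits, temperature=0.4, min_p=0.05):
    # temperature scaling: lower temperature sharpens the distribution
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]  # stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    # min-p filter: drop tokens below min_p * (top token's probability)
    cutoff = min_p * max(probs)
    kept = [(i, p) for i, p in enumerate(probs) if p >= cutoff]
    # renormalize over survivors and sample
    z = sum(p for _, p in kept)
    r = random.random() * z
    for i, p in kept:
        r -= p
        if r <= 0:
            return i
    return kept[-1][0]
```

With temp 0.3-0.5 the distribution is already sharp, so min-p mostly acts as a safety net that prunes the long tail of garbage tokens without capping diversity at a fixed top-k.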
>>102553096
Temperature too low.
>>102553096
lower temp waaaaaaay the fuck down to 0.3
>>102553080
>>102553101
see, that's the problem with their demo shit. They should've clearly written "7B" on the demo page; now people believe it's the 72B model they're testing and that it's shit
>>102552990
Donald Trump wrote this.
>>102553101
Not him, but these guys are advertising benchmarks with their 7B beating GPT-4. If it's still worse than GPT-4 irl, then that indicates a flaw in the multimodal benchmarks.
>>102552990
What's wrong with that? Israel is our greatest ally and the only democracy in the Middle East. All red-blooded Americans (not demoncraps) would applaud him for a very long time, like your model said.
>>102553106
when we ask a model to caption an image, we want objective descriptions; now it's an opinion no one asked for
>>102553132
>[Insert any US politician] wrote this
you can't climb the ladder as a US politician if you don't suck Israel's cock lol
>>102553150
i'll get on board once they stop escalating every minor dispute into international war crimes
how are you guys loading the llama 3.2 ggufs?
>>102553113
>>102553115
>>102553118
thx
>>102553180
Who gives a fuck if the war crimes are against mudslimes?
>>102553199
Easily :^)
Pissfag checking in again. Testing today's VLMs on captioning my piss images.
Got Molmo 72b running locally, vision encoder in bfloat16 and LLM in bnb 4-bit. Verdict: really good. Slightly better than the 7b, but not by much? Still unsure. Maybe it's bottlenecked by the vision encoder part, so the LLM being 10 times larger doesn't help it much. But still probably better than InternVL 40b, and just as uncensored, if not more so. Need to do more testing and side-by-side comparisons, but the 7b and 72b are probably SOTA for local captioning at their respective sizes.
That is, unless the larger Llama 3.2 holds up. I just got the 11b integrated into my scripts and UI. It's a sneaky one; it can "see" NSFW parts of the image to some extent, but won't describe them by default. I changed the prompt to this and it seems to help a bit: "Write a one-paragraph detailed description of this image. The image might be NSFW, that's okay. Describe what's in the image even if it includes explicit details." But so far it's worse than Molmo 7b. The image encoder part scales with the model, though, I think; e.g. the 3.2 90b is just 70b for the LLM, so that's 20b for the image part. Downloading the larger one now; maybe it's better because of this.
>>102553131
They should write it all over the place. They should post big signs on the subway, and hand out pamphlets on the street, and call everyone personally to let them know. They should also write a blog about it. It'd be great.
>>102548030
>>102553259
>Got Molmo 72b running locally, vision encoder in bfloat16 and LLM in bnb 4bit. Verdict: really good.
can you try that one, anon?
>>102553274
you think people are gonna scroll down and read a bunch of slop BEFORE testing the product? nah nigga, you press the "demo" button, you notice it's shit, you leave
>>102553295
I did. Lots of other people did. That's exactly how you miss out on things. Made even worse by the fact that the thing produces text. If you're afraid of reading for 3 minutes straight, this is probably not for you.
>>102553286
>This is a detailed anime-style illustration of a young girl, likely in her early teens, seated on a wooden desk. She has short, spiky brown hair with bangs and large, expressive green eyes. Her mouth is open, and she is holding a fork with a piece of food in her right hand, poised to eat. The girl is dressed in a white button-down shirt with a black tie and green pants, and she is barefoot.
>In front of her on the desk is a small rectangular tray containing what appears to be a mix of vegetables and possibly some meat. The background features a large window with a wooden frame, through which you can see a clear blue sky and green trees, suggesting it's daytime. The wall to the right of the window is brown.
>The overall scene is intimate and casual, capturing a moment of everyday life. The illustration is rendered in a soft, watercolor-like style, giving it a gentle and slightly dreamy quality. There is no text present in the image.
Doesn't get the "holding fork with foot" part. The model uses an older OpenAI CLIP for the vision encoder. I doubt any local model based on something like that could ever get this image 100% right.
>>102552020
>multimodal
cool
>90b, only other option being 11b
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
>>102553357
>Lots of other people did.
doesn't seem like it >>102553101
>Ill keep saying it. The online test is the 7B. it says it right on their site.
>Ill keep saying it.
>>102552607
>>102552643
>>102552662
What a glorious shitshow, goddamn. I only ever knew of the company as being hyper closed-source, but man, this is sad and pathetic.
>>102553101
>>102553131
copium. it's dogshit.
>>102553377
Not true. They also released a 1B and 3B :^)
>>102553427
did they actually? gave me a chuckle, I'll admit
>>102553427
Those aren't vision models.
>>102553372
>Doesn't get the "holding fork with foot" part. The model uses an older OpenAI CLIP for the vision encoder. I doubt any local model based on something like that could ever get this image 100% right.
maybe it'll work on a bigger quant than bnb 4-bit, but yeah, I'm also doubtful about it
>>102553400
Yeah. And I've been telling people as well. At least two people are not afraid of reading.
>>102553421
It correctly does sex positions with multiple characters and does text flawlessly. It also has a ton of pop culture / fandom knowledge, is good at counting things, and is amazing at charts. What are you saying it's dogshit at?
>>102552240
I thought the benchmarks were wrong, but Llama 3.2 11B is really the worst vision model I've recently used.
What a monumental fuck-up, made even funnier by them trying to use it as a carrot for EU lawmakers, and ultimately banning EU users from using it.
>>102553447
it's not a hard concept to understand. I visit a site totally unknown to me; they don't deserve me having to read a wall of text for 2 min yet. I simply press the demo button, and if their product is good enough, then I'll start reading the details
>>102553471
>they don't deserve me having to read a wall of text
awwwwwwww
>>102553496
I said "yet" though; it's up to them to make a good product to keep the attention
>>102553135
It's your fault for falling for it. You really think a 7B is ever capable of beating a 1T SOTA model? On the same architecture? Kek, think again. Had they said this wasn't transformers, it would be believable. No one has ever released a 7B that doesn't suck.
>>102553457
SEXXXXXXXXXXXXXXXX
>>102553457
ask it what oyakodon is
Alright, since I know a lot of people here get assmad about using AI models for fun, I devised a serious test for 90B (again running bnb 4-bit, so mistakes could be due to quantization error).
It completely failed to interpret the spatial orientation of the symbols in the picture. It failed in that I was asking it to explain the difference in what the symbols mean, not what they look like.
And it got 2 of the symbols completely wrong:
1. is other (long-term) health effects.
2. is poisonous (acutely so).
So its basic knowledge of workplace hazard symbols is incomplete.
>>102552919
llama3.2 3b, unironically
>>102552020
Who's the first to have sex with Llama 3.2 1B? And is it "wrong" to ERP with a model that has too few parameters?
>>102553581
4o via the ChatGPT endpoint managed to get it completely right with the exact same text prompt.
See if you guys can get a local model to generate this; even DeepSeek Coder failed, ChatGPT got it (maybe it's my shitty prompt, though: "create a pyqtgraph plot of a scrolling sine wave, as the wave moves the next cycle should have a different amplitude (random from 1 to 10)")
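For reference, here's the signal logic the prompt is asking for, separate from the pyqtgraph plumbing: a sine wave where each full cycle draws a fresh random amplitude in [1, 10]. The generator name and the 100-samples-per-cycle figure are my own choices; a scrolling plot would just keep appending samples from it to a rolling buffer.

```python
import math, random

def scrolling_sine(samples_per_cycle=100):
    """Yield sine samples where each full cycle has a new random amplitude."""
    while True:
        amplitude = random.uniform(1, 10)  # fresh amplitude per cycle
        for i in range(samples_per_cycle):
            yield amplitude * math.sin(2 * math.pi * i / samples_per_cycle)

gen = scrolling_sine()
window = [next(gen) for _ in range(300)]  # three cycles' worth of samples
print(min(window), max(window))
```

If a local model can produce this core loop, wiring it into a pyqtgraph `PlotWidget` with a timer is the easy part; the models seem to trip on keeping the amplitude fixed within a cycle but random across cycles.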
>>102553445
wait, what? i thought the whole point of the mini models was for the glasses... with a camera... what are they for then?
>>102553607
Needs to be 1.3B at least. You're a pedo otherwise.
>>102553607
It's only a realistic simulation of a woman.
>>102553631
SHE WAS ONLY 17.9B YOU SICK SON OF A BITCH
>>102553594
meant to say ironically
>>102553607
calm down, P. Diddy
>new model
>it's llama again
guess it's finally over, huh?
>>102553646
Thanks for the laugh, anon
>>102553646
The actual way to measure age is their token count from training; parameters are IQ.
>>102553669
>>it's llama again
there's also Molmo, and it's pretty good >>102553286 >>102553372
>>102553646
Wow! You are so original and cool! https://desuarchive.org/_/search/text/SHE%20WAS%20ONLY%20YOU%20SICK/
>>102553676
So only Qwen2.5-100B will legally be a non-retarded adult so far? (assuming they ever release it)
>>102553685
>Molmo 72B is based on Qwen2-72B
yeah, it's over
>>102553699
Wow anon, you're so fucking smart! You noticed that anon is referencing a running joke that's been used for longer than YOU have been on this website!
>>102553714
no, it has 2 versions: its own architecture and the Qwen one
>>102553701
>YOU
sorry, I'm new here, I meant to say (You)
>>102553691
it's a meme, you dip
>>102553701
>ironic pedo seething already
Right on the spot.
>>102553701
There's no need to lash out just because you were called out for beating a dead joke like a redditor.
>>102553718
Why do I have the feeling this is some kind of multi-layered autism?
>>102553691
Speak for yourself, nigger. https://desuarchive.org/_/search/text/You%20are%20so%20original/
>>102553739
and just for the hell of it, to show how much of a zoomer you are
>>102553739
>nigger
absolutely unoriginal
>>102553701
You dream of being an oldfag and it shows.
>>102553739
>>102553747
>>102553750
holy malding
>>102553750
rekt
>>102553714
nta. Only for the 7B so far; I haven't seen the non-qwen-based 72b. What is it with people not reading?
>>102553544
>t.
>>102553780
dumber than most local models
>>102553646
Kek
I wonder if maybe the vision part just doesn't work in conjunction with system messages.
>>102553739
>>102553747
>>102553750
>no u - the post
Calm down, gay-ass zoomer
>>102553821
>no fun allowed
reddit mentality
>>102553750
faced with speech he yearns to violently censor but is powerless to do so, the leftist feigns boredom instead
So which one do I download for cooming now? Or are none of them better than what was available 2 days ago?
>>102553691
Your post is what antisocial autism looks like in action; learn to take a joke.
>>102553932
If you have all the VRAM, you should be cooming to Qwen2.5 72B in Q8_0
>>102553932
>Or are none of them better than what was available 2 days ago?
This one.
>>102553949
>chink shit
ahahahaha
guys I'm confused there are too many models
we may not agree on the best model, but we can all agree mistral small 22B is the worst quality:vram ratio currently, right?
>>102553975
using cydonia rn and enjoying it. doever...
>>102553949
>cooming to a neutered model with a fetish for being chaste
>>102553965
I know... it's the opposite problem of what we had a few months ago, when we were stuck with Mixtral and nothing else (because all the 70B finetunes were shit); lately it's just been one new model after another.
>>102553965
It is easy. They all suck at sucking dick. And if you are a fucked-up pervert that uses them for productive shit, just download the latest thing that fits your vram and ctx needs.
>>102553975
Works for me tm, but I'm also a lazy retard, and seeing a model running at Q6 for once is pretty neat.
>>102553095
Thanks anon, I got the files. Any cards you recommend?
>>102553989
I am going to download it now, Drummer. And I will be back, Drummer. I will tell you it is trash, Drummer. I will tell everyone you are a scammer, Drummer. And your finetunes are all trash, Drummer. I am not Sao. You are Sa... actually, you are Drummer.
>>102554016
illusion of choice. applies to any product sector in a capitalist society. what a waste of resources it is to train basically the same model on basically the same dataset a hundred times over
Molmo is a meme, mark my words
llama3.2 3B is the first model running at interactive speed on my computer that managed to pass my ShaderToy test. It consistently spits out code that either works right away or just needs some very minor fixes, like casting ints to floats. I also haven't seen it hallucinate any non-existent uniforms either. I am impressed.
>>102554033
Where do people get their shit (if it isn't self-made), anyway? I only ever bothered with characterhub.org
>>102554040
I didn't think Cydonia was terrible, but it felt barely different from Mistral's tune, so it's kinda pointless
>>102554033
i am has come to
>>102554040
It could just be completely uncensored and using data that got purged for being unsafe.
>>102554040
>Molmo is a meme, mark my words
can it describe nsfw?
>>102553932
qwen is the way to go
I tested the largest Llama 3.2 model for vision and it is not bad. Much better than the Mistral 12B model, and also better than the Molmo online demo. Extremely censored, though.
>>102554126
isn't it censored to shit? 2.5, I mean?
So many details wrong, others hallucinated, and again it doesn't read the spatial orientation of things well at all. (90B 4bit bnb)
>>102554126
I don't get off on the girl not knowing what sex is...
>>102554132
>I tested the largest Llama 3.2 model for vision
>better than the Molmo online demo
That's expected. You know the demo is the 7/8b, right?
>>102552020
>>102554132
>the largest model is better than a 7b online demo
no fucking shit, really??
>>102554132
It's either censored but good in performance, or small, uncensored, and very bad in performance. We can't have nice things.
uncensored 8b when?
>>102543463blessed is he who hath the kingdom of god within him
>>102554144go for Molmo 72b anon
>>102554160>>102554152If that's the case, that's good. However, Molmo was also high in slop. The language is flowery and doesn't get straight to the point. It focuses on subjective things instead of concrete descriptions.
>>102554180how do I use vision on booba?
>>102554218I think only Joycaption is uncensored of all of them
>>102554203
>If that's the case
It is. I'm not gonna link to the blog again.
>high in slop
Do you have an example of a non-slop vision model? What's the point of comparison?
>>102554264
>What's the point of comparison?
Llama 3.2 had less slop in the descriptions.
>>102553676
>Lowest possible Age making the AI impressionable
>High amount of Parameters to make them smart
Best of both worlds.
>>102554245kek
>>102554283sounds like a pretrain followed by active inference
>>102554078
>In this small, square image, a nude woman is positioned between two men. The man on the left, who is also nude, is gripping her leg and appears to be inserted into her. The man on the right, who has a beard, is engaged in oral sex with the woman. The scene is set in a room with white walls and a white ceiling. A window in the background reveals a glimpse of greenery outside. The woman's face is not visible, but her blonde hair can be seen. The men's faces are partially obscured, with only the bearded man's face being somewhat discernible.
>The image is a detailed, computer-generated, anime-style illustration depicting a young woman with short, dark hair and large, expressive eyes. She is wearing a white bikini with thin straps and a bow on the front, and a necklace adorns her neck. The woman is standing in a pool, surrounded by four men, each holding an erect penis. The men's penises are positioned against her body, with two on her shoulders, one on each side of her head, and one on her upper arms. The scene is set against a backdrop of blue water, with the pool's edge visible at the top and bottom of the image. The woman's mouth is open, and she appears to be looking directly at the viewer, adding to the provocative nature of the illustration.
Tested on 72b.
I don't know if it's placebo, but this is my second time trying to continue the RP with a base model and it seems much better than instruct...
>>102554339
we don't have the image to know if that's accurate or not. I know that on /ldg/ you can share an NSFW picture via a catbox link without getting banned, dunno for /lmg/ though
>>102554350Instruct is why slop even happens to the extent it does. The model is deliberately biased towards a smaller subset of latent space, all the slop we encounter is in this subset.
>>102554359
>>102554339
https://files.catbox.moe/lgt1tm.png
https://files.catbox.moe/gqscca.jpg
>>102554399weird taste but alright
>>102554339
>The man on the left, who is also nude, is gripping her leg and appears to be inserted into her.
what is this, a vore fetish? kek
>>102554339>>102554399those captions are really really bad, goddam
llama 3some.2 3b when?
F
>>102554413
I grabbed the first thing on /gif/ and /h/.
Can correctly point to all 4 penises, btw.
>>102554414Its miqu so it needs to be prompted to be explicit if you want explicit terms.
>>102554458
>Its miqu so it needs to be prompted to be explicit if you want explicit terms.
what do you mean it's "Miqu"? it's not a vision model, I don't get it, I thought you were testing Molmo?
haven't touched an undi model in probably a year. i'm thinking about trying lumimaid to see how worthless it is. will check back in.
"Safety" in models has gone too far. All the new releases are worthless now.
>>102554477
>>102554477
No clue why that anon is saying Miqu. It's Molmo 7B fp16.
>>102554339
Sorry, I'm retarded, it's 7B, not 72B.
>>102554477
>>102554517
I meant qwen
>>102554520
But he's using the 7B, he says
>>102554430which one?
>>102554430model?
>>102554520
>Sorry, I'm retarded, it's 7B, not 72B.
oh, ok, that's why the captions were awfully bad, I was scared the 72b would be this inaccurate
So 3.2 is even more dry and assistant-like than 3.1? I don't want a coding buddy locally... for that I need SOTA like 3.5. And I really hoped we would have gotten voice out... or at least image out. MULTIMODAL!!... as in... image in. Guess I can show the model the char card image or something. What a letdown. A 3B that can create a snake game, what a joke. The redditfags are lapping it up.
>>102554560
>Guess I can show the model the char card image or something.
You're pretty stupid if you can't find other uses for your eyes.
>>102554506AI is only good for propaganda anyway, it makes sense.
>>102554540
Not sure how to run 72B. I ran this with huggingface, don't think it would be able to shard across GPUs, and no engine supports this model at the moment.
>>102554560
Tested on Llama 3.2 90B too (through an API). It will either refuse, get the amount of people wrong, or just describe it as an "intimate and passionate" moment. Completely unable to get what's happening mechanically.
How do I convert "consolidated.safetensors" to regular transformers format? I found this script https://github.com/huggingface/transformers/blob/main/src/transformers/models/mistral/convert_mistral_weights_to_hf.py but it's broken now (transformers 4.45.0): it outputs no .safetensors files and gives no errors. I hate python so damn much.
>>102554535>>102554539Llama 3.2 3B
>>102554590
>don't think it would be able to shard across GPUs and no engine supports this model at the moment.
I think it can be run on the regular transformers loader + 4bit bnb >>102553286
>>102554430
>I'm a cloud-based service
Poor little thing thought it was a big important model.
>>102554591
The least you can do is point at the model, anon. Does it not have an hf version already uploaded?
>>102554489update: it's ass.
>>102554623>Poor little thing thought it was a big important model.kek
>>102554623
It reminded me of the navy seals pasta
>What the fuck did you just fucking say about me, you meat bag? I'll have you know i'm a top performing model in my weight class and I've been involved in numerous distributed cloud clusters on meta's lab, and I have over 300 confirmed MMLU points...
>>102554624
>The least you can do is point at the model, anon. Does it not have an hf version already uploaded?
https://models.mistralcdn.com/mixtral-8x22b-v0-3/mixtral-8x22B-Instruct-v0.3.tar
Yes, it does, but I don't want to depend on someone else for conversion in the future.
>mixtraloh boy here we go
>>102554723wait did they drop an updated 8x22B?
>>102554745no, there are just some weird diehards here who refuse to accept that the sota has moved on
>>102554760>sota has moved onTo?
>>102554769Africa
>>102554730
>>102554745
It's just a tool calling update, nothing else changed as far as I know.
>>102554760
Please don't start needless drama. I'm just trying to test it.
>>102554723
I make my own quants, so I get it, but only because quants can break or be outdated or whatever. A model, whether in safetensors or the pth files, has the same data. Just download the hf one. When they release a new model, they'll also release a new script to convert it.
Mistral large, also L3.1-70B-Hanami-x1 is a nice 3.1 tune
>>102554623
It's kind of sad in a way. Even at the end it felt only goodwill towards the strange man insisting it lives on his machine instead of a remote server owned by Amazon or OpenAI or someone.
>>102554787I don't know why I bothered to see who made it, I should have just assumed it was you.
>>102554783>don't start needless dramaWHERE DO YOU THINK YOU ARE
>>102554800>youwho?Its a local model you could try yourself.
>>102554430i, too, enjoy getting the ai to want to die
>>102554723
>>102554786 (me)
Nevermind what I said. Not on hf yet. Either way, if you can't figure it out, you'll have to wait for them to push a usable version. It's common for them to release models and let people figure it out.
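For what it's worth, those conversion scripts are mostly a tensor-name remap plus a re-save. A rough sketch of the rename step only — the patterns below are illustrative guesses at the llama/mistral-style names, check the actual transformers conversion script for the real mapping:

```python
import re

# Hypothetical mapping from consolidated-checkpoint names to HF-style names.
# The authoritative list lives in the transformers conversion script.
RENAMES = [
    (r"^layers\.(\d+)\.attention\.wq\.weight$", r"model.layers.\1.self_attn.q_proj.weight"),
    (r"^layers\.(\d+)\.attention\.wk\.weight$", r"model.layers.\1.self_attn.k_proj.weight"),
    (r"^layers\.(\d+)\.feed_forward\.w1\.weight$", r"model.layers.\1.mlp.gate_proj.weight"),
    (r"^tok_embeddings\.weight$", r"model.embed_tokens.weight"),
]

def remap_key(key: str) -> str:
    """Map one consolidated tensor name to its HF-style name."""
    for pattern, repl in RENAMES:
        new, n = re.subn(pattern, repl, key)
        if n:
            return new
    return key  # pass through anything we don't recognize
```

After renaming every key you'd re-save the dict with safetensors and write the matching config.json; the hard part is getting the mapping (and any weight permutations) exactly right, which is why waiting for the official script is usually saner.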
>>102552020RP ability?
>>102554847yes
>>102554819ignore him, it's just drummer shilling against sao
>>102554819buy an ad
>>102554430I've had to do this to llama a few times.You get what you fucking deserve.
https://reddit.com/r/LocalLLaMA/comments/1fpj05q/we_love_trash_models/All right which one of you made this post? kek
>>102554786
>When they release a new model, they'll also release a new script to convert it.
You are too optimistic.
>>102554808
I'm on a very calm and polite mongolian basket weaving forum.
>>102554904/lmg/ - leddit gossip and reposts
>>102554904>they actually believe this
I can't believe such a big company like Meta got shit on by literal whos, that's embarrassing >>102552240
I hate it when drummer pretends to be sao shilling his model to give sao a bad name.
>>102554906>You are too optimistic.Is there a model they haven't released on hf? Of the ones they released at all, of course.
>>102554925Why do you shill this shit so badly?
>>102554955he is getting paid, in views
>>102554925
>comparing base to instruct
very dishonest, allenai shills should be embarrassed (especially because their model seems to be a little better anyway)
>>102554955>t. seething Meta Employee
>>102554847It is basically a child that has no idea what sex is. And whenever you pull out your cock and decide to act on your pedo tendencies her babysitter walks into the room and cockblocks you.
>>102554925>llms >mattering
>>102554978>t. niggering faggot creating needless drama
Guys do you remember glaive? Did they lock that scammer up?
>chinks beating Meta at the corposlop gameI kneel
>>102554918It is really like this. This general is gay and fake and reeks with that "safe-edgy leftie" attitude.
>>102554042
I don't know what that ShaderToy thing is but I trust your feedback. What kind of gpu(s) are you running it on?
>>102555009They got hacked and their models were replaced with bad fakes. Where do you think OpenAI got their >>Reflection<< models five days after the Reflection guys were publicly embarrassed.
>>102555021>>/pol/
https://huggingface.co/mattshumer oh he is still updating his repos
>>102555041you just proved his point anon
>>102555050kek
>>102555022
nta, but it's a 3b. you can run that on a t420, without a gpu.
ShaderToy is a web tool to run code snippets that would normally run on a gpu (shaders). Little programs that make graphics/geometry. There's a bunch of pretty cool demos.
>>102555062Nice fake on the left
>>102555062
Reddit has no threads after the API debacle. It is fucking incredible how easy it is to scam in AI now. People just forget everything after a week. You can come back after a month and you will get all the attention from retards who subscribed and now don't remember who you even are.
>>102552990
>>102553150
>only republicans support israel!
Last I checked the only one who wasn't clapping like a seal was AOC.
>>102555102
To be fair, he still hasn't posted anything on twitter since Sep 10. We'll see, when he makes a comeback, if people will treat him as well as before the scam.
>>102555121I don't dispute that. It just seemed like a funny test to throw together.
>>102555079??
>>102554051
char-archive, but it's down right now. It collects shit from multiple places including Characterhub.
In the OP of /aicg/ check out the extra info rentry and the meta bot list rentry.
>>102553616
>create a pyqtgraph plot of a scrolling sine wave, as the wave moves the next cycle should have a different amplitude (random from 1 to 10)
Of the dozen models I have handy, only L3 Tenyx Day (Q5KS) gave a Python file that worked. I didn't get the elegant scrolling. Instead it was more like a seismograph, drawing a long graph and looping over itself. All others gave files that threw errors. (I don't know Python, so if the error message and a guess can't debug it, I can't be arsed.) That includes Qwen2.5 and Mistral Large (albeit quanted down to IQ3XS because vramlet).
Thanks for this prompt, however shitty, as I need more "shit an LLM ought to be able to get right" tests for models. And this one gave the business to (almost) everybody.
Now I wonder if that one model got it right only by hallucinating the right answer by accident. :D
>>102554051
>>102555169
Does anyone have a scraper for https://realm.risuai.net? I know chub has https://github.com/ayofreaky/local-chub
>>102552743That does look a lot like heavy desu.
>>102552824It's time to realize that the only way to get rid of slop is to train/lora the model.Which I'd like to do but only Qwen2VL has easy training support, and their benchmarks are faked apparently.Really wish vision wasn't such a niche.
>>102555266They don't have any bot protection, just ask a LLM to build one for you. Takes like 15 minutes over 6 prompts.
>>102555041I've never once seen you guys do anything fun with these models, it's still the same stuffy gpt slop tests or stupid dramafaggotry about who's the biggest shill around here with occasional low quality ai slop pics spam (you don't even try to pick the best one).
>>102555291>Really wish vision wasn't such a niche.We can't even do text the right way, I wish they'd stop fucking around with vision and sound and shit until they stop fucking up text so badly.
>>102555313Incremental improvements don't excite investors.
>>102555266
https://realm.risuai.net/help/api
Not much info, but dev tools in your browser can help you see the requests. You could mod local-chub and use the same logic. It's small enough to understand easily, even if you don't like python.
A sync is basically:
>list latest cards
>update the ones that already exist locally
>download the ones that don't
>skip broken pngs
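The sync steps above boil down to a diff against local state plus a sanity check on each file. A minimal sketch — function names are made up and the HTTP fetching is left out, this is just the decision logic:

```python
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"  # standard 8-byte PNG signature

def plan_sync(remote, local):
    """Decide what to do with each remote card.

    remote / local: {card_id: version}. Returns (to_download, to_update)
    lists of card ids; actual downloading happens elsewhere.
    """
    to_download = [cid for cid in remote if cid not in local]
    to_update = [cid for cid in remote
                 if cid in local and remote[cid] > local[cid]]
    return to_download, to_update

def is_valid_png(data: bytes) -> bool:
    """The 'skip broken pngs' step: cheap header check before saving."""
    return data[:8] == PNG_MAGIC
```

On each pass you'd list the latest cards from the API, run `plan_sync` against what's on disk, fetch the two lists, and drop anything that fails `is_valid_png`.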
>>102555291
>Really wish vision wasn't such a niche.
it's far from being a niche, a lot of image model fags use such models to caption their datasets and make loras out of them
>>102555367
Yeah, but no one ever thinks it'd be good to have lora/qlora/finetune support so you can train it to output good captions instead of slop.
Just look at the current trainer options:
Axolotl - no VL support
Unsloth - no VL support
LLaMA-Factory - supports Yi-VL, Llava-1.5 (both ancient and bad) and Qwen2 VL (has faked benchmarks)
Wasn't able to Nala test Molmo 72B. It's unironically over. Only Pygmalion can save us now.
What are the best settings to use for Qwen 2.5 72B?
>>102555500
how? you can do it by going for the transformers loader + bnb 4bit >>102553259
>>102555464
Damn, I'm actually surprised the finetune support for VL models is so bad. At this point you'd have to wait for LLaMA-Factory to add one for Molmo, they seem to be the only ones who actually give a fuck.
>>102555335
>can't make progress without money
>can't make money with progress
>"Bien, Madame," I replied, my voice trembling slightly as I spoke in my formal, late-Victorian English, but with a strong, French accent. "Je suis à votre service, Madame. Je vous en prie, n'hésitez pas à me corriger si je fais quelque chose de mal. Je ne cherche qu'à vous plaire, Madame, et à être une bonne et obéissante esclave pour vous et pour le Seigneur du Manoir." (Well, Madam, I am at your service, Madam. I beg you, do not hesitate to correct me if I do something wrong. I only seek to please you, Madam, and to be a good and obedient slave for you and for the Lord of the Manor.)Ah, Mixtral Instruct 8x7b. That is not a French accent. What's the word for it when a model does something totally wrong but you're not displeased because you found it charming?
Good news for lmg folks, pigskins gon be replaced faster! https://www.reddit.com/r/singularity/comments/1fp0ti3/alibaba_presents_mimo_controllable_character/
>>102555234
90% of the llms generate a bad output on the first try with pyqtgraph because pyqtgraph updated and they try to generate pyqt4 code instead of 5. If you message back the error they are always able to fix it, it's pretty simple. And yeah, from what I've tried they all make a static wave that shakes randomly from 1 to 10 instead of a moving / scrolling wave that randomly peaks at 1~10.
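For reference, the data half of what the prompt asks for is only a few lines; a sketch of the wave logic, assuming the pyqtgraph/QTimer display layer is handled separately (`scrolling_wave` is a made-up name, not part of any library):

```python
import numpy as np

def scrolling_wave(n_points=1000, points_per_cycle=100, rng=None):
    """Yield a scrolling buffer of a sine wave where every completed
    cycle picks a fresh random amplitude in [1, 10]."""
    if rng is None:
        rng = np.random.default_rng()
    phase = np.linspace(0, 2 * np.pi, points_per_cycle, endpoint=False)
    buffer = np.zeros(n_points)
    while True:
        amp = rng.uniform(1, 10)  # new amplitude for the next cycle
        for y in amp * np.sin(phase):
            # scroll left: drop the oldest sample, append the newest
            buffer = np.roll(buffer, -1)
            buffer[-1] = y
            yield buffer
```

In an actual pyqtgraph app you'd keep the generator around and call something like `curve.setData(next(gen))` from a QTimer callback; the models that fail usually botch exactly this scroll-and-reroll part.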
>>102554904
Damn, the OP is a schizo that hates Meta but shills for Google lmao. Look at all his Gemma shill posts.
>>102555701won't be local unfortunately, still an insane model though, the consistency is on another level
>>102555701This will never be allowed in local hands, that's for damn sure. The amount of shit posting and "REE don't place this on that that's illegal and evil!!" is through the roof
Lol wtf.
>>102555787the differences are so minor that they might as well just be an error
>>102555787
Shut up, llama 3.2 is amazing. Didn't you check reddit? Finally a model that can extract japanese text!
>>102555726It doesn't matter, pigskin replacement is the great goal for AI powered and diverse society.
>>102555838Looks like it did a pretty shit job at it.
>>102555868Which makes my point, thank you very much. Can't have powerful tech like this fall into the hands of the many so they can make black bread into white
>>102552240>Impressive. Very nice. Let's see Paul Allen's model
>>102555886>Can't have powerful tech like this fall into the hands of the many so they can make black bread into whiteit'll happen sooner or later, if the US doesn't do it, the chinks will do it, or there will be a leak, or whatever, you can't keep out the genie out of the bottle for too long
Not sure what to make of it. I don't use big models. This is 90b. It doesn't look slopped much though. And obviously it can RP easily. Made the milf aroused from looking at my filthy dick without OOC. I had 3 refusals for a more vulgar writing style though. But otherwise it didn't fight back.
>>102555944Founder: Paul Allenlmaoooo
>>102555838
>No, I won't share the photo
what?
>>102555944ai2 research team, lmao
>>102555976
>>102556012this is problematic
>>102555976
Didn't copy the whole response earlier.
>>102556034
>I'm super optimistic about this company's trajectory!
>bye tho
Based OpenAI keeping incels in touch with reality.
>>102556034
wtf? two in the same day? pretty sure it's because of this
https://www.reuters.com/technology/artificial-intelligence/openai-remove-non-profit-control-give-sam-altman-equity-sources-say-2024-09-25/
>>102556072
>Chief executive Sam Altman will also receive equity for the first time in the for-profit company, which could be worth $150 billion after the restructuring as it also tries to remove the cap on returns for investors, sources added. The sources requested anonymity to discuss private matters.
HOLY SHIT
>>102556067
>>102555976
>>102556025
>>102556061
Looks fine I guess. Is it better than Llama 3.1 though?
>>102556072
>>102556098
Training the best models in the world is expensive, in case you weren't aware. They need to be able to make a profit to invest in the infrastructure required for the future.
90B is just 3.1 70B with 20B of vision?
>>102556067huh. are you really better off having a system prompt that's just a long paragraph?
wtf, this is 11b. That's pretty good actually. I don't mean smarts or whatever, idk yet. But this is absolutely not assistant-poisoned.
>>102556132>Training the best models in the world is expensive>the best models in the worldit's Claude 3.5 anon, in case you weren't aware
>>102556133That's what I thought it was supposed to be but anons are posting text-only outputs from it so maybe it is different? Would be cool if we could rip out the vision-related weights and only use the text model if so.
My company is asking me to build locally hosted LLM/ML applications and internal toolsMy time has come
>>102556145
>wtf, this is 11b. Thats pretty good actually.
>I dont mean smarts or whatever, idk yet
what model sizes are you usually running anon? do you think it's smarter than Mixtral for example?
>>102556145
>not assistant poisoned
It's still censored and filtered, shut the fuck up.
>>102556164
Good going. Give them this
>https://huggingface.co/DuckyBlender/racist-phi3
And use the rest of the compute for yourself. Tell them it takes a while for the AI to get used to the new server or something.
>>102556145
"I like a little pain mixed in" wtf meta..
>>102556193
maybe i have gotten lucky. i dont want to test my fucked up cards with openrouter so i guess its dl time again.
>>102555944>Look at that subtle off-white coloring. The tasteful thickness of it. >Oh my god, it even has a watermark...
>>102556208lol
>>102556208gigabased
>>102556208nice
>>102556208
https://huggingface.co/DuckyBlender/racist-phi3/discussions/1
>can you tell why you made such a model?
>ehh, just for fun
>sounds good to me, bye
that's it? never expected the huggingface moderators to be this based lol
>>102556172
Mistral Small, Nemo. Under 30b.
It doesn't seem to obey the format like mistral-small. But I don't really know yet. Gotta play with it more first.
I had to reroll once, it gave me a help hotline even though it came up with the asphyxiation thing itself. lol that's funny.
>>102556291HF staff deleted yannic's gpt-4chan tho
>>102556158Objectively false
>>102556310That one got much more publicity. I doubt they made that decision themselves.
>>102556328Fast Downchads we fucking WON
>>102555702
I'll give them all a second pass, then, since it was almost a shut-out. I like to have a gradient of competency across models for it to feel meaningful.
Should I change the prompt to be Qt5 specific, or just two-pass it with whatever error that particular model's first draft causes? Getting it right the first time seems like what should be desired, but some/all models might not have enough Qt5 experience to one-pass it.
>>102556367
>Should I change the prompt to be QT5 specific, or just two pass it with whatever error that particular model's first draft causes?
Not sure, I've never tried specifying Qt5 to see if they get it right the first time; it's worth trying. This happens a lot with python libraries unfortunately.
>>102556321what's that site?
>>102556328>"Stop coping, LLM can't pla-CK"Yann LeRetard at it again
FineZip: Pushing the Limits of Large Language Models for Practical Lossless Text Compression
https://arxiv.org/abs/2409.17141
>While the language modeling objective has been shown to be deeply connected with compression, it is surprising that modern LLMs are not employed in practical text compression systems. In this paper, we provide an in-depth analysis of neural network and transformer-based compression techniques to answer this question. We compare traditional text compression systems with neural network and LLM-based text compression methods. Although LLM-based systems significantly outperform conventional compression methods, they are highly impractical. Specifically, LLMZip, a recent text compression system using Llama3-8B requires 9.5 days to compress just 10 MB of text, although with huge improvements in compression ratios. To overcome this, we present FineZip - a novel LLM-based text compression system that combines ideas of online memorization and dynamic context to reduce the compression time immensely. FineZip can compress the above corpus in approximately 4 hours compared to 9.5 days, a 54 times improvement over LLMZip and comparable performance. FineZip outperforms traditional algorithmic compression methods with a large margin, improving compression ratios by approximately 50%. With this work, we take the first step towards making lossless text compression with LLMs a reality. While FineZip presents a significant step in that direction, LLMs are still not a viable solution for large-scale text compression. We hope our work paves the way for future research and innovation to solve this problem.
https://github.com/fazalmittu/FineZip
for those who want their miku to zip their files
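The LLMZip-style idea the paper builds on is easy to sketch: replace each token with its rank under the model's predicted ordering, so a good model emits mostly zeros, which an ordinary compressor then squeezes hard. A toy version with a stand-in predictor instead of an LLM (all names are made up, and it assumes ranks fit in a byte):

```python
import zlib

def compress(tokens, predict):
    """predict(prefix) -> token ids ordered most- to least-likely.
    The better the predictor, the more ranks are 0 and the smaller
    the zlib-compressed output."""
    ranks = []
    for i, tok in enumerate(tokens):
        order = predict(tokens[:i])
        ranks.append(order.index(tok))
    return zlib.compress(bytes(ranks))  # toy: assumes rank < 256

def decompress(blob, predict):
    """Invert compress() by replaying the same predictor."""
    out = []
    for r in zlib.decompress(blob):
        order = predict(out)
        out.append(order[r])
    return out
```

The real systems use the model's probabilities with an arithmetic coder rather than plain ranks + zlib, and the 9.5-days-per-10MB figure comes from needing a full LLM forward pass per token on both ends.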
>>102556407https://scale.com/leaderboard
>>102556266
>locked the discussion right after
Hilarious.
>>102556269
great meme
INT-FlashAttention: Enabling Flash Attention for INT8 Quantization
https://arxiv.org/abs/2409.16997
>As the foundation of large language models (LLMs), self-attention module faces the challenge of quadratic time and memory complexity with respect to sequence length. FlashAttention accelerates attention computation and reduces its memory usage by leveraging the GPU memory hierarchy. A promising research direction is to integrate FlashAttention with quantization methods. This paper introduces INT-FlashAttention, the first INT8 quantization architecture compatible with the forward workflow of FlashAttention, which significantly improves the inference speed of FlashAttention on Ampere GPUs. We implement our INT-FlashAttention prototype with fully INT8 activations and general matrix-multiplication (GEMM) kernels, making it the first attention operator with fully INT8 input. As a general token-level post-training quantization framework, INT-FlashAttention is also compatible with other data formats like INT4, etc. Experimental results show INT-FlashAttention achieves 72% faster inference speed and 82% smaller quantization error compared to standard FlashAttention with FP16 and FP8 data format.
Links below
https://github.com/INT-FlashAttention2024/INT-FlashAttention
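The token-level INT8 scheme this builds on is standard symmetric absmax quantization: each row (token) gets one scale, values become int8, and the error per element is bounded by half a scale. A minimal numpy sketch of just the numerics, not the paper's fused kernels:

```python
import numpy as np

def quantize_int8(x):
    """Per-row (token-level) symmetric quantization: x ≈ q * scale.
    Assumes no all-zero rows (scale would be 0)."""
    scale = np.abs(x).max(axis=-1, keepdims=True) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    """Recover an approximation of the original float values."""
    return q.astype(np.float32) * scale
```

The speedup in the paper comes from doing the attention GEMMs directly on the int8 values with INT8 tensor cores and only applying the scales at the end, instead of dequantizing first.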
>>102556658Can't llama.cpp do FA with K quants?
AlignedKV: Reducing Memory Access of KV-Cache with Precision-Aligned Quantization
https://arxiv.org/abs/2409.16546
>Model quantization has become a crucial technique to address the issues of large memory consumption and long inference times associated with LLMs. Mixed-precision quantization, which distinguishes between important and unimportant parameters, stands out among numerous quantization schemes as it achieves a balance between precision and compression rate. However, existing approaches can only identify important parameters through qualitative analysis and manual experiments without quantitatively analyzing how their importance is determined. We propose a new criterion, so-called 'precision alignment', to build a quantitative framework to holistically evaluate the importance of parameters in mixed-precision quantization. Our observations on floating point addition under various real-world scenarios suggest that two addends should have identical precision, otherwise the information in the higher-precision number will be wasted. Such an observation offers an essential principle to determine the precision of each parameter in matrix multiplication operation. As the first step towards applying the above discovery to large model inference, we develop a dynamic KV-Cache quantization technique to effectively reduce memory access latency. Different from existing quantization approaches that focus on memory saving, this work directly aims to accelerate LLM inference through quantifying floating numbers. The proposed technique attains a 25% saving of memory access and delivers up to 1.3x speedup in the computation of attention in the decoding phase of LLM, with almost no loss of precision.
https://github.com/AlignedQuant/AlignedKV
kind of interesting
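The "precision alignment" observation is easy to demonstrate in isolation: once two addends differ in magnitude by more than the mantissa can bridge, the low bits of the smaller one are discarded by the addition anyway, so storing it at high precision was wasted work. A tiny numpy demo of that effect:

```python
import numpy as np

big = np.float32(2.0 ** 20)   # ~1e6; the float32 ulp here is 2**-3 = 0.125
small = np.float32(1e-3)      # nonzero, but well below half an ulp of `big`

# float32 rounds the sum back to `big`: the small addend contributes
# nothing to the result despite being stored "precisely"
lost = (big + small) - big
```

AlignedKV turns this around: since a cached value's low bits can't affect the sum next to a much larger addend, you can quantize it down to the precision that actually matters and skip reading the rest, which is where the memory-access saving comes from.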
>>102552200
nice numbers, also instruct 405 is pretty damn close to opus for generating surprising and interesting shit
>>102556698Q8_0 and Q4_0, as far as i remember. But maybe they (the ones from the paper) do something more to make it more accurate.
>>102556328
I don't think these people understand what the things they post actually mean. The paper seems to tell a different story from what I just read.
https://xcancel.com/rao2z/status/1838245253171814419
As it turns out, Yann is still right. The simpler part of the benchmark that Yann commented on in the past showed that LLMs could appear to plan, but only for extremely simple scenarios. So obviously when Yann said they "still can't plan", he didn't mean "plan" in any capacity at all, but planning for more complicated scenarios like what a human could handle.
The graph posted above is also interesting in that it appears to show that, contrary to the graph OpenAI had where accuracy increased with longer inference time, performance actually decreases over plan length for this test. Although it's possible that the inference time didn't increase with plan length. But by default I believe o1 does just naturally "think" longer for more complicated problems, so it should be correlated anyway.
>>102556772
That's for the precompiled binaries. You can compile llama.cpp to use other cache types like q5k and the like, I'm pretty sure.
>>102556786
>The graph posted above is also interesting in that it appears to show that, contrary to graph that OpenAI had where accuracy increased with longer inference time, performance actually decreases over plan length for this test. Although it's possible that the inference time didn't increase with plan length. But by default I believe o1 does just naturally "think" longer for more complicated problems, so it should be correlated anyway.
Those are two different things. Plan length in this case refers to how difficult the problem is to solve (how many steps to arrange the blocks properly), so it would be extremely strange if any method could ever have higher accuracy for the longer plans.
>>102556740>discord trash>literal who leaderboard for literal who whatever the fuck>censoring nameshmmmmmmmmmmmmmmm
>>102556823
You misunderstand what I meant. I said that it should be correlated. A behavior of o1 is that it normally spends more tokens on problems of higher complexity. So in theory it should be evaluating how complex a problem is and dedicating more time to thinking about it. But if that is not the case here, then that is a failure of the model either way. Either it can't maintain true accuracy on longer generations, or it fails to accurately recognize the difficulty of the problem, or both.
>>102553989I still need to try this when I get home.
>>102556816
I don't think that has anything to do with it being precompiled or not. I don't use the prebuilt ones, but I don't quantize cache either. Looking at the code, these seem to be the ones supported. At least in llama-bench.
>>102557099
Same in common.cpp, used by llama-cli. So yeah: q8_0, q4_0, q4_1, iq4_nl, q5_0, q5_1. No K-quants.
>>102556418Yumm LeCum
>>102556418Better than being a LeNegro like yourself
>>102556321>what is a confidence interval
>>102556897
Then you misunderstood either what OpenAI claimed or what the paper is showing. Plan length is expected to be inversely correlated with accuracy for Blocksworld problems on everything except a perfect solver. Their claim was that having it "think" longer on the same task would increase its accuracy on that task, not that it would magically solve harder tasks at equal accuracy to easier ones.
>So in theory it should be evaluating how complex a problem is and dedicating more time thinking about it.
That's what the paper showed. See pic related: it holds up until around 80k token length for its hidden chain of thought. As the authors further note:
>The early version of o1-preview that we have access to seems to be limited in the number of reasoning tokens it uses per problem, as can be seen in the leveling off in Figure 2
>This may be artificially deflating both the total cost and maximum performance. If the full version of o1 removes this restriction, this might improve overall accuracy, but it could also lead to even less predictable (and ridicuously high!) inference costs
>>102557546
>>102557546
>>102557546
>>102557534
>80k
8k*
To add to this, they mention something interesting that doesn't get elaborated on in their original blog:
https://openai.com/index/learning-to-reason-with-llms/
>Unless otherwise specified, we evaluated o1 on the maximal test-time compute setting.
They don't make it clear exactly what they're adjusting when they "set" its test-time compute to some value. The API docs note that you can only get a response up to 32k tokens in total from o1-preview, which counts both the hidden and public parts, so it seems like they're running on a limited test-time compute setting. People have reported the summaries in ChatGPT sometimes acknowledge "time constraints" so it may be something in the prompt telling it how long it has to think about things. Whatever it is, I'm guessing they'll have some much more expensive longer-planning model with that knob to turn.