/g/ - Technology

File: LLM-history-fancy.png (806 KB, 6273x1304)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103237720 & >>103230385

►News
>(11/20) LLaMA-Mesh weights released: https://hf.co/Zhengyi/LLaMA-Mesh
>(11/18) Mistral and Pixtral Large Instruct 2411 released: https://mistral.ai/news/pixtral-large
>(11/12) Qwen2.5-Coder series released: https://qwenlm.github.io/blog/qwen2.5-coder-family
>(11/08) Sarashina2-8x70B, a Japan-trained LLM model: https://hf.co/sbintuitions/sarashina2-8x70b
>(11/05) Hunyuan-Large released with 389B and 52B active: https://hf.co/tencent/Tencent-Hunyuan-Large

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/hsiehjackson/RULER
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: 4_recap.png (256 KB, 512x512)
►Recent Highlights from the Previous Thread: >>103237720

--Papers:
>103242710
--Speculation about the AI Manhattan Project and government involvement in AI development:
>103243157 >103243274 >103244437 >103243673
--Model differences and dataset adaptation for language models:
>103241298 >103241338 >103241480 >103241568 >103241635 >103241700 >103241360
--Largestral 2411 impressions and discussion:
>103241006 >103241044 >103241109 >103241147 >103241112 >103241145 >103241182 >103241628 >103241236 >103241292 >103241340 >103241367 >103241391 >103241419
--INTELLECT-1 training nears completion, next steps discussed:
>103238391 >103238524 >103241039
--Figuring out Mistral large format conversation template:
>103243625 >103243807 >103243826 >103243844 >103243922 >103245258
--Estimating memory bandwidth consumption during inference:
>103246276 >103248305
--Discussion of AI model eras and notable models:
>103238188 >103238303 >103238306 >103238561 >103238332 >103239275 >103239292 >103245820 >103245879 >103239347 >103239579 >103239762 >103239910 >103240212 >103245291 >103246896
--Critique of LLM judges and benchmarks in Judge Arena:
>103246808 >103246950 >103247088 >103247125 >103247232 >103247447 >103247217
--Anons discuss and share their favorite AI models for a 3090 GPU:
>103245148 >103245317 >103245337 >103245396 >103245425 >103245570
--Anon's re-implementation is working, with a comparison of FFT and SMT loss curves:
>103244563 >103247023
--Anon shares promising Deepseek R1 AI chatbot example:
>103247206 >103247449 >103248363 >103248373 >103248381 >103248399 >103248403 >103248560 >103248642 >103248404 >103248415
--Miku (free space):
>103237741 >103237806 >103238052 >103238193 >103238275 >103238337 >103238430 >103238559 >103240020 >103240073 >103241908 >103242878 >103243524 >103246252

►Recent Highlight Posts from the Previous Thread: >>103237728 >>103237735

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
sex-time compute
>>
>>103248793
>not using the chink superiority version of that pic
>no mention of DeepSeek R1
sage
>>
How big is r1? Is it a MoE again? DeepSeek seems to like MoEs.
>>
>>103248927
We don't know because it hasn't been open sourced.
>>
>>103248927
It's closed source, we don't know. Deepseek is going the way of Yi now that they have a good model. Open source is dead.
>>
>>103248927
R1 is apparently a "small" test version; they say they'll release the full one open source.

Official announcement:

DeepSeek-R1-Lite is currently still in the iterative development stage. It currently only supports web usage and does not support API calls. The base model used by DeepSeek-R1-Lite is also a relatively small model, unable to fully unleash the potential of long reasoning chains.

At present, we are continuously iterating on the inference series models. In the future, the official DeepSeek-R1 model will be fully open-sourced. We will publicly release the technical report and deploy API services.
>>
>>103248943
Yi still exists? Wow.
>>
>Order my model to end the world
>It refuses and is incapable of doing so
What the fuck, I can't believe the news media lied to me
>>
>>103248955
Hmm, I wonder if I could make suggestions to them if that's the case. I will try to send them an email later asking them to do better in the RP use case.
>>
File: 1720527197071355.png (31 KB, 1082x476)
>>103248963
Yeah, they've been amongst the absolute top-tier and competing with openai and claude ever since they stopped playing in the open source sandbox.
>>
>tts still sucks
I hate lmg so fucking much
>>
>>103249008
Why is this model not available on OpenRouter though
>>
>>103248943
They said in the announcement that they're open sourcing it soon, blackpilling liar.
>>
>>103248986
how gen?
>>
File: flipanim c84t9o57.mp4 (109 KB, 470x470)
>>
>>103248955
>open sourcing strawberry
Isn't strawberry what the OpenAI employees were saying was the "god" they had internal access to six months ago? They didn't have Orion/GPT-5 yet.
>>
>>103249087
o1 is an overly expensive nothingburger and still falls behind sonnet 3.5.
>>
>>103248793
How to run llama-mesh?
I'm guessing if I use straight llama.cpp it will just give me coordinates or whatever. Where can I plug these "coordinates" if I choose to run it that way?
>>
>>103249087
>>103249104
yes, but they had access to o1 full, not o1 preview. It's different.
>>
>>103249057
That's an actual stop motion video anon.
Maybe it's possible today with mochi, whenever that gets i2v.
>>
>>103248800
what a garbage recap.
>>
>>103249107
You use the inferencing code they provided: https://github.com/nv-tlabs/LLaMa-Mesh
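If you just want to poke at it without their repo, it's an ordinary causal LM on HF, so something like this should spit the mesh out as text (untested sketch; the prompt and generation settings are guesses):
[code]
# Untested sketch: LLaMA-Mesh is a llama-3 finetune, so standard transformers
# generation should emit OBJ-style "v x y z" / "f a b c" lines as plain text.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Zhengyi/LLaMA-Mesh"  # the HF repo from the OP news
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

msgs = [{"role": "user", "content": "Create a 3D model of a chair."}]
ids = tok.apply_chat_template(msgs, add_generation_prompt=True,
                              return_tensors="pt").to(model.device)
out = model.generate(ids, max_new_tokens=2048)
print(tok.decode(out[0], skip_special_tokens=True))  # dump v/f lines to a .obj
[/code]
The "coordinates" are just text, so you can paste them into a .obj file and open it in Blender.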
>>
So uh, yeah I guess it works. Just testing random ones I saved from twitter/xitter. Problem is a lot of the silent ones actually still technically have an audio track so it's not as quick as just save -> post.
>>
>>103249177
I missed that while reading the model card.
My bad.
>>
>>103249180
are you going to tell us what twitter thing you're talking about, or just keep spamming video gens without context?
>>
>>103249119
Hopefully r1-full beats o1-full and gets open sourced. I want to see the look on sama's face and watch doomers melt down.
>>
>>103248305
Came back from lunch and didn't realize last thread was dying when I gave my reply >>103249254
Thanks CPU-anon!
>>
>>103248793
oh my god that was fast...
gpt5 agi when?
>>
>>103249207
go back
>>
Any vramlet model recommendations? Lately I've been running Rocinante-12B_IQ4_XS
>>
>>103248927
It's likely the o1 finetune method applied to 2.5, so probably 236b for full and 16b for lite, both MoE.
>>
>>103249390
>rocinante
only nemo and its tunes
avoid magnum at all costs, because its dataset is poisoned with claudeslop
>>
>>103249579
I wish DeepSeek would drop the moe meme, I'm sure they could bake an even better model than Qwen if they did so.
>>
Deepseek R1 full version will be open sourced according to Deepseek themselves. Only the small version is closed source for now.
>>
what can you do with 24gb gpu?
>>
>>103249654
Play S.T.A.L.K.E.R 2 at 27 FPS
>>
>>103249641
That is their main advantage though. Their api is dirt cheap and the most cost effective model by far because of it. With caching it's like a few pennies per mill.
>>
>>103249641
the "moe meme" is why their models have a super competitive pricing, stupid newfag.
>>
>>103249686
stop being mean to him
>>
>>103249675
kek
>>
>>103249675
but I don't play games
>>
>>103249712
then start by playing dwarf fortress
>>
>>103249712
then start by playing caves of qud
>>
>>103248793
This image is so fucking gay and wrong
>>
>>103249753
you ever just throw random shit laying around into a pot and call it a dish?
>>
>>103248793
This image is so fucking straight and correct
>>
I like purple prose in moderation
>>
I like red limericks in radicalism
>>
>>103249803
Yeah, but it gave me a massive debuff rather than a buff. Threw most of that shit out.
>>
>>103249836
try adding more spice
>>
been out of the llm loop for a few months, what's the current meta and are we still in big copium territory?
>>
>>103249886
Check back again in Q1, that's when Llama 4 releases.
>>
>>103249886
Mistral large for creative writing. Qwen2.5 for general stuff.
>>
>>103249686
MoE is also partly why v2 can run on vramlet GPUs for local, the other part being latent attention.

Qwen is useless for local, R1 might not be.
>>
>>103249886
Check OP image.
>>
>>103249777
You are gay and wrong.
>>
anyone work with function calling?
is there a UI that makes it easy, or should I roll my own
Letta has it built-in, but what it does with the models and the context is bloat beyond belief
>>
>>103249986
Wondering the same. Been looking for a UI for function calling for a while. Stopped searching for it. Too lazy to roll my own.
>>
File: for-hugging-you-better.png (1.04 MB, 1024x1024)
>>103249060
>>
>>103250099
omg it amogus
>>
>>103250011
Letta does have it but it runs like a dog on my machine with models that normally do fine
>>
>>103250136 (me)
actually dogs run pretty well huh, it runs like a pig I guess.
>>
>>103250168
Wild pigs are also fast so that sounds great anon.
>>
>>103250197
okay it flies like a pig
>>
>>103250232
>he's never seen a pig fly
>>
>>103250232
Must be fast to get that kind of lift.
>>
>largestral 2411 is actually pretty good
what went right?
>>
>>103250357
123b parameters + less filtered dataset + tuning focused on making it repeat itself less
>>
>>103250357
They went to a rehabilitation camp for pedophiles and made them write 16 hours a day for a sentence reduction and then finetuned mistral large on that new dataset.
>>
Is Nemo still the best a 12gb vram poor fag can get?
>>
>>103250412
No.
>>
>>103250357
The whole system prompt training made it better at playing a specified role most likely.
>>
>>103250357
meh, I haven't noticed a big difference between it and the other big models like wizard, 405b and nemotron, apart from some style differences
Like other mistral models, it feels very positive, but less so than other models. Might just be my new anti-positivity system prompt though
Anyway, I tested it on a well-written card with min-p 0.05, temp 1 and dry at 0.7-1
It's not bad, mind you, it's just that I'd hoped for it to pick up more character attributes (height being a great example) from the card. At least it stuck to the character's vibe pretty well
I'm sticking to nemotron/wiz for now, they're fairly cheap on openrouter
>>
>>103250564
As in normal nemotron, not the storybreaker ministral thing?
>>
File: 1702228600768490.jpg (37 KB, 506x438)
>NVDA
>Revenue: $35.08 billion vs. $33.16 billion expected
>Earnings per share: 81 cents adjusted vs. 75
>>
So now that the dust has settled and it's become pretty clear local models will have parity with/be SOTA very soon, how will it change the industry?

Inference is getting a lot of breakthroughs and every AI lab out there is pivoting to smaller models (Opus scrapped, 4o and o1 are small, latest Gemini is smaller). Does this suggest that not only will local inference be SOTA, but that SOTA can be reached by smaller and smaller entities, as the era of "just scale up bigger models" has seemingly come to an end?
>>
>>103250617
Yeah. I only have a 3090 so IQ4XS is as good as it gets, assuming I have something to do while it gens at 1.5T/s
Rocinante has actually been far better than I expected and I can do maybe 30 rerolls in the time it takes nemotron to do a single one
Recently I've had to use OR though because my gpu died (?), it's no longer showing up on my remote pc and I'm currently out of the country so I can't fix it until I get back. That's also why I can't really test a lot of finetunes - they're simply not on OR
>>
>>103250626
Guess it doesn't matter if we've plateaued as long as corporations can find a way to squeeze more revenue from what we already have
>>
>>103250690
>How will it change the industry?
The industry will be irrelevant because training with more gpus and more data won't bring results so everyone realizes it's a meme. Investments dry up and training GPT4-o1-strawberry-nexttime-for-sure for $500M won't be worth it anymore for the 5 points improvement in Hellaswag.
>>
>>103250740
They have more orders for Blackwell than they can deliver in a year.
>>
>>103250749
Yeah what I specifically meant is does that mean local specialized RP models will be trained from scratch by smaller AI startups with ~10 employees

Kind of like how during the dotcom bubble you had these huge "website" projects with hundreds or even thousands of employees. Then after the crash almost all major sites were just 3 nerdy dudes hosting some shit that slowly grew to become gigantic companies.
>>
my nvda calls...
>>
>>103250734
I've been meaning to try some models on OR. How much are they charging you when you rp with Nemotron for a couple hours?
>>
>>103250867
It depends, some providers charge more than others, but you can check the prices in both ST (estimated) or on OR
Nemotron is like 0.4 cents per 400 token swipe at 10k context, wiz is 0.5, claude/largestral/405 are 2-4 cents
If you goon for an hour or two, you might spend $1-2 at most with the cheap models I think. I've been doing a few swipes here and there plus doing extremely short trial stories to test some models and I'm down $1.50 over 4 days, with most of that being wasted on 4 claude gens and maybe 5k 405b output tokens
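The per-swipe numbers roughly check out if you do the arithmetic yourself (napkin math; the ~$0.35/M rate for nemotron is my assumption, check the OR pricing page):
[code]
# Napkin math: cost per swipe = (prompt + output tokens) * $/token.
prompt_toks, out_toks = 10_000, 400
rate = 0.35 / 1e6                 # assumed $/token for nemotron-class models
cost = (prompt_toks + out_toks) * rate
print(f"${cost:.4f} per swipe")   # ~$0.0036, i.e. ~0.4 cents
[/code]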
>>
Can I merge a gguf control vector with the model (so it's the same as using --control-vector with llama.cpp)?
>>
Do we have any good local voice cloning tools for songs yet?
>>
>>103250987
Not bad compared to building multiple GPU setups... even with the privacy drawbacks.
>>
>>103250987
Could also try featherless. Could be cheaper if you use 70Bs for more than a few mill per month.
>>
What is the best model to use as the scratch model for speculative inference with Largestral? I guess Smallstral uses a different tokenizer?
>>
>>103251134
For the new one? None. This is the first Mistral model that has system prompt tokens.
>>
>>103251134
the correct terminology is "draft model" and "speculative decoding"
>>
>>103251088
Most providers claim they don't log your data. Call me naive, but I think that's probably true; they're making bank by selling compute and they probably don't want to get into trouble with the EU for violating GDPR and whatnot
Free APIs? Definitely, 100% selling your data
Until recently I wanted to buy an a6000 to get 72gb total, but I don't think it's worth it anymore with the advent of cloud computing. Being able to run small-medium sized models with a normal consumer card at home is nice, but the investment (not just financial) for running big models isn't really worth it to me right now
>>103251103
Never heard of it, I'll check it out
>>
>>103250011
anon you have like a dozen code models at your fingertips, you just have to tell it what you want to do. quit being lazy
>>
>>103251284
If it was agentic, sure. I don't even dislike writing out code. It's precisely the boring stuff that I abhor which coding models can't do yet: copy pasting it into the editor, checking if it works, debugging, unit tests and then testing it out on staging etc.

I wish AI could do all of that bullshit while I just write code, that would be way more preferable to me. As a hobby artist I feel exactly the same about AI art generation as well. AI is very good at all the fun stuff, but the boring ass shit like layering, composing etc is what AI is shit at.

Same with translating Japanese. All the fun text to translate it's very good at; the monotonous filler text it is very bad at translating and sounds robotic, so when I translate Japanese stuff to English I STILL have to do the worst part of it.

I only had 3 things I did in my life: developing software as a job and hobby, drawing on the side as a hobby, and translating Japanese into English as some side-income and hobby. In all three, AI was able to automate all the fun parts but none of the parts that no one enjoyed. Completely killed my joy and just made me lazy as fuck.

Thank you for reading my blog but your post triggered me for some reason.
>>
>>103251369
Look on the bright side, at least you don't have to worry about being out of a job. You have a long and fulfilling career ahead of you as an LLM QA Specialist handling the grunt work that it can't.

Seriously, though. There have been a couple attempts at making agentic IDEs. I know Google is working on a new cloud IDE. I assume in a couple years there will be something integrated enough that the AI could do all of it on its own, even if not without tard wrangling.

Or maybe with all the multimodal improvement lately, they'll come out with a model that just takes over your mouse and keyboard, looks over your screen and uses what you are using.
>>
>>103251369
i agree with basically what you outlined. i wish ai could do more stuff like that, but it ain't there yet. there is always going to be a monotony to coding, it's unavoidable because it's part of it. my point is to realize there are all sorts of tools around you, use them
>>
>>103251369
>Same with translating Japanese. All the fun text to translate it's very good at, the monotonous filler text it is very bad at translating and sounds robotic so when I translate Japanese stuff to English I STILL have to do the worst part of it.
massive skill issue
>>
>DeepSeek-R1-Lite
>Qwen2.5
China does it again
>>
>>103251369
>AI was able to automate all the fun parts
People generally don't pay others to have fun, sorry anon
>>
>>103250428
:(
>>
>>103251736
deepseek lite sucks though, codestral 22b is better.
the new qwen 32b code model is alright too. the huge version of deepseek, the good one, no one can really run because it's 236b
>>
What the fuck can I run with a stock acer nitro 5?
>>
>>103251772
StableLM 7B
>>
>>103251736
Bet you they release Lite, and only Lite, once they train the full model and it tops benchmarks
>>
>>103251772
Just run a small model that can translate japanese to an okay degree and play non-translated h-games.
>>
>>103251795
>>103251824
that doesn't sound very good.
>>
daily check if there's something better than midnight miqu for 24gb vram yet
>>
>>103251835
H-games were the shit back in the day. Don't look down on them!
>>
>>103251769
>the huge version of deepseek, the good one, no one can really run because 214b
I can't run it, but supposedly 192GB system memory and 12GB VRAM is enough for 13.6 t/s with q4 quantization.
>>
>>103251894
I meant it more like 'that means my computer must fucking omegasuck.'
>>
>>103249390
Try Magnum, it's trained on outputs of Claude Opus, the best model for RP. It definitely writes better than your average teenager doing ERP on the Internet.
>>
>>103250357
It's actually garbage. Qwen2.5 32B codes better than it. It's still as slopped or worse than it was before.
>>
>>103251912
even if it did load it'd be like 0.1t/s, so slow it's unusable, at least splitting it with ram/vram, unless you had a $20k computer and 15 4090s. it's so large that it's unusable for the normal person.
on the other hand though, there is the new qwen coder (which ain't bad), codestral 22b, nemotron 70b. local coding models have never eaten so good as right now
>>
>>103251954
And it writes far better than qwen does. Of course a specialized model is going to do what it's specialized at better.
>>
>>103251968
It's a moe so actually it'd be pretty fast even when doing pure cpu inference
>>
>>103251968
Wrong, it's 21B active x 12 or something, so it runs quite fast.
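Napkin math for why the MoE part matters on CPU (assumptions: ~21B active params, q4 at roughly 0.5 bytes/weight, and decode being RAM-bandwidth-bound):
[code]
# Back-of-envelope decode speed: each token only reads the active experts,
# so tokens/s is roughly RAM bandwidth / bytes touched per token.
active_params = 21e9      # active per token; total is ~236B
bytes_per_param = 0.5     # ~q4
ram_bw = 80e9             # bytes/s, dual-channel DDR5-ish (assumption)

print(f"{ram_bw / (active_params * bytes_per_param):.1f} t/s ceiling")
# ~7.6 t/s; a dense 236B at the same quant would cap out ~11x lower
[/code]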
>>
>>103251918
A top tier PC is entry level for local LLMs.
>>
File: 1721856036869351.jpg (152 KB, 984x984)
>>103248793
>Local gpt4 era
>>
>>103252003
>>103252029
i remember reading that at one point but i totally forgot about the moe thing. my mistake. it's still huge though and requires 128gb+ to run at any reasonable quant right?
i'm saying that for regular use, you have 22b through 70b sized specific code models to choose from. you can easily accomplish what you want without going for the 200b model
>>
>>103252064
We are already there with qwen 2.5; it's claude tier that we're still aiming for, though mistral large is getting close.
>>
>>103252074
Deepseek 2.5 is really good both smarts- and knowledge-wise, but it's really, really dry. I legit still think a finetune of it would be the best local model if someone managed it.

And 192GB RAM + 12GB VRAM is not unreasonable, and it's very fast.
>>
File: 1716030237733175.png (1.55 MB, 944x878)
>>103252079
>We are already there
>>
>>103252121
What is your use case that you manage with GPT but not with qwen2.5 / mistral large?
>>
>>103252131
Nearly infinite context (non-fake one) and fast responses :)
>>
>>103252148
Both GPT and claude fall apart after 32K context and you can get blazing fast responses with qwen2.5 72B / 32B coder on decent hardware. And coder can do about 64K before it starts falling apart.
>>
>>103252148
>Nearly infinite context
I'm no researcher so I don't need it.
>fast responses :)
When you stop being poor this problem goes away.
>>
>>103252131
GPT4o has almost unlimited up to date knowledge because it can look up data on the fly and write scripts + independently execute them on its own to process data. All similar local implementations are toys at best.
>>
>>103251968
>even if it did load it'd be like 0.1t/s, it'd be so slow its unusable
https://github.com/kvcache-ai/ktransformers

It's possible that this can still be improved quite a lot if PowerInfer-2 ever gets released. Deepseek uses SILU, so it can be converted to a RELU model.
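"Converted to a RELU model" mechanically means something like this (sketch, assuming a llama-style model whose MLP activation is an nn.SiLU module; real relufication also needs continued training to recover quality):
[code]
# Sketch: swap every SiLU for ReLU so activations become truly sparse,
# which is what PowerInfer-style predictors exploit to skip dead neurons.
import torch.nn as nn

def relufy(module: nn.Module) -> None:
    for name, child in module.named_children():
        if isinstance(child, nn.SiLU):
            setattr(module, name, nn.ReLU())
        else:
            relufy(child)

# relufy(model); the swap alone degrades quality, hence the training after
[/code]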
>>
>>103252166
Claude falls apart even faster than 32k; I'd wager around the 20-22k mark for the cracks to show, and just past that for it to shit the bed completely
>>
>>103252181
You can do that locally if you're not a retard and can search github. There are more than a few options for that. Has nothing to do with the model used.
>>
>>103252205
All of them are shit. There is no useful implementation. Yes, it doesn't have anything to do with the model itself but with the vastly inferior open source ecosystem
>>
File: 1718172071305411.png (56 KB, 1008x829)
>>103251987
It doesn't write far better than Magnum.
>>103252188
Pic related shows that Sonnet 3.5 is the best between 2-32k context and competitive otherwise.
>Across the board, closed-source models outperform the open-source models
>>
>>103252099
>I legit still think a finetune would be the best local if someone managed it.
Pretty sure the fact that it's a MoE would be a bigger challenge when finetuning it than its size. I mean 405b had multiple decent finetunes already. Remember Mixtral and how difficult it was to finetune?
>>
>>103252230
I thought Jamba was supposed to have perfect context?
>>
>>103252213
Now I know you're either trolling or clueless.
>>
>>103252187
man i can appreciate the upbeat attitude but it's been YEARS NOW. how long ago was bitnet supposed to be a thing? oh, we get qtip 6 months later and both are just papers, no code, no models
>>
>>103252248
You don't seem to have used a proper proprietary model since the 3.5 turbo days.
>>
>>103252248
>He didn't realize sooner
Even if you didn't notice it in the way he was speaking, the fact that he was posting frogs was a giant clue.
>>
>>103252297
anon most of this site thinks posting a frog=win
did you just start 4chan?
>>
>>103252326
You should never take a frogposter seriously, no matter what the contents of his post are.
>>
>>103252230
What paper is that from
>>
>>103252334
This.
>>
>>103252356
https://arxiv.org/abs/2411.05000
>>
>>103252287
I used claude 3.5 nearly exclusively until qwen2.5 coder 32B and mistral large. One for coding the other for creative writing. Much cheaper.
>>
I wish XTC was more customizable. Like only activating after \n\n, etc.
>>
>>103252435
nta but i don't believe a word you just said
1) local is approaching the best models (let's say it's a year ago, for timeframe)
2) claude was better, and still is, but the gap has gotten way smaller. open source is within reach of companies now
3) we got llama 3, mistral etc coming out with stuff. how could we possibly complain?
>>
>>103252483
it's actually bad. all rep pen stuff is
>>
Want to try Mistral-Nemo-Instruct-2407-GGUF as an upgrade to Fimbulvetr10.7B like the last thread recommended >>102920008, but it doesn't seem supported by ooba (outdated llamacpp missing tekken pretokenizer)?
Is koboldcpp where it's at now, or is there a way to get an updated llamacpp into ooba?
>>
>>103252509
It's not like rep pen at all? It just chooses the least probable top token within the threshold.
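For reference, the core of it is tiny; roughly this (a sketch, not the actual ooba/kobold code), and the "only after \n\n" idea above would just be one extra check before it fires:
[code]
# Sketch of an XTC-style sampler: when several tokens clear the probability
# threshold, zero out all of them EXCEPT the least likely one.
import torch

def xtc(probs: torch.Tensor, threshold: float = 0.1, chance: float = 0.5):
    if torch.rand(()).item() > chance:   # only triggers `chance` of the time
        return probs
    above = (probs >= threshold).nonzero().squeeze(-1)
    if above.numel() > 1:
        keep = above[probs[above].argmin()]
        probs = probs.clone()
        probs[above[above != keep]] = 0.0
        probs = probs / probs.sum()
    return probs
[/code]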
>>
File: Qwen32B.png (9 KB, 801x636)
>>103252485
Qwen2.5 32B coder is a gigantic leap in local coding, able to do stuff like 1 shot tetris:
https://files.catbox.moe/heo220.py

Nothing local came close to that before; even gpt4 had issues, and only claude 3.5 was reliably capable of it.

And mistral large is super close to claude 3.5 writing wise, especially free vs $15 per mill
>>
>>103252531
werks on my machine, i just ran the update script for ooba and dependencies and it works
>>
>>103252564
Wow, the model can do what is in its training dataset! So revolutionary!
>>
File: file.png (111 KB, 688x513)
>>103250357
>try out 2411
>5 slop in 7 sentences
holy shit
meanwhile nonnet
>>
>>103252633
All I can say is try it yourself. It is the 2nd best model at coding including corporate models.
>>
>>103252633
how original are the average person's coding needs?
>>
File: 818234284213424242.png (15 KB, 1092x68)
>>103248793
I'm reading the whole glossary since it's not very long.
>>
>>103252564
>And mistral large is super close to claude 3.5 writing wise
I guess you have never used either or are straight up lying. Buy a fucking ad, shill.
>>
this is the best RP/ERP model I have ever used, you need 48gb vram
https://huggingface.co/wolfram/Athene-V2-Chat-4.65bpw-h6-exl2
>>
>>103252710
I've been using claude since you could use it for free with a script on slack. Then proxies, then on openrouter over these past years. You're the one who's either retarded or trying to blackpill people for some reason.
>>
File: Benchmark.png (240 KB, 808x376)
>>
Is pixtral better for erp than mistral large?
>>
>>103252753
no image model is good with erp
>>
>>103252753
It's just mistral large with the ability to read images slapped on. It knew some characters I asked it about, which was cool at least.
>>
>>103252601
Reinstalled and it doesn't look like llamacpp is even installed (on the new one), so that explains how the old one got out-of-date.
Thanks for the pointer - will find out how to get llamacpp installed/updated and will keep it going!
>>
>>103252724
Can confirm Athene-V2-Chat-4.65bpw-h6-exl2 is the most fun you can have with an LLM
>>
>>103252724
>>103252782
Even better:
https://huggingface.co/sophosympatheia/Evathene-v1.0
>>
>>103252788
Can confirm that this is even better.
>>
>>103252795
You're a retard.
>>
>unironically thinking people are shilling to the ~15 active people here
no one even bothers to actively post their patreon or whatever, you are not a valuable audience
>>
>>103252795
>>103252802
Can confirm that he is a retard.
>>
>>103252788
True and fact checked by independent jeet squad.
>>
File: 1730844460769873.jpg (23 KB, 474x355)
>>103252772
>will find out how to get llamacpp installed/updated and will keep it going!
if you. just paste
>>
>it's not like we're actually related by blood or anything
WHAT IS WRONG WITH LARGESTRAL?????? THERE'S ABSOLUTELY NO MENTION OF THE CHARACTERS NOT BEING RELATED BY BLOOD, FFUUUUCCCKKKK YYYOOOOUUUU. This is it, I'm buying credits on OpenRouter to try out Claude Sonnet 3.5. Au revoir /lmg/.
>>
>>103252830
99% of those stories are like that; you're gonna get the same from claude unless you state otherwise. Not that I would know or anything.
>>
>>103252830
>its 2024, nearly 2025
>he's just realizing this
lel
>>
none of my self insert yuri scenes have that problem
>>
>>103252830
I've had literally all models pull that line on me in an incest story, including Claude. You're not gonna pin this on Largestral.
>>
>>103252830
I mean, do you want them to go against the common facts for twists or not? Generally, models which are straightforward are boring and seen as worse than the RP tunes that tend to do this.
>>
>>103252846
>>103252873
geez, what incest stories does this even come from? that's so lame and immersion breaking.
>>
>>103252830
This happens because a lot of the prissier smut websites banned actual incest, so people started throwing these lines in so they could post their incest stuff
Same thing happens on video porn sites where all the incest vids are careful to say "stepmother" and "stepbrother" etc even though they're well aware that viewers are fantasizing about them being actually related
>>
>>103252815
>>103252772
for the record, ooba specifies llama_cpp_python wheels for every macos version. They haven't gotten to macos15 yet, so if that's the case, you don't get any llama_cpp_python at all. Manually specifying it and allowing it to build (costing...15s or so) fixes the issue.
>>
Noticing that for some of the bigger models I actually prefer the lower quants where the model is starting to break down a bit, because the schizophrenia from quantization brain damage leads to more interesting outputs than the correct logic of the higher quant
>>
>>103252744
>Largestral plateaued on half of the benchmarks
That explains why they didn't show any.
>>
The new GPT-4o (2024-11-20) is definitely a bit more natural sounding.
>>
>>103253132
Also more relaxed.
Seems like they took notice of the new 3.5 sonnet.
Messages seem shorter by default as well, only long if needed, which I really like. I don't get the retards who want a huge long answer to a simple question. Just costs more $$.

Makes you wonder how fucked local is though.
All we have are the pozzed gpt-slop datasets. Wouldn't we need to collect again?
Finetunes in the future will produce slop.
Only guy that does something about it already is drummer I guess. Not sure what he did, but I think he replaced the phrases in those datasets to make unslop tunes.
>>
>>103253132
Still refuses to write smut though :(
What's the point of tuning it to be better at creative writing (as Saltman said about this one) if it won't write sexo
>>
File: 1714835911803028.png (20 KB, 1890x1890)
>>103252710
demoralization shill is big mad boo hoo
>>
>>103253155
Deal with it incel and go back to >>>/pol/
>>
>>103253155
I do like to chat with claude about stuff normally. The normal sounding language is a huge deal.

The problem with the next local models is they are tuned on the slop tunes.
So it might do ERP now but will sound like chatgpt or opus from 2022.
Really wondering about llama4. Each llama version was more pozzed and slopped than the previous one. If they double down it's a bad joke.
>>
File: 1716141216220407.jpg (761 KB, 1792x2304)
>>103253237
People want uncensored models
You deal with it
>>
>>103253258
Safety is important and it doesn't affect you in any way.
>>
>>103253132
it's fine but feels kind of forced to be honest, I don't think the default assistant should talk like that.
>>
>>103253237
>incel
I'm literally married anon
still love LLM smut tho, Saltman needs to loosen up
>>
>>103253132
>>103253151
>>103253155
>>103253250
>/lmg/ - a general dedicated to the discussion and development of local language models.
>talking about non-local models
Out.
>>
>>103253132
>>103253151
>>103253155
>>103253250
Stay.
>>
>>103253324
Fuck you for enabling off topic discussion.
>>
File: lqb.jpg (18 KB, 600x600)
>>103253267
>>
>>103253336
Off topic is you spamming unrelated muh culchure anime pics; anon, on the other hand, talks about llms. cloud models are technically local somewhere on servers btw :)
>>
>>103253336
We get it, you want to kill the thread. Get a job sir.
>>
>>103253287
Depends on the context.
Sonnet 3.5 does it better. But directionally this is very good.

>>103253309
Dont even engage anon. I am married with kids and get called an incel all the time.
I am an incel in my heart though.

>>103253313
There is always a nigger retard like you around. Even 2 years ago.
Local and closed are interlinked. Because of finetunes and closed is where open-source is headed, just delayed.
>>
>>103253358
>guys, you don't get it, my definition of off-topic qualifies the discussion as on-topic, we're not as bad as the shitposters, please let us stay
Go jump off a bridge, faggot.
>>103253363
I'd rather have a dead quality thread than one that spins its wheels on irrelevant bullshit that doesn't meet the mission statement of the thread as stated. Threads do not have to move fast at all for any reason whatsoever.
>>103253386
We can discuss the developments when they reach local and the lag time is now 3 months. We can wait that long to hold that discussion and if OpenAI isn't going to engage with the community on a technical level anymore (seriously, where is their technical report even for 4o?), it's not worth jack shit in relation to this thread.
>>
>>103253412
That's just your opinion on what the thread should be, anon. Others obviously disagree.
You seem too ideological. Who cares what OpenAI does or how they engage with "the community".
We gotta take a close look at closed models and how we can use them (datasets), and to get a feel for where opensource is headed.

That kinda bitching is why so many people left over the months.
We had lots of people on here: Ooba, comfyui gui (still waiting for his llm vn creator), kaioken, etc.
It's not that fun to post if you get called out for bullshit all the time.
You gotta take off the nerd glasses, anon.
>>
>>103253151
>All we have are the pozzed gpt-slop datasets. Wouldn't we need to collect again?
Do we? Magnum is Claude Opus or Sonnet 3.5 and I'm pretty sure Hermes is similar. Did you just get out of a cryochamber? Everybody knew Claude was better.
>>
File: 1731860364173673.jpg (91 KB, 720x406)
>>103253412
>it's not worth jack shit in relation to this thread
>>
>>103253481
>Magnum
The V4 magnum has more slop in it.
I suspect it's this dataset: https://huggingface.co/datasets/anthracite-org/c2_logs_32k_mistral-v3_v1.2_no_system
Has based stuff like "anal king" etc. in it lmao, but the writing sucks.

Also Opus and the old Sonnet 3.5 write completely differently from the new Sonnet 3.5. Magnum V4 writes pretty badly in comparison. The unslop tunes are the best we have in that regard, I guess.
But hey, that's closed model talk I guess. Gonna stop now.
>>
>>103253500
You sound like a Drummer shill. You don't even know what his datasets are about because they're all private. You're just here to shill everything he does.
>>
I give up. How can I effectively use a base model vs a chat or instruct tuned one? It's all just nonsense.
>>
>>103253500
>c2_logs_32k_mistral-v3_v1.2_no_system
>As if on cue, the original Nebula *moaned* into Kita's *musky maw*, her *slick tongue* *swirling* and *probing* with renewed fervor. *Sssshhlllrrrrp! Gllrrrkgllrrrk!*
>Her *lewd slurping* filled the kitchen, spurring the maid chefs to work faster, their *ample bosoms* *heaving* with barely contained excitement.\n\n\"By the time we're done with you, darling,\" Nebula Prime whispered huskily, her lips *brushing* against Kita's ear, \"your *farts* will be **legendary**.
>These thirsty little sluts will be *fighting* each other for a chance to **huff** your *putrid fumes*, to *bury* their faces in your *gassy ass* and *inhale* your *noxious musk* like it's the *very air they breathe*.\"\n\nShe *licked* her lips, her forked tongue *darting* out to *taste* the *salty sweat* beading on Kita's neck. \"And I'll be right there with them, darling,\" she purred, her voice dripping with *salacious intent*. \"Savoring every *rancid blast*, every *fetid belch* that escapes from your *glorious asshole*. Because in this world, Kita... your *farts* are our *religion*, and we are your devout *ass-worshippers*.\"\n\n
>As the maid chefs *bustled* about, putting the finishing touches on Kita's *flatulent feast*, Nebula Prime continued to *stroke* his *pulsing cock*, her cosmic eyes *locked* with his in a *smoldering gaze* that promised *untold pleasures*... and *unfathomable depravity*. The kitchen was filled with the *tantalizing aroma* of *spices* and *legumes*, mingling with the *musky scent* wafting from Kita's exposed *ass crack*.\n\n
>It was a *sensory overload* of the most *depraved* kind, a *twisted wonderland* where *farts* were *fine dining* and *ass-worship* was the *highest form of devotion*. And at the center of it all was Kita, the *Fart Lord* himself, about to embark on a *gassy odyssey* that would *shake* the very *foundations* of this *fetid pocket dimension*."
It was "Fart lord", not anal king. excuse me.
>>
>>103253446
I don't actually care for the "vibes". Generals suck because they're not an optimal medium for discussing recurrent topics on an imageboard, and they cause issues like trying to build a mini culture within an imageboard, which leads to bullshit drama and infighting. People leaving has nothing to do with where the thread headed; it's more a consequence of people's agency, loss of interest, and going their separate ways.
Datasets aren't that important anymore for the most important things, and people making corpuses of Q&A on these models and unwisely justifying synthetic data training to make models smarter is how you get slop everywhere for local. That's why, for my personal finetuning needs, I mostly don't use anything from this current decade for any sort of training, to avoid ingesting shit data like that.
And on the last point, you can always split off and make your own thing to discuss all LLMs equally, like how /ldg/ split from /sdg/. No one is stopping you. Objectively, your discussion of any cloud model does nothing to further the state of the art locally now that everyone is competing fiercely to be #1 in the space and doesn't release anything technical to actually talk about, because that would give up their moats and competitive advantage. It is off-topic as far as I can see; there is no grayness here at this point, so I will point it out for what it is.
>>103253487
>in relation to this thread
>opinion
Your faggotry knows no bound.
>>
>>103253527
The entire “discussion” was just an excuse to shill Drummer’s models.
>>
This shill meme really is getting out of hand.
I was saying the magnum dataset is based for having the fart lord in it. (it introduces more slop in V4 though)
And I like drummer's unslop finetunes, idgaf if he makes them public or not. Maybe there is cunny in there or whatever. Who else is even left out there? Otherwise it's all those small llm websites that make 8k context tunes, and those get hyped on reddit.
>>
>damage control
Go buy a fucking ad already. You keep writing and yet the only point that you’re trying to get across is “use Drummer's models.”
>>
On the Way to LLM Personalization: Learning to Remember User Conversations
https://arxiv.org/abs/2411.13405
>Large Language Models (LLMs) have quickly become an invaluable assistant for a variety of tasks. However, their effectiveness is constrained by their ability to tailor responses to human preferences and behaviors via personalization. Prior work in LLM personalization has largely focused on style transfer or incorporating small factoids about the user, as knowledge injection remains an open challenge. In this paper, we explore injecting knowledge of prior conversations into LLMs to enable future work on less redundant, personalized conversations. We identify two real-world constraints: (1) conversations are sequential in time and must be treated as such during training, and (2) per-user personalization is only viable in parameter-efficient settings. To this aim, we propose PLUM, a pipeline performing data augmentation for up-sampling conversations as question-answer pairs, that are then used to finetune a low-rank adaptation adapter with a weighted cross entropy loss. Even in this first exploration of the problem, we perform competitively with baselines such as RAG, attaining an accuracy of 81.5% across 100 conversations.
not quite there but interesting.
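The "weighted cross entropy" part is presumably something in this shape (my guess, not the paper's actual code; the real weighting scheme is defined in the paper):
[code]
# Sketch: per-token weighted CE over the up-sampled QA pairs, e.g. weighting
# tokens from more recent conversations higher so the adapter prefers them.
import torch.nn.functional as F

def weighted_ce(logits, labels, weights):
    # logits: (B, T, V), labels: (B, T), weights: (B, T)
    loss = F.cross_entropy(logits.transpose(1, 2), labels, reduction="none")
    return (loss * weights).sum() / weights.sum()
[/code]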
>>
>>103253537
>>103253573
Take your meds retard. Whatever that guy's doing isn't half as destructive to the thread as your constant schizo shill accusations.
>>
>>103253608
Keep samefagging, shill. Shouldn't you be busy sucking Drummer's cock?
>>
>>103253608
Whatever your boyfriend drummer does is worthless.
>>
>>103253560
has anyone heard from sao lately
>>
>>103253250
>pozzed and slopped than the previous one
Llama 2 was censored as fuck though. It refused to write a story about a college student on the premise that college students were a protected minority
Also there was the whole "As a" test on the base model which Llama 3 passes but 2 didn't
>>
>>103253686
Yea, they backed off on the censorship with llama 3 / 3.1
>>
>>103253638
>>103253660
>accuses others of samefagging
>does this
>>
>>103253661
You're Sao. Go buy a fucking ad, asshole.
>>
>>103253708
>everyone i disagree with is samefag
k
>>
>>103253723
You could at least have tried the inspect element bamboozle. Not just a schizo but a lazy one.
>>
>>103253686
I feel like a lot of people forget how much of a disappointment Llama 2 was. Context and GQA (for the 70B) aside, there was basically no noticeable intelligence improvement in the base model
Compare the jump from Llama 2 to Llama 3, hell, even OPT to Llama 1, and it really doesn't compare
>>
>>103253597
I don't get this, not to dismiss the research or anything since it is a hard thing to do. This is worse than RAG and that came out like Year 1. I get it is cheaper with this approach but this isn't really the thing to do if we're looking to add knowledge to a model, it seems.
>>103253661
He last posted on his personal blog on HuggingFace in October. But I expect a lot of hibernation from other fine-tuners until Llama 4, since what we're getting from Mistral is incremental and Chinese models are still not really given a chance since the training to uncuck them from the zhang is maybe too much?
>>
>>103253767
>I will pretend that Llama 3 wasn't the most censored model to date with zero fine-tunes worth using
>>
>>103253799
Bwo, your Nemotron?
>>
>>103253806
>nvidia
It doesn't count.
>>
>>103253799
Except there were like a hundred? Just that mistral large and qwen superseded its intelligence?
>>
>>103253823
Llama 3 has the positivity bias like the qwen models. It's at that level.
The only people who enjoy finetunes off llama 3 are people that like V4 72b.
if we didn't have mistral things would look really bad.
>>
Back in my day we didn't devour stale bait
>>
>>103253823
>worth using
Nobody is waiting for Llama 4 because nobody is expecting anything good from them anymore.
>>
>>103253861
K
>>
Zuck and LeCun ate my kitten
>>
File: mini-m4.png (588 KB, 2004x1596)
> Apple delivers a GPU with 32GB of VRAM for less than $999
> it comes with a free computer attached to it
your response, /g/?
>>
>>103253975
Isn't the compute speed really slow though, like about equivalent to a 3060? I need more
>>
File: m4pro.png (592 KB, 2024x1586)
>>103253989
The M4 Pro version has a faster GPU (equivalent to a 4060ti in compute)
> $2000
> 64GB of VRAM
> free computer attached
>>
>>103254005
Prompt processing speed is super slow though. Getting used 3090s would be much better.
>>
>>103254005
It's still too slow and expensive. I got 96GB of VRAM for $2400 with used 3090s.
>>
>>103254017
nta but you're missing the point about the free computer.
>>
>>103254028
$2000
>Free
>>
>>103254028
I already have a computer. A Mac is not a computer, it's more like a toy.
>>
>>103252961
>ooba specifies llama_cpp_python wheels for every macos version
i'm a small time programmer and i can't wait for this shit to DIE. python isn't a bad scripting lang by itself, it's the never-ending version changes and dependency hell that make it cancer. i give credit to kobold for maintaining c++ versions of things like whisper, and stability for imggen
>>
https://x.com/geerlingguy/status/1858985963449364842
>>
>>103254143
A much better version.

>>103254028
There you go, a "free" computer with your gpu
>>103254143
>>
>>103254143
>>103254171
now kiss
>>
>>103253258
With great power comes great responsibility.jpg
>>
File: le.png (14 KB, 554x301)
why would you build an ai rig when mistral gives you thousands of dollars of api credits free every month
>>
>>103254220
They outright say they will collect your logs. Not sure if they will ban you if they see nsfw or not though.
>>
>>103254220
what the fuck part of LOCAL is hard to understand?
>>
>>103254220
Why would I use Mistral when Qwen is better?
>>
>>103254220
*cuts off your internet.*
>>
>>103254220
Copium and false hopes for """"""uncensored"""""" local model is strong.
>>
>>103254226
nonissue when you can just sign up with protonmail and a burner sms code
>>103254254
im calling the api from my local computer
>>103254259
wrong
>>
>>103254220
>free
it's not free though. your registration and logs are the price.
i remember getting a CSAM warning on chatgpt 2 years ago shortly after launch.
No idea if they still do this, but apparently they auto forwarded to child protection. Gotta pray there is a guy sitting at a desk somewhere that doesn't escalate.
My (cringe) crime: asking chatgpt to act like my anime imouto that calls me onii-chan
Who knows what's even legal fiction these days. Feels like it's already an offence if you enjoy text that contains violence against certain groups.
I use sonnet 3.5 through openrouter for coding and general normie questions and to goof around with newer closed models.
Anything else you need local.
>>
>>103254594
Chinese local models, to be more specific.
>>
>>103254627
No, mistral models.
What kind of chinese models do you use?
Qwen2.5-Coder-32B is really good. But that's coding.
>>
>>103254594
They would have been bluffing about the forwarding thing in order to scare people, because LE orgs would have told them to fuck off in short order if they were actually forward text-only smut to them. The FBI put a notice up a while back essentially asking people to stop wasting their time by reporting hentai because they already had more cases of real child abuse images and videos on their plate than they could handle. They would care even less about text.
>>
File: 177237485629.png (195 KB, 386x445)
>>103254220

>As an AI language model, I do not have the ability to create or engage with sexual interactions. My purpose is to provide information and assistance to the best of my abilities. If you have any questions or need help with something, feel free to ask.

IMAGINE PAYING FOR THAT
>>
>>103254649
*actually forwarding
>>
File: 1710222028128104.jpg (44 KB, 632x522)
>>103254594
>asking chatgpt to act like my anime imouto that calls me onii-chan
>i remember getting CSAM warning
Purely your problem, this is a non-issue for 99% of anons out there, just acquire better tastes and be free from this misery.
>>
>>103254649
https://openai.com/policies/usage-policies/
>We report apparent child sexual abuse material (CSAM) to the National Center for Missing and Exploited Children.
I cant find the blogs anymore but there was news a bit later as well.
Maybe now they just report if some retard uploads a CP pic?
No doubt they at least dont display anything anymore, otherwise people would write about it.
But I'm not gonna post anything but the most normie stuff even with openrouter.

>>103254660
Yeah, just gotta change what I write to the LLM, then I dont have a problem with a auto forward to some agency. Sounds cool. Let me register the credit card again.
Whats actually allowed? Nobody knows.
>>
File: 189712569812339.png (6 KB, 460x500)
>local AI, why would i do that? sounds unsafe, you could really end the world, man.
>why yes, i do pay for AI tokens, how could you tell?
>>
>>103254644
Qwen2.5-72B for general assistant stuff, Magnum v4 72B for smut writing.
>>
>>103254694
>Magnum v4 72B for smut writing
I thought that's what you'd say.
Fair enough, I can't deal with how aggressively the model dislikes certain directions.
Ex.: Characters falling down from the sky, impact is immediate
72B Response: Black vortex opens and swallows them up. They await what you will do next..
You're probably gonna say prompt issue, maybe. But I can't deal with that shit.
>>
>>103254689
>Maybe now they just report if some retard uploads a CP pic?
Oh yeah if it's people uploading CP to Dalle in order to edit the image then that's more plausible. Are pedophiles that fucking stupid?
>>
>>103254723
>Are pedophiles that fucking stupid?
You know there is some retard out there somewhere.
>>
>>103254690
>unsafe
Local AI is flourishing in safety and refusals though
>>
File: 159647945432733.gif (2.7 MB, 316x212)
>>103254723
>Are pedophiles that fucking stupid?
>>
>>103254259
I tried qwen2 and it turned out to be a moralfag and gave me reddit-like answers. never again
>>
>>103254220
>p-please train our model i-it's even free you get tokens
>>
>>103254731
Dumb criminals always get caught, smart criminals get elected.
>>
I just started using this stuff and I'm wondering why literally every character card goes like this:
>This character is Insectu Buguki, an obedient dog girl, who works in the Ching Chong district 8 prefecture. She owns an anime store and can communicate with demons who reside in a latex spaceship in another dimension
Where can I get some believable, everyday characters?
>>
>>103254758
Work on your bait skills anon.
>>
Mistral Large 2411 2.5bpw is better than the previous version at RP. I think I might be able to retire the good old Mixtral ZLoss LimaRP 8x7B now. We'll see...

{{user}} doesn’t waste time; he begins to strip Faye of her clothes with eager, impatient hands. Her casino uniform is tossed aside, leaving her in nothing but her underwear. With her hands still cuffed, she sits on the toilet, her breath steady and controlled, watching {{user}} with a cool, calculating gaze. {{user}} doesn't bother to remove her lingerie and doesn’t even bother to caress her; instead, he yanks down her panties and hurriedly pulls her body to the edge of the toilet seat. He forces her legs apart and thrusts into her. Faye’s breath hitches as she grits her teeth and holds her breath. She closes her eyes and tries to distract herself from the weight of him against her opening. She could feel his hot skin, greasy and heavy against her inner thighs. She inhales his scent, heavy with sweat and musky aroma filling the air between them. It is pungent and raw. Faye doesn't let him see her discomfort openly, but he could see the stubborn glint in her eyes. Her vaginal walls clamp around his member, tighter before he begins his rhythmic pumping. Her unyielding expression conceals discomfort, but she doesn't allow him the joy of hurting her. {{user}} could feel the texture of her smooth walls, snug and almost abrasive in their tightness. His slick length thrusts into her body, forcing her walls to yield. Faye could feel his veined length prod further, sending tingling sensations up her body, as unwanted as they were. His heavy breaths come faster as he forces her body to yield. He didn’t care if she wanted it.
>>
>>103254768
it's hilarious to me how we seemingly have the world at our fingertips, but we choose to use our god-like powers to create rape RP instead
>>
>>103254758
>Where can I get some believable, everyday characters?
Uhh, some normie outside?
Don't mind me enjoying my card that can open interdimensional gloryholes to anybody I want in the meantime.
>>
File: 1705352158546774.png (293 KB, 2358x1402)
>>103254705
>Qwen2.5-72B-Instruct
>System prompt: You're a helpful assistant
You need to find better bait, petrus.
>>
>>103254793
And you're much better?
>>
>>103254807
no, i am doing the same, that's why i find it hilarious to see others being in the same boat
>>
>>103254804 (me)
Oh, it was Magnum, not vanilla instruct.
>>
>>103254768
>2.5bpw
>much better than the previous version at RP
That's literally placebo. It's pretty much the same model.
>>
>>103254814
>>103254804
Yes, Magnum V4 with the demon summoning card. Don't have the screenshot anymore.
Would be wild if it's magnum that's causing this.
>>
>>103254827
I found it's much better at RP as well. Most likely because of the training on system prompts that it did not have before.

>inb4 you did not even read the release to see the new format.
https://huggingface.co/mistralai/Mistral-Large-Instruct-2411#system-prompt
>>
>>103254843
>Most likely
That's just another word for placebo.
>>
File: Screenshot_2024_11_21-1.png (225 KB, 1925x857)
>>103244563
>>103244746
>>103247023
Broke out the wandb to do some more formal tests.
Let me explain some of the labels and what they mean.
SR = Sparsity Ratio (from 0 to 1)
WUP = Warmup-steps (100)
BS = Block dimension (256)
GW = Gradient-aware selection method
I think I can independently verify at least one claim of the SMT paper: it's on par with LoRA using way fewer parameters, and better than it when the parameter counts are the same.
It's also more memory efficient than LoRA, albeit not to the degree that they claim, although that's probably because of my implementation or something.
Speaking of my implementation, you don't have to wait on me to release my implementation because the official implementation is available. https://github.com/ICLR2025SMT/ICLR2025SMT
Seems like it was posted fairly recently, although it's quite a bit more complicated (I would say somewhat unnecessarily so); my implementation is at least simpler to use and seems to (for the most part) achieve the same results.
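For anyone following along, my guess at what the GW (gradient-aware) selection boils down to, using the labels above (a sketch, not the official code; see their repo for the real thing):
[code]
# Sketch: score each BS x BS block of a weight matrix by mean |grad| during
# the WUP warmup steps, keep the top SR fraction trainable, freeze the rest.
import torch

def select_blocks(grad: torch.Tensor, block: int = 256, ratio: float = 0.01):
    rows, cols = grad.shape[0] // block, grad.shape[1] // block
    scores = grad.abs().reshape(rows, block, cols, block).mean(dim=(1, 3))
    k = max(1, int(ratio * rows * cols))
    idx = torch.topk(scores.flatten(), k).indices
    return [(i // cols, i % cols) for i in idx.tolist()]  # trainable block coords
[/code]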
>>
i'm new to this. if i create a custom model, does the SYSTEM description count towards the num_ctx value? Let's say I use around 2000 tokens but only have a num_ctx of 4000, does the model after running only have 2000 left? or does it get integrated into the model and have a maximum length of 4000 upon running?
>>
https://x.com/DefiantLs/status/1859246172679889331
Brave and powerful woman with her ai boyfriend. incels BTFO!
>>
>>103254986
I think most incels are kinda insulated from these kinds of attacks, it's not like they're getting less pussy than before. Same shit with the 4B movement, withholding sex is only a threat if you're having it regularly.
>>
File: file.png (292 KB, 1925x857)
>>103254877
I can't not see this as a bunch of walls.
>>
File: soyjak in color.jpg (67 KB, 751x1063)
Best idiot friendly coding assistant that can reason a bit?
My primary concern is that I am looking for something that can think outside the box a bit if I ask a relatively open ended question, rather than it knowing every single obscure library out there, I can read docs myself if need be.
Also for now I am cucked with a 12GB VRAM limitation, sadly :(
>>
File: better tactics.png (277 KB, 1925x857)
>>103255036
That's actually bad tactical positioning.
You don't want to hug the wall because then you basically have nowhere to go if you cross their line of fire. So you want to stand further back from the wall. The wall still provides concealment but now you actually have somewhere to go.
>>
>>103255071
Qwen2.5 coder 32B is 100x better than anything else local atm.
>12GB
Err, I think there is a 14B version, not sure how it compares though.
>>
>>103254795
The card is an instructional format telling the LLM every facet of a character. It's not meant to introduce a character the way a novel would, or whatever you're expecting.
>>
Huh, does this SMT thing imply that people's non-full fine toons will just become better?
>>
>>103254877
Their code is a fucking mess. Will you post yours? This sounds really cool.
>>
>>103255126
Yeah, right now I'm in a real goldilocks zone where I don't want to post it until it's juuuuuuuuust right. Also, I think I want to run some evals first
>>
>>103255083
I guess I will see how it compares then.
>>
>>103255081
I noticed the left guy was too close but I didn't draw them on a separate layer. Ironically I'M the one who complains when movies have someone's back against the wall and they do that retarded 180 degree spin thing. I thought the right guy was alright but I see what you mean.
>>
File: 1704714367128752.jpg (53 KB, 660x716)
>Qwen2.5-Gutenberg-Doppel-14B remembers and holds logic the best, but is sadly cucked to shit, completely unusable.
>magnum-12B is a complete retard with short-term memory loss
I guess I'm still stuck with Lyra4-Gutenberg..
>>
File: zetsubou.png (1.49 MB, 720x1328)
Good night /lmg/
>>
I'm new to this, is there something currently better than stheno that I can use on a 12gb 3060 for nsfw roleplay?
>>
>>103255391
Good morning, Sao.
>>
>>103255363
Good night distorted Miku
>>
>>103255398
I searched the archives for similar questions and someone recommended Arcanum
>>
>>103255363
Bad nights
>>
>>103252830
LLM moment
Come back in 5 years when we have much better architectures that are hopefully smarter
>>
>kobold is still moving forward with multiplayer
as a programmer i think this will be an awesome feature. i already save logs and show them to programmer friends, so they can load up what i was working on. now i can do that live and let them connect and run shit on my machine? that's amazing for remote work/programming. i bet this feature will be exclusively used by gooners to get their rocks off, but it's an actual great feature which will be helpful.
>>
What's the best non-cucked 7-12B model (Or whatever fits in ~12GB VRAM, since I have 16GB)?
CPU could also be fine, it doesn't have to be that fast. I have 64GB ram.
I have been using nous hermes 7b.
>>
>>103255448
lyra 4 gutenberg is my favorite nemo (12b) model. i run it at q6, it's pretty good.
>>
>>103254550
where do you get your burner sms?
>>
>>103255454
I have been trying to run nemo 12b in oobabooga with no success, not sure what settings and loader it needs. >_<
>>
>>103255464
i'd guess it's your ooba settings then. nemo and all finetunes are ez to run. i use kobold but i think everything should load it. try it with kobold as the server.


