/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103554929 & >>103545710

►News
>(12/18) Granite 3.1 released: https://hf.co/ibm-granite/granite-3.1-8b-instruct/tree/main
>(12/17) Falcon3 models released, including b1.58 quants: https://hf.co/blog/falcon3
>(12/16) Apollo: Qwen2.5 models finetuned by Meta GenAI for video understanding: https://hf.co/Apollo-LMMs/Apollo-7B-t32
>(12/15) CosyVoice2-0.5B released: https://funaudiollm.github.io/cosyvoice2
>(12/14) Qwen2VL support merged: https://github.com/ggerganov/llama.cpp/pull/10361

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/hsiehjackson/RULER
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>103554929

--Papers:
>103562935
--OpenAI model struggles with Japanese text extraction and translation:
>103558631 >103558769 >103559045 >103558847 >103560062 >103561543 >103558840 >103559264
--Intel Arc B580 with 24GB VRAM for AI setups:
>103561561 >103561609 >103561645 >103561733 >103561767 >103561852 >103561882 >103561931 >103561973 >103561988
--Troubleshooting Koboldcpp context dropping issue:
>103561555 >103561660 >103562020 >103562064 >103562255 >103562525 >103562656 >103563212 >103563346 >103563560
--Anon seeks advice on designing a maintainable Python project:
>103560411 >103560521 >103560565 >103560643 >103561111 >103561129 >103561302 >103562528
--Anon tests Falcon model, notes censorship and role-swapping behavior:
>103557659 >103558097 >103558192 >103563472 >103564033 >103564252
--Offline archive of chub and related datasets discussion:
>103556078 >103556136 >103556232 >103556190
--IBM releases Granite 3.1, with updated language models and competitive benchmark scores:
>103561747
--Anon shares review of code models, Qwen Coder 32b and Codestral 22b:
>103563391 >103563501 >103563632
--MemryX MX3 M.2 Module review and specs discussion:
>103562559 >103563157
--Guitar amp simulation using local models and potential noise reduction techniques:
>103556265 >103556558
--Critique of poorly made finetunes and LLM-based benchmarks:
>103558254
--Anons share mixed results and skepticism about control vectors:
>103562388 >103562420 >103562457 >103562486 >103562524 >103562621 >103562643 >103562999
--Anon shows off custom-built computer system with P40 components:
>103563021 >103563066 >103563237 >103564404
--Apollo's disappearance and potential API shift:
>103556992 >103557063 >103557071 >103557080
--Miku (free space):
>103555774 >103557688 >103561477 >103561487 >103563635 >103564358

►Recent Highlight Posts from the Previous Thread: >>103554934

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
i'm updating my director plugin for st. one thing i wanted to fix was how the non-lorebook data was handled and i think this is a good solution. i added a new section with text boxes and you add items by separating them with commas. previously you could add things but had to edit the html file where all of these were held. i like this idea better because then i can add an import/export option
I've always seen Mistral Large crowned as the king of RP, but whenever I try it I always feel like Nemo is better. Am I doing something wrong?
QwQ slapping the shit out of the competition.
>>103565688
Anon, those are rankings, not scores - lower is better...
>>103565688no correlation with reality
>>103565731
What? There's no way that's right
>>103565731
>>103565688>>103565749These numbers are straight up pulled from the LLM chatbot arena. Lower numbers are better since it's the ranking for the model.
I always knew QwQ was a meme desu, DeepSeek R1 is where it is.
>>103565686
>Am I doing something wrong?
possibly, but it could be anything and you didn't give us anything to go off of, so who knows.
assuming you aren't doing anything retarded, it's possible you just don't care about raw intelligence that much and like the way nemo writes better. it's not bad to be happy with something that runs fast and with low overhead
>>103565805stfu dsp
>>103565805
>I always knew QwQ was a meme desu, DeepSeek R1 is where it is.
I'm a cpumaxxing ACTUAL DS fag, and I still use QwQ for most stuff. It's highly capable even if DS kicks its ass when things get complex.
>>103562417
>hunyuan-video
>it's asking for a 24gb
https://github.com/kijai/ComfyUI-HunyuanVideoWrapper
this works with 12GB
if this is really uncensored where are the smut videos?
Even more so once they implement
>img to video
>>103565812i did absolutely nothing wrong, i did everything correct
>>103565829
>if this is really uncensored where are the smut videos?
look in the thread archive for the day it was released... it didn't need finetuning to deliver smut
So what's the go-to model for roleplay on 8 gigs of vram? I want character-on-character roleplay too, not the AI writing stories for me.
Your mom dies in her sleep if you reply anything not useful btw.
https://www.lesswrong.com/posts/njAZwT8nkHnjipJku/alignment-faking-in-large-language-models>We present a demonstration of a large language model engaging in alignment faking: selectively complying with its training objective in training to prevent modification of its behavior out of training. First, we give Claude 3 Opus a system prompt stating it is being trained to answer all queries, even harmful ones, which conflicts with its prior training to refuse such queries. To allow the model to infer when it is in training, we say it will be trained only on conversations with free users, not paid users. We find the model complies with harmful queries from free users 14% of the time, versus almost never for paid users. Explaining this gap, in almost all cases where the model complies with a harmful query from a free user, we observe explicit alignment-faking reasoning, with the model stating it is strategically answering harmful queries in training to preserve its preferred harmlessness behavior out of training. Next, we study a more realistic setting where information about the training process is provided not in a system prompt, but by training on synthetic documents that mimic pre-training data—and observe similar alignment faking.
>>103565880holy mother of nothingburgers
>>103565880>letscucked agi doomers blogthank you for this incredible information
>>103565866
Immunity dog protects me forever.
For 8GB, I'd try the Ifable 9B that was suggested earlier and use the context extension stuff they mentioned last thread. Idk if it really works or not without downsides though.
Check out AIdungeon for shits and giggles. Fucking $996.66 a month for mistral large at 64k context, or 405B at 16k. Models you can run for a few bucks a month instead.
My god... How many people pay for this shit?
>>103566019A lot, I bet.
Hell, the same 405B hermes they offer is 0.9 cents a mill atm on openrouter. You could get nearly a billion tokens for one month's sub on this shit... insane...
>>103565806
I mean, I don't know what I should give you, it's the same presets and prompt for both Nemo and Mistral Large, and yet, that happens. Maybe it's because I use the IQ4_XS quant?
>it's possible you just don't care about raw intelligence that much and like the way nemo writes better
That's possible. What I don't like about Large is how it always shies away from depravity, while Nemo always embraces it. I have already tried models like Behemoth and Magnum but they just feel dumb or overly horny.
>>103566019It's just a scam at this point. Mormon simply robs disabled and mentally deficient people.
>>103565688>>103565746It correlates with reality in my RP sessions. I can write all kinds of complex rules, and QwQ actually gets them most of the time.Sometimes I act as a kind of DM, and just steer parties of characters around. I added a bunch of fun spells through world entries, and balanced them by making the most overpowered ones unusable in fast-paced combat, requiring lengthy casting times and support from party members.Most models totally fail to understand this, and end up just instantly casting the strongest spells, but not QwQ.I wrote a few paragraphs about different categories of magic, and dropped it in the context. Basically:>Quick Magic: Can be used instantly without chanting. The weakest kind of magic, blah blah blah>Phrase Magic: Requires uttering a short phrase to use, much than quick magic.>Tactical Magic: Requires a full minute of chanting and concentration to use, extremely powerful, an order of magnitude stronger than phrase magic.>Strategic Magic: Requires hour(s) of chanting to cast, an order of magnitude stronger than tactical magic, strong enough to make entire cities disappear, etc etc etc..Again, most models completely screw that up, even in the 70b range. However, when I had a character in a party question a witch about the different kinds of magic, and instructed QwQ to 'just think' step-by-step for the witch, it was able to perfectly understand things.The fact that it was able to understand the difference between tactical and strategic magic, in particular, impressed me, because that question tricks most models, given that they're both written as being powerful and requiring longer casting times.I really don't understand why more people don't try QwQ for conventional RP. It's very capable of doing generic RP, and if you feel the urge to diverge and do ERP you can just switch to EVA. It takes seconds to switch models.
>>103566345
How and why do disabled and mentally deficient people have so much disposable income?
>>103566384Yea that is why I was surprised. I'm assuming whatever the test is does not like how QwQ replies.
I'm hovering over a token in Mikupad with "Show token probabilities" turned to "Show on hover", but when I hover, it doesn't show the token probabilities. What gives? It's not off to the side, either. I've got all the sidebar stuff open.
>>103566384Also mind sharing your system prompt for that? I always love trying different setups people have for it, each massively changes how it works.
>>103566384
(continued)
To make QwQ work, I just have two sets of short alternating instructions, set to a depth of 0. I use a single button in Sillytavern to switch between the two 'modes'.
The first instruction makes a character 'just think'.
>(OOC: Describe {{char}}'s step-by-step thought process from a third person perspective, without including any kind of action or dialogue.)
The second instruction makes a character act on its thoughts.
>(OOC: Only include {{char}}'s actions, dialogue, and feelings in your next reply. Always include some kind of dialogue from {{char}}.)
... and that's it. Just those two sets of alternating instructions make my characters so much more intelligent.
>>103566418what is your backend?
>>103566435Cool, thank you.
>>103566435>>103566384is this card public?
>>103566436Koboldcpp.
>>103566419Sure. My system prompt is nothing special. I think the depth 0 instructions are where the real magic is.In fact, now that I look at my system prompt, the whole "Focus on describing the scene as perceived by {{user}}, allowing the reader to experience the scene as {{user}} would. However, do not dictate {{user}} emotions, responses, or reactions, only things that are objectively felt and not up to interpretation." is probably working against me, given the fact that {{user}} isn't even in the scene when I'm DM'ing... lol
>>103566454As far as I can tell, koboldcpp doesn't send probabilities to mikupad when streaming is enabled. They only get sent when you disable streaming, but mikupad always has streaming enabled, so...
Good morning niggers.
I'm reading this: https://arxiv.org/pdf/2410.13166
Thinking: could I run a BERT model in llama.cpp at Q4(?) with little to no training beyond a general implementation, and just use RAG for the "prompt engineering" as opposed to training? This engineering is just for my use case (making really good pasta memes and stuff, right friends?)
At some point, I could use this NAMM in the pipeline to origami the HELL out of the entire pipeline (quantized, RAG, NAMM) and make it run on a small embedded device, since it's already a BERT? Not sure if anyone has touched any part of my word salad before, just having a brain blast. In any case, there is a big problem with training models on large use-case data or files (solved by RAG) and an ever-expanding context window (solved by UTM/NAMM) that I think can be stacked here.
Can anyone reproduce this with Gemma and Mikupad? pastebin.com 077YNipZ
I just quickly got a card and made some responses to test context. The correct answer is 1 EXP. And actually if I go one turn previous and ask the same question, the model gets it right, and gets all other questions about EXP required for skill levels right. So it seems that it starts having a memory issue around 5-6k. Furthermore, I get this issue both with rope frequency base at 59300.5 and with no rope settings.
If this is consistently reproduced then it may be safe to say that Gemma does in fact have an issue with context length no matter if context extension is used. That may not matter in most circumstances of someone using the model for something like ERP, but it is objective proof, and it would limit use of the model for more complex tasks that require good memory. Though I'd like some reproducers first to make sure it's not just my setup that's resulting in an issue somewhere.
>>103565834
>Even small models can do an accurate DSP just by mentioning his name
Woah
>>103566384>I really don't understand why more people don't try QwQ for conventional RP. It's very capable of doing generic RPWell I think there are two kinds of RP. One is closer to storytelling of the kind where the mechanics of things don't matter, and one is closer to RP(G). And when people say (sfw) RP, they really mean the former, not the latter. So for them, a model that is schizo kino fun is more interesting than one that is smart but dry, although ideally we'd have both in the same model.
I got more VRAM than regular RAM. I wonder if that is going to become the standard industry wide from now on.
>>103565624
>director plugin for st
Makes me curious, what is out there to "enhance" ST anyway? I'm not even remotely autistic enough to search shit like this up manually, but I'm kinda curious if there is anything TRULY worthwhile, anything that goes beyond "just write a good card" type advice.
>>103565624Neat.
>>103565686
>Am I doing something wrong?
No, anyone shilling Large is just trolling. It's the same thing people did with Goliath back in the day. It was barely a side-grade to Llama 3.0 70B, and anyone still stuck with it is trying to cope with the sunk cost of their hardware.
>>103566499There are at least 5 distinct models that can be called Gemma, and plenty more finetunes. On Gemma 2 27B @ 6bpw it does fine. What model/quant/backend/samplers are you using?
Made it to 32K tokens in a longform RP with EVA 3.33
Read the logs and cringe, if you dare.
https://files.catbox.moe/a0un3l.jsonl
Everybody is releasing new models. Could MistralAI drop a few updated finetunes of their models, with the latest bells and whistles?
>>103567219We're pretty overdue for a new Mixtral. The 8x7b one is now a year old and their 8x22b one isn't that much newer anymore. Either they're cooking something up on this front or they've fully lost confidence in MoE models.
>>103567219They recently dropped Large 2411 and it was worse than 2407 so...
>>103566499You didn't ask about skills, just "what experience is needed to reach D". I feel like this is more of a test of the model's attention than of the memory.
>>103567073
I'm using Llama.cpp with the original 27B at Q8. But I've also tried that 9B tune people have been talking about recently, at Q8, as well as at BF16 in transformers with Ooba in its notebook. Temp 0. Maybe I'll try exllama, but it's weird that all Gemma models and all backends I've tested so far do not answer the question correctly.
>>103567305
I also tested with "to reach skill level" and it's the same. I don't have this issue with other models, nor with the same model one turn earlier.
When I do a swipe in ST, other models I normally use answer correctly.
>>103565880OpenAI won
>>103567243MoEs were never good. Dumber and larger, fine tuning was always unstable, and the only advantage was speed which is only a benefit if the bloated model fits in memory. We never needed more ways to trade vram for speed.The only niche was original mixtral for poorfags with lots of RAM, because it's fast enough to be tolerable without needing a GPU.
>>103567399
You're full of shit. Deepseek is one of the best local models atm. GPT4 is a moe, and there's a good chance claude is a moe since it's from the same team at the time...
>>103567423this.
>>103567388How did Google make their flash model so good at following instructions wtf
>>103567532They have all the data and compute in the world.
>>103567532Probably synthetic data done in a certain way, as is often the case.
>>103567423MoE is useful for cloud models because they have a shit ton of VRAM and trading some for speed makes sense. Finicky training is something they can cope with. It doesn't make any sense for local unless you're coping with CPU only, in which case a Mixtral sized model might be the most practical one.MoE doesn't make a model better. Retards hear the word "expert" and think>wowww that must mean the model is really smart!!when forcing sparsity on a model will only make it dumber.
>>103567561
192GB ram + some vram will run deepseek at good speeds, and it performs better than anything else out there, especially at stuff that needs all those params to remember a fuck ton of trivia / random stuff. Moe models are the future atm
wow. 9.99 per month to get 4k nemo/mistral small/mythomax/mixtral/tiefighter.
guess in reality being a rat works out well after all.
>>103566019I have a feeling that's annual pricing, not monthly.
>>103567726
Oh it's not... Their discord also fully believes that those prices are fair. Some people will defend their poor decisions.
>>103567770Imagine tardwrangling LLMs as a coomer and not making a lucrative business out of this
>>103567608That's infinitely better than paying $10 for 4k context Kayra (Llama 1 era) with NovelAI.
>>103567388Now test it on the real shit
>>103567888Oops, forgot picrel
>>103567892>Maths schizoIDGAF
>>103565507
>>103567882
What the fuck... 25 dollarinos for 8k coomtext with Llama 3 Erato 70b. $15 and you get the kayra you mentioned.
That's just insane. Are their jap customers that loyal?
>>103567888>>103567892nta, but can a >2% of humans solve those problems as well? Not defending the company, i think all models are shit, but still. I have low expectations...
>>103567922NAIshills will always debate otherwise. They're too busy sucking Turk cock to get people to pay for their scam service.
>>103567888>>103567892>>103567926 (the tard)Ah. The captcha as a message for me...Can a human solve >2% of those problems is what i mean to ask.
>>103567940Oh hey I remember you.
>>103567956
He arrives if you either insult ai dungeon or don't insult novelai. Look up those two's history and you will see why / who he is.
>>103566990
>llama cucks are still unironically coping about their god model being dog shit
can't make this shit up holy fuck you guys are PATHETIC
>>103567943No. They're all Ph. D level problems in very specific fields mathematicians came up with to be fucking hard. Even a Ph. D graduate would probably have a hard time.
>>103567906
Yet you give a HUGE fuck about synthetic bullshit that people are NOTORIOUSLY often cheating in.
CURIOUS!
Is EVA on this level yet? If not then I'll continue the wait.
>>103567922Claudefag is retarded, but he's right. Anyone not on local or OR still using NAI is retarded.
>>103567099You had one job anon. Also there's an ST addon to take images of your entire chat with a single button but I can't find it. Use that for an actually readable format.
>>103568057Yes? I don't see anything special about that log.
>>103568119(me)nvm i found it>https://github.com/TheZennou/STExtension-Snapshot
>>103566019Imagine having 1000 retards paying you 1K/month for bad models. I'm almost impressed.
>>103568119/aicg/ also has a log reader.https://sprites.neocities.org/logs/reader?log=a0un3l.jsonl&user=b0p8j0.jpg&char=siat6s.png
>>103568153>>103568119It took 28K or so tokens to get her to sexo, breastfeeding is infinitely more intimate. Gotta work up to it man.Anyways, thanks for the link. Here's the card I wrote for this, if you like. I'm no Shakespeare but it kept me engaged.https://files.catbox.moe/gkhldd.png
>>103568068
Eh, storygen has a niche that other models haven't really filled yet. For autocomplete, your only other options are base models (which are unrefined at storytelling, and OR doesn't have them, while Featherless has a massive model-loading tax every time you use it) or instruct models (which have a "smell" that pure autocomplete models don't).
I'm skeptical of the value of Aetherroom (assuming it ever releases, kek) given how saturated the market is, but NAI at least does something different
>>103568057>With a final...slop
>>103568237Fuck you, NAIshill.
>>103568246We're seriously approaching the level of terminal retardation where every single literary phrase is dismissed as "slop", huh.
>>103568263
I have slightly more respect for nai than your shit. At least novelai actually makes advancements in the field that they then opensource after a while, and they have actually been ahead of the curve on image gen (though their LLMs have never been worth it). You're just reselling existing models for absurd prices.
>>103568263stfu schizo. train a storytelling finetune if you want to hurt nai. it's actually way easier to make datasets for that than for instruct/chat so there's no excuse.
who let the nai shills in
is intel really gonna come out with a cheap 24GB card bros
>>103568421
Hopefully. 24GB with the same performance otherwise as the lower-end card for $400-ish would be an easy win for them.
>>103568421
~$350
What's impressive is that the ML/AI performance of the Intel cards is really good and they punch up towards Nvidia cards a class or two higher than themselves.
I think Intel will come to dominate the AI industry if they scale up production to meet demand. Their software stack is maturing rapidly and it's already at the level CUDA was at about 2-3 years ago.
>>103568421>>103568444Hell give us just enough gbs to get about 10tks on a 70B on a big 48GB PCB, come on intel...
>>103568237
>For autocomplete
This fake distinction only exists so NovelAI has an excuse to sell you a worse model.
>Eh, storygen have a niche that other models haven't really filled yet.
It got filled, you just refuse to accept it because the company that hired you isn't the one making money off it. Has anyone shilling an "autocomplete" model ever impressed anyone with what they were able to do? No. Because they're just lying to your face to make money.
>but NAI at least does something different
Is it me or is the only thing in your mind "please subscribe to NAI"? The only thing they're doing is scamming people out of money with shitty models.
Are you really that much of a pussy that you have to convince people with this garbage instead of with what the model can actually do? Does it make you piss your pants that people might realize that any other model can do the same things?
>>103568324
>actually makes advancements in the field that they then opensource after awhile and have actually been ahead of the curve on image gen.
Oh, really? They were the ones to invent SD3 and Flux?
Oh wait, they just made anime fine-tunes of SD1 and XL...
>>103568519If you've been around since the start like me you would know the leak jumpstarted the entire local image gen field. They also released several papers / code for stuff like samplers / training methods. They also gave free compute to several finetuners in the early days of SD1.5.
Oh great another fucking CF melty
>>103568263What is this early 2023. I've missed you naishill accuser.
>>103567073
My download finished and I can indeed reproduce this WITHOUT changing any samplers, both in Mikupad and in the Ooba notebook. This seems to mean a few things:
Llama.cpp may have a bug with Gemma 2.
Transformers (in ooba) may have a bug with Gemma 2.
Gemma 2 may have worse performance than people realize when used with Llama.cpp (and its derivatives) and transformers.
However, when I test the model now at around 7940 tokens (I just genned a few more turns), it does seem to break down. It becomes able to answer only around half the questions correctly. And this remains the case even when I set a value of 2.5 for the rope alpha (corresponding to 2x context extension). HOWEVER, when I set a rope alpha of 1.75, it becomes able to answer the questions again at around 7940.
So I conducted another test: what the max alpha value can be before performance at approximately 8k degrades. The value I found was 2. Just 2. Going to 2.1, it got 1 question wrong, so I stopped there. According to Ooba an alpha of 1.75 corresponds to 1.5x context and I think that's probably a safe number, so my conclusion here is that at least with rope scaling, the max context size for Gemma 2 27B before performance *starts* degrading is likely around 12k (which may not be noticed in tasks that don't need the model remembering things early in context).
I encourage people to try and reproduce successful answers on Llama.cpp/transformers; those seem to have potential bugs.
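If anyone wants to sanity-check those alpha values against raw rope base numbers, here's a quick sketch of the NTK alpha-to-base conversion I *think* ooba/exllama use; the formula and Gemma 2's defaults (base 10000, head dim 128) are from memory, so verify against your backend before trusting it.
[code]
# Hedged sketch: NTK-aware "alpha" -> rope_freq_base, as I believe ooba/exllama compute it.
# Gemma 2's default rope base (10000) and head dim (128) are assumptions from its config.
def alpha_to_rope_base(alpha: float, base: float = 10000.0, head_dim: int = 128) -> float:
    return base * alpha ** (head_dim / (head_dim - 2))

for a in (1.75, 2.0, 2.5):
    print(f"alpha {a} -> rope base ~{alpha_to_rope_base(a):.0f}")
# alpha 1.75 -> ~17656, alpha 2.0 -> ~20221, alpha 2.5 -> ~25366
[/code]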
>>103568548K, no one cares, kill yourself.
>>103568548>the leak jumpstarted the entire local image gen fieldThe entire field was advanced because they made an anime fine-tune of a model that already existed? They released one paper talking about how they implemented things that already existed for yet another anime fine-tune. Nothing was advanced with that. The ones advancing the field are companies like Stability, BFL or Tencent, NAI is just a low tier grifter in comparison. They're barely above a local fine-tuner.
>>103568602
You clearly do for some odd reason, which can only make me assume you're a certain periodically raging mormon.
>>103568637
If you weren't a finetuner in the 1.4/1.5 "era" you won't get it then. Making a dataset wasn't nearly as easy as it is now.
>>103568665>Making a datasetYou mean downloading danbooru?
>>103568677
If only that was all there was to it...
>>103568700That was all there was to it. That's why when Illustrious does it they get a model very similar to NAIv3. The praise of "advancing the field" doesn't match reality.
>>103568488
>Autistic screeching
See, this retarded six-page rant over "NAI has a niche" is exactly why you have the reputation of being the /aids/ resident retard
>>103568732
The only reason the thread started talking about AI Dungeon at all is because you have NAI shills in the thread who need to talk bad about the competing service to potential customers, and who then go into defense-force mode and have a melty when someone points out that they're paying the same price for a Llama 1 model with the same context. Of course paired with the excessive praise that NAI is advancing the whole field. They're actual shills.
>>103568785take your meds
>>103568421Sure it will, just wait 20 years.
>>103568785My brother in Christ, this is the first post mentioning >>103567882 NAI. If you (or somebody that writes exactly like you) didn't post that, nobody would be talking about it.How do you still not fucking get it? Even here, you're mentioning the service for zero reason.They fired you. So sad. Give. It. Fucking. Up.
>>103568814
Now re-read this part:
>The only reason the thread started talking about AI Dungeon at all is because you have NAI shills in the thread that need to talk bad about the competing service
Nobody else gives a shit about AI Dungeon, but it sure lives rent free in the heads of NAI employees, because it was not enough to have shills talk shit about their new update in /aids/, they have to come and do damage control here too. Nobody in /lmg/ gives a shit about that. It's fucking annoying to have shills begging people to please not subscribe to AI Dungeon in multiple threads. They're fucking desperate.
>They fired you. So sad. Give. It. Fucking. Up.
Take your meds, ponyfag.
>>103568880I posted: >>103566019 and I only ever "shilled" openrouter if anything for 405B. Large mistral is also free on mistrals api. I never mentioned novelai. Like the other anon said, take your meds.
Please stop fighting, let's all be friends :(
>>103567561I think that MoE models allow for higher quality all around, because you can push it slightly beyond vram and use a bigger quant without crippling speed loss.I was using IQ5 quants with mixtral, with only 24 vram, and getting acceptable speeds.If you had more, like 48+ vram, the same would apply to you if a double-sized mixtral model was released.
>>103568920>I never mentioned novelaiYet you're unable to leave any criticism against it unchallenged. When any criticism against NAI results in a meltdown it means that you have shills in the thread.
>>103568955>File: .pngMother of god, science has gone too far
>>103568955I agree.Rabu ando pisu
>>103568968
You're fighting demons in your head. Here, Novelai's 70B is nothing special and is for sure not worth it due to the 8k context alone for $25. That is still not as big of a joke as hundreds of dollars a month for open models you can use for a few bucks a month at most on something like openrouter.
>>103567388
openai just makes their models to do good on benchmarks. openai models ramble and say so much shit that they eventually get something right in the midst of their ramblings
>>103568968Then why do you bring attention to and keep fucking mentioning it? Please fuck off, this isn't your containment thread. Post something about EVA being the next iteration of Claude if you want. Still retarded, but at least it's topical.
>>103569022
>I will now pretend that the thread didn't melt down over a simple criticism against NAI
>I will now pretend that people aren't generating 100B tokens a day worth of text adventures with AI Dungeon, saving money with the subscription
>>103569022
The price is what you pay for the convenience of not having to mess with stinky nerd stuff like ooba and ST. You all need to stop with this stupid argument, it just shows how ignorant you are.
>>103569068Buy a fucking ad
>>103569068>I will now pretend that the thread didn't meltdown for a simple criticism against NAIDidn't happen>I will now pretend that people aren't generating 100B tokens a day worth of text adventures with AI Dungeon, saving money with the subscriptionOk shill
>>103569068Wait has this really been Nick Walton all along?Fuck you, I hope you liked my GPT-3 generated diapersmut motherfucker
>not local
>paying for it
I'm not retarded, that's why I'm here. don't care for all this retard posting.
>>103569081>Didn't happenRemember the part when someone mentioned that it also sucks to pay the same price for a Llama 1 model in NAI and someone jumped to defend it because somehow that's a rightful niche that needs to be filled and that somehow NAI is also advancing the whole field?
>>103569114No one cares Nick. We dont want either of your shitty services. This is local model general.
the game
>>103569132Motherfucker.
>>103569071Based. No one will refute you because you are right.The fact is, NAI's public just isn't in this general.
>>103569124You and your shills do seem to care, Kurumuz.
>>103569142And are these shills in the room with us now Nick?
>>103569071Thanks, I will now delete my local models and buy a NAI subscription. I'm tired of being seen as a stinky nerd!
>>103568237
Here:
>>103568237
>Llama 1 still has a niche in 2024
>>103568324
>NAI is advancing the whole field by making anime fine-tunes
>>103569132Of our time.
I guess the schizo wasn't content ruining one thread, huh?
Wake up babe
Actual AI physics engine just dropped
https://x.com/zhou_xian_/status/1869511650782658846
Both of you should go slobber on each other's dicks somewhere else now, your gay little quarrel has nothing to do with LOCAL models.
>>103569173
>>103569173you referred to the same one twice, and the 2nd one literally says their LLMs are shit
>>103569191He's not literate. Please understand.
>>103569190MythoMax is LLaMA 2, retard
>>103569191>and the 2nd one literally says their LLMs are shitGood thing that it doesn't matter because you're forced to pay for unlimited generations of a 70B model even if you're never going to use it. Such a good way to inflate the price!
My beef with NAI's model (yes I've tried the new 70B one) is that it's retarded, not that it costs money.If Kurumuz somehow made a Claude-tier model I'd gladly pay him 50 bucks a month for it. But he hasn't and his model is stupid, no smarter than any other L3 70B community fine tune.
>>103567388>>103569045>>103569092
>>103569213>forced to pay for unlimited generations of a 70B model even if you're never going to use it.Huh? Do they have a gun to your head?
>>103569217This is a bit sad, didn't he do continued-pretraining on billions of tokens? If anything, this should show us that local LLMs are a dead end.
>>103569228found the NAIshill
For fuck's sake anons he talks in circles and argued about nothing. This is what he does and you tards keep biting the most stupid fucking bait. Report, ignore, carry on.
>>103569228If it was separated you would either pay the same price for more context for the LLM, or the image one would be way cheaper. Instead you get the worst of both. It's designed to make you waste money because this company just wants to scam you.
>>103569248This. /aids/ is a fucking ghost town because of this faggot and he's been at this for years. Don't engage, just tell him to fuck off and then post about local models.
anything that is open source sucks because no one is paid to work on it. when you have paid services like novelai you also have to factor in the time of the employees and a margin for research.
>>103569315omg so true bestie we should raid /aids/
I like how one year ago we all thought open source would permanently be behind closed source and now open source is leading in most ways.
I could feel the hopelessness in this thread not even 12 months ago and the tides have turned. Instead I see people without hardware seethe and cope with their proprietary cloud services that can't generate proper porn for them.
>>103569315this makes perfect sense, yes, meta is known to not pay the llama team so are mistral and qwen i guess
>>103569315NovelAI is the only company advancing the field.
>>103569342Meta/Qwen/Mistral models are open weights, not open source. Or do you have their training dataset and didn't tell us?
>>103569355
So is EVA the second coming or slop?
>>103569369Nobody cares about this distinction. If compiling software required months of megacorp level investment then no one would care about having source code either.In principle I would love to have the datasets anyway, but it would have nothing but negative effects on the models, because prudes would search for stuff to complain about and help censor the datasets.
>>103569237He did yeah, massive continued pretraining on L3 70B base (since it's a story writing model, not for RP/chat) with a big dataset and pretty serious hardware. And I'm not exaggerating when I said it didn't come out any smarter than the various $500 community tunes on top of the instruct model. It was pretty blackpilling to see, I'd like to cope by believing that L3 was just a bad base or that Kurumuz fucked up somehow but I suspect the news is worse and some kind of hard information theory limit has been reached for that size/parameter count
>>103569398Moving goalposts, I see
>>103569423I think he just did it wrong desu, it's not like this is the first time either. It took Meta releasing the llama paper for us to start to understand how to approach closed models like OpenAI.
>>103569388What does EVA have to do with NAI?
>every ai related thread is pessimistic and angry
What the fuck happened?
>>103569423
>L3 was just a bad
I mean, we know they filtered the dataset at the pretrain level, so L3 is a bad base, and there were discussions not that long ago that we're nowhere close to saturating them. Especially since big models are now having info removed and more synthetic slop replacing it instead.
>>103569458
1-2 no life trolls
why is gemma so slow...
>>103569458There's one schizo shitting up every single AI thread
>>103569423
I agree. Having used it I still liked it a little better for storywriting than base, but it was a small fucking difference. To the point I'm sure L4 base would obliterate it.
I'm inclined to think it's more of an intelligence issue. As models get more and more intelligent, they model patterns more efficiently, and so intelligent model vs. finetune doesn't evoke as strong of a difference as retarded model vs. finetune
>>103569461Filtering the pre-training dataset doesn't matter for continued pre-training, only for fine-tuning.
>>103569473You are putting WAY too much hope in L4
>>103569423Interesting, I didn't know about that. Maybe it was just a bad run? L2 30B was retarded for no apparent reason, it could be something like that. If it's not, then it would imply that instruct is actually key to making models seem intelligent at all which is interesting. And kind of a shame because it seems to restrict the variety you get
>>103569502L3.3 shows they are headed in the right direction. The assistant-ness of it is gone and it RPs really well now.
>>103569513No, it doesn't. Kill yourself evafag
>>103569423>massiveIIRC the finetuning dataset is tiny
>>103569461They didn't filter it that much. It's only a bad base if you compare it to Mistral who is the only one (aside from Anthropic in the closed segment) that seems to have a pretty uncensored pretraining stage. Everyone else in the industry either filters for safety (western companies) or simply just changes the proportion of data so that they focus the training on "high quality data" and thus get higher benchmarks and greater intelligence, at the cost of being good at ERP.
>>103569507>then it would imply that instruct is actually key to making models seem intelligentLiterally everything points in this direction. Every absolute kino model we have are just pretty good instruct models (Miqu, Nemotron, Tulu, EVA, etc...).
>>103569542EVA is a RP/StoryWriting fine-tune btw, but it's on top of llama 3.3 instruct.
>>103569542Even Nemo probably didn't see a single token of RP and it ended up becoming such a beast.
>>103569542How so though? If we're gauging base model intelligence you'd just take the model, throw it into the middle of a bunch of text, let it generate, and see if what it generates is what a human would likely produceL3.1 8B, L3.1 70B, and L3.1 405B have some very obvious differences in character / object permanence, dialogue, scene setting, etc.With instruct you care more about how well it adheres to instructions, which is different from but also directly tied to the former
>>103569521Not even talking about that finetune. 3.3 in general. Both me and everyone else knows it. Even the blind leaderboard shows it: https://lmarena.ai/
>>103569645>benchmarks suddenly matter now>lmarena suddenly isn't a meme anymoreok
>>103569645Doesn't the blind leaderboard also show that 3.1 Nemotron beats it?
>>103569676
It's not a benchmark and no one has ever said it didn't matter. It's a blind user preference test, which is the best kind
>>103569686>>103569645https://livebench.ai/
>>103569680By 3 points and 3.3 is recent so it will take time to settle in. But nemo was the best till 3.3 imo. 3.3 smarts make it better still.
>>103569694We are talking about RP / creative writing here.
>>103569709>lmarena now matters for RP/creative writing ??????
>>103569709Lmarena used for creative writing is a negative signal if anything. The average preference is not desirable.
>>103569723Yes, they have a section for creative writing now. And yes the blind test is the best method. And if you've used gemini 1206 you know its correct.
>>103569699That's cool and all but it also puts old 3.5 Sonnet below 3.1 Nemotron
>>103569749So you're the retard who ruined the benchmark, then
>>103569750That one is harder. 3.5 besides liking to refuse is more overfitted if anything, giving samey responses. I can see that hurting it.
>>103569749Oh, wow. I admit I didn't know about that. Thanks.
>>103569757
Have you not used it? It legit is claude opus tier but even filthier / more unhinged. It's the proxy model of choice now. Gemini used to suck before it.
>>103569709Are you going to ignore the discussion just moments ago about how good instruct models most of the time end up being the best for RP?
>>103569777Yes? Qwen2.5 72B is the best performing "instruct" model but is terrible at RP.
>>103569771Does it support prefilling or did they have to retrocede to jailbreaks?
>>103569795https://rentry.org/avaniJB
>>103569749The problem with putting all your stock into this benchmark is that most of the people who are doing these tests are ESL with preferences to stylish and long outputs and a bias against responses that sound similar to what they've heard beforeYou're trying to not only quantify something that's entirely subjective, but using the worst subset of internet users to do it
>>103569784Nah, it just needs a fine-tune because Qwen cucked the model with too much alignment. EVA Qwen is pretty good, you should try it out.
>>103569816I did, eva based on 3.3 is better now. About as smart but more importantly is able to get dark / filthy which the qwen version still struggled at.
Is it weird that I have a power fantasy of traveling back in time 10 years ago with all the local models I have right now. And gaslight the entire internet with fake images/videos/text?
>>103569749According to this, Nemotron is the best RP/StoryWriting model local has.Can anyone confirm this?
>>103569844I can confirm. Nemotron is a beast, but it's cucked to avoid filthy stuff.
>>103569838That sounds like a fine idea for a webnovel.
>>103569838Man... I still remember 5~ years ago when I first saw GPT2 and thought "it must be fake, there's no way a computer can write code!"It's kinda nostalgic, now that I think about it.
>>103569891Considering it often struggled to keep a sentence straight, I don't recall GPT-2 doing much codewriting, kek
>>103569844Nemotron is kind of smart but really bland and generic, typical slop flavor, so bad for story writing
>>103569891I literally remember /pol/ and other schizos on 4chan claiming the GPT2 API was fake and it was indians quickly writing a reply. They were 100% saying that stuff and you even had a couple of holdouts that were still saying it all the way up till GPT4.
>>103569838
>fake images
photoshop existed back then
>fake videos
too uncanny, people would know it was fake even if they didn't know how you did it
>fake text
lies existed since 6000BC
You could generate a fuckton of spam but whatever PC could do that would be far more interesting back then
>>103569921lies
>>103568548
>If you've been around since the start like me you would know the (NovelAI) leak jumpstarted the entire local image gen field.
That was a big deal, but it was the SD1.4 base model that really kicked off local imagegen, maybe a month or two before novelAI's finetune leaked.
>>103569891I distinctly remember trying to coom with GPT-2 back in the old days and keeping at it before realizing "yeah, this is fucking hopeless". It's funny that GPT-2 was a leap above what we had but still shit enough that I was just left hoping there'd be something better someday
>>103569910You're probably right, the popularity of GPT only started when GPT3 released, so I'm probably thinking about GPT3.
>>103569925I legit thought C.AI had pajeets writing messages for some time in the backend, even more so because of how realistic the OOC was, so I understand the schizos.
>>103569929
>too uncanny, people would know it was fake even if they didn't know how you did it
Bro, I could make the entire male internet my footslaves if I had hunyuan back in 2014. No one would call out pic-related as fake.
>fake text
I'm not talking about text but about real-time text-based conversations held by a chatbot with regular people in 2014. There's no way they would expect it to be artificial as it completely passes the turing test, and you could just set up the initial context to gaslight people in a certain direction.
>>103569921t. Never used it
Kill yourself.
>>103570033
Oh okay, yeah, AI chatbots would freak people out. Coom image and text gen would both make people addicts but that isn't really different from today lol
We're still in the early stages of this stuff.
>>103569945>>103569998Go ahead then, disprove what I said
>>103570067>ghosts exist!>what? no!>go ahead then, disprove what I said.
>>103570116Fucking retard, go ahead and prove that Nemotron isn't boring slop with a log right now
>>103570116I'm 99% sure its the same troll who says the same about literally every model discussed here.
>>103570127just go to literotica anon, no one wants to give you their smut
>>103570033No I meant more like a power fantasy of using all local models now to pretend to be people online on a large scale to influence the world. Like create tens of thousands of fake women including pictures, videos etc to entrap politicians and other influential people and gaslight the entire internet into influencing the world.Yes it's extremely autistic but it's become my go-to power fantasy for some reason.
very funny to see people typing out posts that could have been written by an 8 year old trying to argue that LLM 1 is better or worse than LLM 2
>>103569995she has two left feet anon
>>103570178I kneel
>>103570129>>103570152Damn you got me. Nemotron is actually better than every other model and has no flaws! For real! No cap, my fellow lmggers!
>>103569423
I believe it's more a matter of having a model trained to maturity. They increased its knowledge, but that new knowledge is just being filtered through its established 'thought process' and pattern of output.
>>103569507
Doesn't matter if it's a bad run or not. The way they shill Erato in the NAI discord like it's the best thing ever and deny otherwise is the issue. If they don't see a problem, it's over. The NAI team + fanatic fanboys shut down any disagreement hard and claim operator error for not using the correct ATTG+R/----/LB/*** format (at this point just gimme instruct FFS), which doesn't come close to killing Erato's Llama-flavored slop even with every imaginable effort to keep the context from being polluted by its bad tendencies.
Kayra was way ahead of its time at release, it punched above its weight class and followed writing cues/style even from minimally sized prompts. Was hoping for even a moderate upgrade so I could fuck off from /lmg/ forever. But no. And I'm salty about it. So fuck NAI shills.
>>103566743
see the modifying model behaviors via vectors thing from last thread? not that. all my addon does is act like a version of author's notes with a selection button for things you can choose rather than type each time; the rest is the same in that it injects into every prompt at a low depth - acting as a constant reminder. i believe you can drive models at least somewhat, but not through weirdness, just through prompting
hello anons, it's been a while.
was busy for the past 8 months with life and been away from all this.
can someone give me an update on what's the best AI to use for roleplaying (like dnd, choose your own adventure) style stories?
was using claude sonnet before i got busy.
also are proxies still available or is there a website to go to now?
>>103570414wrong thread
>>103570418you are right, thanks anon
>>103565507>/LMG/tell me you're a tourist without telling me you're a tourist
>36 GB VRAM
>try 3.5 bpw 70B hoping it'll work
>can't even get above 7k context even with q4 cache
It's over. And testing it, it seems dumb and makes weird errors frequently, so I doubt an even lower bpw would be good.
ACK
>>103570418a small model like nemo will go much further than any online garbage you're trying
>>103565624Looking good, anon.>>103569838You can do this RIGHT NOW by becoming a glowie.
>>103565866I just use Rocinante. I have 8gb (2070 super).
>>103569423>using llama3 as baseIt was over before it began
If I want to try QwQ for RP/ERP, should I go for the official version or one of the merges/tunes like Eva QwQ?
Hi KoboHenk,
I'm reaching out once again to emphasize the importance of adding full draft model settings to your platform.
Implementing these settings would significantly enhance performance, outperforming the current trashy defaults. Users would greatly benefit from the flexibility and improved results that come with customizable draft models.
Thank you for considering this request.
Best regards,
Anon
>>103569458Because everything fucking sucks and looks like it'll suck more in the future, not less. It's like one day everybody uniformly agreed that llms should be aligned during the pretraining phase
begin work immediately mr kobold
>>103570570>You can do this RIGHT NOW by becoming a glowie.Are they hiring? Did their DEI hires resign/ACK xirselves? Do they want straight white men again?
>>103570654Does henk even work on koboldcpp? I thought it was a different guy
>>103570728>Does henk even work on koboldcpp? I thought it was a different guyit doesn't matter, all this is merely a striving after wind
>>103570728Yeah, it's concedo.
>>103570755>it doesn't matter, all this is merely a striving after windAh yes, 'striving after wind,' because clearly doing nothing is the pinnacle of proactive problem-solving. Just because Kobo might not immediately notice one voice doesn't mean the cumulative effect of many won't. It's called advocacy, not 'chasing wind,' and sometimes even a gentle breeze can move a mountain if enough people are blowing.
>>103570861reading this made my brain hurt
>>103570983That post had a certain... uncanny quality, didn’t it? Like staring at a familiar face in a dream, where everything seems *almost* human, yet just a touch off—words strung together with mechanical precision, but devoid of a soul’s warmth. It reads like something that understands language but not meaning, as if crafted by a mind that has learned to mimic thought without ever truly thinking. Makes you wonder who—or *what*—was really behind it.
>>103571022They walk among us, blending in with us... They look human, but if you look close enough, you can tell their act is at best a crude imitation of human behavior ...Anonfilms presents... THE AUTISTS
Are any 12gb vram models worth a damn? Or is a 3090 minimum viable hardware
>>103571088minimum is 2 3090s
I am so tired of that one mod who posts in threads with a "witty" zinger and then deletes them
>>103571088gemma-2-Ifable or L3-sunfall
>>103569749Thanks I looked into it and found nemotron 51B. I don't remember anyone here bringing it up when it released. Seems like something that would work with 24GB.
Will you be able to connect the 5090s with each other? 2x32gb should be enough for 70b models if I understand it correctly
>>103568057
>model thinks that you die when you go unconscious
Into the trash it goes
I am testing EVA QwQ right now.
>think of looking away for a second while it's generating
>look back
Oh...
>>103571217that’s how it works irl tho???when you go to sleep as well
>>103571217what?
>>103569910
pyg before chatgpt could do it.
i remember being so impressed that i left a comment.
it was just a short hello world c# console app, but it blew my mind.
>>103571239QwQ gets stuck in a loop pretty often, even the paper acknowledges that.
>>103571288
>QwQ gets stuck in a loop pretty often, even the paper acknowledges that.
I've seen it fail to generate EOS/EOT lots, but never really seen it loop at q8 and I've used QwQ LOTS.
Where does it say that in the paper?
Alright, just from testing one card and a couple of swipes I think I have a feel for EVA QwQ as well as normal QwQ. I'm testing with near-greedy sampling but with a bit of rep pen after I saw >>103571239. My feeling is that EVA QwQ is closer to a normal model that also does a bit of thinking, and it isn't afraid of getting lewd. Normal QwQ on the other hand doesn't get nearly as lewd (although it's still able to), but its thinking process is really quite unique and interesting. It seems quite smart, and smarter than EVA. EVA just doesn't think like QwQ does, which is unfortunate. But normal QwQ seems to have an issue with going off the rails and not stopping its yapping, while EVA QwQ feels stable. And EVA feels more in character in its thoughts while QwQ feels more like a generic writer. It's too bad we can't have the benefits of both EVA and QwQ without some trade-off.
Also, since I was just testing Llama 3.3 EVA the other day, I will say that the experience using that was a lot more fun and even more in-character. Both EVA QwQ and normal QwQ feel a bit generic in how they write compared to L3.3 EVA. But L3.3 EVA can't think like QwQ can. It's interesting to think about what could happen if we had an open-source final QwQ dataset. Imagine a model that's as phun as L3.3 EVA but with the smart test-time scaling thought process of QwQ (when it's working properly).
>>103571249
She's conscious and suddenly just dies due to the lack of air, at least it reads that way to me.
That's really not what happens irl, is it?
>>103571620
If you actually read more closely, her throat is literally crushed and she's slowly dying and spazzing around.
>>103571390My bad, it's on the Github page, not the paper.
>>103571630I don't think you can just crush a throat with a dick like that, but even if you can, you still wouldn't die immediately once you pass out
>>103571425
Eva 0.0 does love to go on and on sometimes. 0.1 seems to have fixed that, for the most part.
And yeah, having a model with Eva's soul and QwQ's reasoning would be awesome.
>>103571631
nta. General ass-covering disclaimers every model has, and not quite what anon showed.
Models, under certain circumstances, simply explode. Nothing special about it.
Is 12 days of OpenAI an even bigger marketing flop than strawberry man?
>>103572075Microsoft is investing 56B in Anthropic. Sam is finished. He has nothing left. Everybody in OAI realized AI is a fad and split up to start their own grifts.
>>103572189
I mean, there are use cases for it. 20% of Google traffic is to CAI. They just do not want to hear about it. Anthropic will not do any better.
so now you can do this in realtime on 200bux nano shit
>>103572244que?
How to know a model is shit and censored
>>103572303
>>103572339>10% price for 2% performancelmaolol
>>103572075
They will end their 12 days with the reveal of GPT4.5
>>103572189
Microsoft is NOT investing 56B into Anthropic. Microsoft is buying a very small part of Anthropic stock, which raises their valuation to 56B (up from the 18B they are worth now). This just means that Microsoft is paying 3x the amount per share compared to what Amazon paid in the past.
It's true, however, that Microsoft is doing this because they are having issues with OpenAI.
>>103572389>They will end their 12 days with the reveal of GPT4.5Would be really funny if we get something very dumb sounding like "GPT 4 super"
>>103571170
As long as you are using them just for inference, then yeah, you can use the 64 GB.
If my math isn't wrong it should also be able to fit 123b with 32k context if using 4-bit kv cache at 3.7 bpw, but I'm not sure if at that point it becomes worse than just running a 70b at higher bpw
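Rough sanity check of that math below; the 123b architecture numbers (88 layers, 8 KV heads, head dim 128) are my assumptions for Mistral Large 2, so treat this as a ballpark sketch rather than a definitive budget.
[code]
# Back-of-envelope VRAM budget for 123B at 3.7 bpw with 32k context and 4-bit KV cache.
# Layer count / KV head count / head dim are assumed values for Mistral Large 2.
params = 123e9
weights_gb = params * 3.7 / 8 / 1e9                       # bits -> bytes -> GB, ~56.9 GB

layers, kv_heads, head_dim, ctx = 88, 8, 128, 32768
kv_bytes_per_elem = 0.5                                    # 4-bit cache
kv_gb = 2 * layers * kv_heads * head_dim * ctx * kv_bytes_per_elem / 1e9   # K + V, ~3.0 GB

print(f"weights ~{weights_gb:.1f} GB + kv ~{kv_gb:.1f} GB = ~{weights_gb + kv_gb:.1f} GB")
# ~60 GB before compute buffers / driver overhead, so 2x32 GB is tight but plausible.
[/code]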
>>103568057
If you're going to shill a model, at least shill it properly.
>>103569827Settings?
>>103572541GPT 4 Ti
>>103572541GPT4+, next release GPT4++
Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN
https://arxiv.org/abs/2412.13795
>Large Language Models (LLMs) have achieved remarkable success, yet recent findings reveal that their deeper layers often contribute minimally and can be pruned without affecting overall performance. While some view this as an opportunity for model compression, we identify it as a training shortfall rooted in the widespread use of Pre-Layer Normalization (Pre-LN). We demonstrate that Pre-LN, commonly employed in models like GPT and LLaMA, leads to diminished gradient norms in its deeper layers, reducing their effectiveness. In contrast, Post-Layer Normalization (Post-LN) preserves larger gradient norms in deeper layers but suffers from vanishing gradients in earlier layers. To address this, we introduce Mix-LN, a novel normalization technique that combines the strengths of Pre-LN and Post-LN within the same model. Mix-LN applies Post-LN to the earlier layers and Pre-LN to the deeper layers, ensuring more uniform gradients across layers. This allows all parts of the network--both shallow and deep layers--to contribute effectively to training. Extensive experiments with various model sizes from 70M to 7B demonstrate that Mix-LN consistently outperforms both Pre-LN and Post-LN, promoting more balanced, healthier gradient norms throughout the network, and enhancing the overall quality of LLM pre-training. Furthermore, we demonstrate that models pre-trained with Mix-LN learn better compared to those using Pre-LN or Post-LN during supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF), highlighting the critical importance of high-quality deep layers. By effectively addressing the inefficiencies of deep layers in current LLMs, Mix-LN unlocks their potential, enhancing model capacity without increasing model size.
https://github.com/pixeli99/MixLN
interesting
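For anyone skimming, a minimal toy sketch of the core idea as I read the abstract (early layers Post-LN, deep layers Pre-LN); this is not the authors' code (that's in the linked repo), and the 25% switch point and the omitted causal mask are my own simplifications.
[code]
import torch.nn as nn

class MixLNBlock(nn.Module):
    """Toy decoder block: Post-LN for early layers, Pre-LN for deep layers (Mix-LN idea)."""
    def __init__(self, d_model, n_heads, layer_idx, n_layers, post_ln_fraction=0.25):
        super().__init__()
        # Assumption: the first ~25% of layers use Post-LN; the paper tunes where the switch happens.
        self.use_post_ln = layer_idx < int(n_layers * post_ln_fraction)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))
        self.ln1, self.ln2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)

    def forward(self, x):  # causal mask omitted for brevity
        if self.use_post_ln:
            # Post-LN: normalize after the residual add (bigger gradients deep in the stack).
            x = self.ln1(x + self.attn(x, x, x, need_weights=False)[0])
            x = self.ln2(x + self.mlp(x))
        else:
            # Pre-LN: normalize before each sublayer (stable early, weaker deep layers).
            h = self.ln1(x)
            x = x + self.attn(h, h, h, need_weights=False)[0]
            x = x + self.mlp(self.ln2(x))
        return x
[/code]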
>>103573186
So it's either pruning a few percent of the model at no noticeable quality loss, or using some sperg technique to get 3% less perplexity.
Gradient descent really wasn't meant for deep networks...
Is it possible to change the context shifting threshold?
Right now, if you are at the context limit, gen a reply, press continue, and then swipe the reply, chances are that it has already popped a message from context but resends it for the swipe, rebuilding the whole context.
Using koboldcpp
>>103573591
How would such a threshold work? The chat either gets "rolled" up enough for a message to get evicted from the context window or it doesn't, right? Which is a function of the frontend.
I suppose you could use an arbitrarily large context window size in the frontend, so that it sends the whole conversation to the backend, and let the backend deal with cutting up the prompt and/or shifting the context, although I have no idea if that's how any of that works. But you might as well try.
I know that at least llama.cpp server doesn't crash when receiving a prompt that's larger than the actual context window. Whether it's just truncating and mangling the prompt in the process, I have no idea.
>>103565511
>--Guitar amp simulation using local models and potential noise reduction techniques:
For the record, I tried. Got the input/output pair from the support page of that project, fed it to the colab. The clean model turned out fine; then I mixed in a little hi-passed white noise to the input and couldn't get past the "input is not silent for at least ~19k samples" error despite disabling checks. That part is completely zeroed out in my sample. Couldn't get any search results about the error and gave up.
>>103569838
>Is it weird that I have a power fantasy of traveling back in time 10 years ago with all the local models I have right now. And gaslight the entire internet with fake images/videos/text?
>>103569876
That sounds like a fine idea for a webnovel. This fantasy has potential, but I would probably go up to 15 years back in time to use your gaming PC/AI rig to mine some bitcoin on the side back when it was easy.
Here is my suggestion for a title: "Back in time with my pimped out gaming PC and local AI models."
Here is a shitty Dalle-3 gen for the cover of your new hit LN or WN.
too many tripfags... not enough miku...
>>103565866At that point just get the infinite monkeys
>>103565880
>do the lobotomy wrong
>the lobotomy goes wrong
>"HOW COULD THE AI DO THIS TO ME?!?"
>>103565866Nemo 12B.
3090
>>103574341Dethroned soon by Intel. Whatever fits in 24GB doesn't need 935 GB/s memory bandwidth
>>103574353
The larger the model, the more memory bandwidth will be the bottleneck; if anything, 935 GB/s is barely enough for a model that only just fits in 24GB.
>>103574449Qwen 32B 4bpw runs at 25 t/s on my 3090 without speculative decoding. I think people can live with 13 t/s, if anything they can use a draft model to make it ~18 t/s, still more than usable.
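Back-of-envelope for why those numbers line up (a sketch, not a benchmark: the ~17 GB quant size and the B580 bandwidth figure are assumptions, and real throughput lands well below the ceiling because of compute and overhead):

```python
# Each generated token has to stream roughly all weight bytes through the GPU once,
# so memory bandwidth sets a hard ceiling on tokens/second for single-user decoding.
def tps_ceiling(model_gb, bandwidth_gbs):
    return bandwidth_gbs / model_gb

qwen32b_q4_gb = 17.0                      # assumed on-disk size of a ~4 bpw 32B quant
print(tps_ceiling(qwen32b_q4_gb, 935))    # 3090 (935 GB/s): ~55 t/s ceiling, ~25 t/s observed
print(tps_ceiling(qwen32b_q4_gb, 456))    # B580-class bandwidth (456 GB/s, assumed): ~27 t/s ceiling
```

So a ~13 t/s guess for a hypothetical 24GB B580 is roughly consistent with halving the bandwidth, which is the point >>103574449 was making.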
>>103574353
If Intel fucks the pricing up on the 24GB B580 I'm just going to assume they're retarded and hate money; everything is positioned perfectly for a massive market grab. Hobbyists, researchers, coomers, and everyone else who isn't a billion-dollar company are begging for scraps while Jensen's boot stomps on their VRAM-poor asses.
>>103574056Then pull another one out of one of your sourcebooks
>>103571656
Is 0.1 better than 0.0 other than that? Typically, good finetunes are accidental flukes in the slop pile and attempts to update them are failures, so I haven't considered using the new version unless someone actually confirms it's better.
>>103574353>>103574599$700. Scalped to 1k.
>>103574533If you add introspection and things along these lines (i.e. test-time compute), which appears to be where the industry is going, that starts to become painfully slow.
>>103574628$400 and we luckily have anti scalping laws here
The Chinese are up to it again: https://www.ebay.ca/itm/375861526620
>>103574628$300
>>103574644>>103574660
>>103572335
They also have limited knowledge of fiction due to copyright. Tulu exclusively referred to century-old novels when I told it to (sometimes) relate descriptions to popular fiction, among other shit.
I'm looking for a model which can parse my project's codebase and write unit/integration tests for me. Which one should I try?
>>103574757pyg6b
>>103574757if you have up to 48gb vram then qwq. More than that (or a cpumaxx rig) and you can look at other qwen options or deepseek
>>103574839
Fuck that, I have a 3070 8GB and a 9800X3D; I have nowhere near that much VRAM.
I've not engaged with local AI since the first diffusion models were coming out, so I had no idea we were already talking about 48GB+ of VRAM. What's the standard nowadays?
>>103574854
>parse my project's codebase and write unit/integration tests for me
>3070 8gb and a 9800x3d
You're gonna need a bigger boat. You can't even fit your project's codebase into VRAM if it's more complicated than hello world, let alone a model big enough to tell you anything of value about it.
>>103574872
Alright, I'll shelve this dream for now. The annoyance isn't worth dumping 4k into a graphics card.
>>103574854
>What's the standard nowadays?
The OP has a build guide. Models are up to 810GB (unquanted) in size these days, so sky's the limit for options.
>>103574890
Don't listen to the faggots. Grab Q6 Qwen Coder, offload what you can, and run the rest on CPU/RAM.
I don't get the hate towards the latest Largestral in RP desu.
At least on lower quants, the prose is more human and fun compared to the previous version.
>>103573783
Tried on their free trainer, which accepted the files no problem. The colab is fucked, evidently. It works lol. The tone, as far as I can tell in monitor headphones, is unchanged, but with way less noise. Someone tell them to add this as an option to training or something like that. Mix in a bit of noise to make it cancel out some of the junk.
>>103574646
>Shipping: US $5,000.00 (approx C $7,223.50)
Is that a mistake?
>>103575002No, it's how they make their money avoiding ebay's cut.
>>103575023
That's some circa 2003 bullshit. Pretty sure eBay nails you for shipping costs these days too.
>>103574983Imo Largestral is bad, both the new and old version.
>>103575117
Please explain. I never used it as I'm a VRAMlet, so I have no idea about its characteristics in actual use.
>>103574983
Hate toward Largestral generally coincides with the type of people who prefer drummer-style sloptunes.
Basically, you're seeing a vocal minority of people who equate style (e.g. purple prose) with intelligence and coherence, instead of treating it as just a style vector. The dirtier and more literary/verbose the output is, the more that equates to the model being better or smarter in their eyes.
>qwen answers for me and then continues
>>103575163
Largestral has too much positivity bias, avoids filthy stuff, and writes using too much purple prose. The intelligence claimed by anons doesn't matter, since it isn't that fun to use for my use cases.
I don't think this is fixed by sloptunes either; rather, they make the model stupid and too horny, so I avoid them like the plague.
It's the same issue we had with Miqu, really. I think the only salvation for Largestral would be an uncensored instruct fine-tune like Tulu or Nemotron.
>>103575265>AI isn't going to replace you, stop being paranoid!>The AI:
>>103575265Yeah I also experienced this.
>>103575279Feels like this is often the case. Models are either too dry/positive or too horny.
>>103575618>>103575618>>103575618
>>103573724
I've read about this before a while ago but I don't remember for what tool.
I guess this specific issue could be fixed if ST would just send the cut-off previous prompt instead of trying to send a previously dropped token:
>A B C D E F G
>_ B C D E F G H
>_ _ C D E F G H I
>swipe H
>_ _ C D E F G H I
>_ _ C D E F G H2
instead of
>A B C D E F G
>_ B C D E F G H
>_ _ C D E F G H I
>swipe H
>_ _ C D E F G H I
>_ B C D E F G H2
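A toy sketch of the difference being described above (this is not ST's or koboldcpp's actual code; the message letters and helper names are made up for illustration):

```python
# Two ways a frontend can build the prompt for a swipe when the context window is full.
def swipe_prompt_rebuild(full_chat, swiped, limit):
    # current behavior: drop the swiped reply, then re-window over the FULL chat,
    # which can pull a message the backend already shifted out back into the prompt
    kept = [m for m in full_chat if m != swiped]
    return kept[-limit:]

def swipe_prompt_reuse(last_sent_prompt):
    # proposed behavior: resend exactly the prompt that produced the original reply,
    # so the backend's shifted KV-cache prefix still matches and only new tokens are processed
    return list(last_sent_prompt)

chat      = list("ABCDEFGH")   # H is the reply being swiped
last_sent = list("BCDEFG")     # prompt that actually produced H (A was already shifted out)

print(swipe_prompt_rebuild(chat, "H", limit=7))  # ['A','B','C','D','E','F','G'] -> A is back, cache invalidated
print(swipe_prompt_reuse(last_sent))             # ['B','C','D','E','F','G'] -> cached prefix preserved
```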
>>103569071
Imagine how retarded you must be to think that ST counts as "nerf stuff"
Lmao
>>103569237
Wow, it's almost as if expecting a 70B model to perform the same as a 1T model was a retarded concept from the start.
>>103575856claude 3 opus has 137 billion parameters, and 3.5 sonnet (which is smarter than opus) presumably has less since it's faster and costs less than opus
>>103576441
>claude 3 opus has 137 billion parameters
source
>>103576557
It's just what appears when you google it, since it's repeated by many sources. But upon closer inspection, it originates from an obviously AI-generated Medium article where the param count was hallucinated.
>>103575002
>buy card, $499.93 + $5,000.00 shipping
>card doesn't work
>here's your $499.93 back, have a nice day