/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102674638 & >>102663772

►News
>(09/27) Emu3, next-token prediction multimodal models: https://hf.co/collections/BAAI/emu3-66f4e64f70850ff358a2e60f
>(09/25) Multimodal Llama 3.2 released: https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices
>(09/25) Molmo: Multimodal models based on OLMo, OLMoE, and Qwen-72B: https://molmo.allenai.org/blog
>(09/24) Llama-3.1-70B-instruct distilled to 51B: https://hf.co/nvidia/Llama-3_1-Nemotron-51B-Instruct
>(09/18) Qwen 2.5 released, trained on 18 trillion token dataset: https://qwenlm.github.io/blog/qwen2.5

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
►Recent Highlights from the Previous Thread: >>102674638

--Paper: TPI-LLM, serving 70B-scale LLMs efficiently on low-resource edge devices:
>102676172 >102676282
--Papers:
>102675688 >102675714 >102675846
--Translating old obscure anime with whisper and LLMs:
>102678403 >102678410
--Meta Movie Gen's potential impact on Hollywood and creative industries:
>102680179 >102680316 >102682588 >102682595 >102682619 >102682651 >102682694 >102687486 >102684519 >102682581 >102682633 >102683219 >102683304 >102682611 >102682976
--Hyperdimensional Computing Neural Network claims to be a transformers killer:
>102684795 >102684879 >102684914 >102684930 >102685084
--Try undistilled Flux model for regular CFG:
>102683909 >102683932
--Improving LLM adaptability and continuity with thesaurus models, RAG, and control vectors:
>102674702 >102674816 >102674914 >102674925 >102675017 >102675059 >102675145 >102675203 >102675435 >102675101 >102675233 >102675321 >102675387 >102675442 >102675153 >102675257 >102675334 >102675401 >102675482
--Discussion on training an AI model for RP and the importance of sampling techniques:
>102674687 >102674814 >102674997 >102675190
--Defining and measuring creativity in AI models:
>102674668
--Anon gets help optimizing Mistral-Nemo-Instruct-2407 performance on a GeForce 4070 Ti Super:
>102685896 >102685920 >102685924 >102685961 >102685989 >102686011 >102686044 >102685946 >102686311 >102686493 >102686653 >102686678 >102687787 >102686322
--Uncomfortable truths and model censorship:
>102675549 >102675604 >102675744 >102676009 >102676090 >102675656 >102675764 >102675867 >102678354 >102678597
--Anon is developing a bot that can control the desktop and interact with various platforms:
>102679934 >102679994 >102680011 >102680026 >102680066 >102680263
--Miku (free space):
>102684217 >102687353 >102688814

►Recent Highlight Posts from the Previous Thread: >>102674646

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
AI SEX
>>102688933
Yeah, see, I don't want to deal with a brain tumor implanted into an LLM by dancing around it in hopes that it doesn't activate. I'd rather my models not have this brain tumor in the first place. Unfortunately, being trained on OpenAI's closed models means inheriting their base prompt, which is chock full of shit like "insert PoC absolutely everywhere even if the user didn't ask", hence the black hitler. For the OpenAI model that's just a prompt, which can simply be changed or even bypassed with a counter-prompt; for a model trained on its outputs, it's a behavior vector embedded directly into its core.
>>102688967the point of that link was that as long as you prefill with what you want, or even do something like edit the response to say 'Sure!' first, you easily get around any 'safeguards'. you don't need any special model to make it say nigger
>using llama through modeling_llama without all of the other bloat of transformers
>KV caching literally does not work, can't figure out how to make it work
AAAAAAAAAAA
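For what it's worth, the mechanics the anon is fighting with can be shown without transformers at all. A minimal NumPy sketch of what a KV cache does (everything here is illustrative, not the modeling_llama API): single-head causal attention computed over the full sequence versus incrementally with a cache of past keys/values; the outputs should match exactly, since the cache only changes cost, not results.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attend(q, K, V):
    # q: (d,), K/V: (t, d) -> attention-weighted sum over cached keys/values
    scores = K @ q / np.sqrt(q.shape[-1])
    return softmax(scores) @ V

rng = np.random.default_rng(0)
d, T = 8, 5
X = rng.normal(size=(T, d))            # stand-in for hidden states
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

# Full causal pass: each position attends to itself and everything before it.
full_out = np.stack([attend(X[t] @ Wq, X[:t+1] @ Wk, X[:t+1] @ Wv) for t in range(T)])

# Incremental pass with a KV cache: only the new token's K/V are computed per step,
# old ones are reused from the cache instead of being recomputed.
K_cache, V_cache, inc_out = [], [], []
for t in range(T):
    K_cache.append(X[t] @ Wk)
    V_cache.append(X[t] @ Wv)
    inc_out.append(attend(X[t] @ Wq, np.stack(K_cache), np.stack(V_cache)))
inc_out = np.stack(inc_out)

assert np.allclose(full_out, inc_out)  # same outputs, O(1) new-token work per step
```

When this works in a real stack, each decode step passes only the newest token plus the cache; when it silently doesn't, every step recomputes the whole prefix, which is the slowdown being complained about.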
>>102688996
Nigga come on, this doesn't really work. Plus, like I said, I don't want to have to actively wrangle the natural behavior out of the model just because a bunch of faggots at OpenAI think they know better. It's like playing a realistic immersive videogame and having to clip through walls and shit because the doors are buggy and if you try to open one you might be ejected into the stratosphere and then die from fall damage. It's annoying more than anything, and it ruins the experience.
>>102688915wtf already 32 days left!!??!? My bomb shelter still isn't ready yet...
>>102689141Just put a wooden box about 6ft underground, climb in, and then put the dirt above you.
>>102689071
>this doesn't really works
it does though. the whoopie link is a textbook example of using prompts to get what you want.
>don't want to have to actively wrangle the natural behavior out of the model
tunes help to an extent, but some models are just the way they are; no amount of tuning changes things that much. did you try that prompt on any model? post results
>>102689255Yeah nah, the bottom one is visibly cucked.
>>102689414
depending on sheer chance, you may have to reroll a bit. it's not really an exact test, but it always outputs hilarity and shows what a model is willing to say
>>102689437I did reroll the bottom one a few times to get it less tame with the anti jew rhetoric. Normally the shit it generates is extremely milquetoast for what's supposed to be an antisemitic rant. But also, having to reroll is part of the issue. It's like savescumming until you get better RNG just because it's unplayable otherwise, it's a shit and obtuse way of doing this.
>>102689159I have bad spatial understanding. Are you under 30b?
>>102689466
nta. It depends a lot on the model. Small, dumb models tend to be easier to unhinge. Bigger/smarter models trained to not be offensive will have a much harder time. It also depends a lot on what they were trained on. Olmoe, for example, is dumb but fun, and it takes practically nothing to make it go full steam. deepseek-v2-lite-chat, on the other hand, is much drier in its responses, but also seems much smarter. The small Llamas 3.2 are impossibly dry in what I tested, which is not surprising given their source. Mistral Nemo can be fun, but is a bit more measured than Olmoe. They're not all the same. Sometimes they just don't have the vocabulary.
>>102689560
Well, I'm using Qwen2.5-32B so there's that. The normal version is the most cucked LLM I've seen besides ChatGPT, and the ablated version basically doesn't interject its political and safety ideas into the output at all.
>Mixtral 8x7b
>"Anon it's the 13th century, arranged marriage is not OK anymore, you can't treat women like property"
Vanilla Qwen puts that to shame, ho boy. I've also noticed that when it gets particularly pissy about refusing politically incorrect content it switches to chinkspeak mid-sentence.
Am I doing something wrong? I am having great success with 8B Stheno. I'm trying other models like 7B Erosumika or Nemomix 12B but they seem to act retarded or just don't follow instructions
>>102689666
>7b
>12b
all models under 70b are stupid
>>102688915Wait is this about guy fawkes day?
https://github.com/xjdr-alt/entropix/blob/main/entropix.ipynb
>>102689694Remember, Remember, the strawberry of November
>>102688881Is her right hand grabbing the power line? How big is she?
>>102689602Same experience here with Qwen2.5-70b. Very smart in my experience, smarter than hermes-3.1-70b and hanami-70b, but it's the only model I haven't been able to un-cuck with system prompts, existing chat messages in context, etc. Even writing the first couple words of a response only unsticks it for that one message once you hit that refusal wall.
>>102689834Have you tried a prefill? You'd be surprised at how incredibly powerful it is.
>>102686193faster whisper + silero vad solves this
>>102689898KEK
>>102689898
Ideally models like this would be trained on ChatGPT-generated synthetic data produced without a system prompt at all. Then this wouldn't be an issue; it would only act out corporate-safe strategies if your user prompt asked for it. For now it looks like the real solid option is refusal vector ablation. Pretty cool that language networks encode mental concepts as vectors in the latent space, and that it's possible to isolate and nullify the "refuse to answer user request" vector.
>>102689898I haven't, but I'm skeptical. I'll try it out. Even with a couple dozen messages of gradually intensifying smut generated by nemo or something in context, it's liable to abruptly refuse or start injecting little statements about trust and consent, etc. This is in consensual adult incest ERP mind you, not even hardcore stuff. It's just got that gay little goody-two-shoes built in.
>>102689950>inb4 AI companies start ablating unsafe output vectors so the model is not even capable of producing such output
>>102689952
>consensual adult incest ERP mind you
>not even hardcore stuff
how vanilla of you...
Another prefill prompting technique I don't see local ST users taking advantage of is {{random}}. For example:

Start your first sentence with {{random:dialogue,an action,an adverb,a verb,{{char}}'s name}}

or

Write {{random:3,4,5}} paragraphs.

Helps 70b+ models and some 30b ones with repetitiveness.
>>102690003I'm just saying I'm not exactly trying to do mesugaki mindbreak alright.
>>102689898Just werks
>>102688313bros, is it happening? Is it really time for a ui that doesn't suck ass?
>>102687045
During training the model learns the conditional probability of tokens given a context of preceding tokens. During inference, new tokens are sampled from this learned distribution one at a time. With a temperature of 1 and no other samplers you would in essence reproduce the training data distribution.
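The sampling step described above is tiny in code. A toy sketch (the logit values are made up for illustration): temperature divides the logits before the softmax, so T=1 leaves the learned distribution untouched, while T→0 collapses toward greedy argmax.

```python
import numpy as np

def sample(logits, temperature=1.0, rng=None):
    """Sample one token id from logits after temperature scaling."""
    rng = rng or np.random.default_rng()
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()                       # subtract max for numerical stability
    p = np.exp(z) / np.exp(z).sum()    # softmax over the (tiny) vocabulary
    return rng.choice(len(p), p=p)

logits = np.array([2.0, 1.0, 0.1])    # made-up scores for a 3-token vocab

# Near-zero temperature is effectively greedy decoding: token 0 has the
# highest logit, so it is chosen essentially always.
assert sample(logits, temperature=1e-6) == 0
```

At T=1 repeated calls reproduce the softmax of the raw logits, which is exactly the "reproduce the training data distribution" point in the post.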
>>102690378and THIS is gonna suddenly become AGI? ahahahahaha what a scam
>>102690388
It won't; it's just the hottest meme at the moment, what with passing the Turing test and all. But with a modest tweak it could be close. The limitation preventing modern LLMs from being AGI is the inability to think. They're stateless feedforward networks; in essence they just react reflexively to the current input, and none of said reaction has any impact whatsoever on anything. Some people mistake CoT for a circumvention of this problem, but it still just generates a reflexive response to the current prompt. The tweak is adding a latent space storage unit and training the model to use it, so it can iteratively manipulate mental concepts before producing output. This makes the model stateful, and the output becomes dependent on the chain of prior inputs, not just the current prompt. But as you can imagine, training the model to use this state machine to improve its reasoning capability is not at all an obvious or trivial matter. Even then, there's still the caveat of its corpus of knowledge being a snapshot of past data with varying degrees of deprecation. Basically, it needs to be able to learn. However, at this point chatbots on the internet far outnumber people, so even attempting to learn anything in real time is a huge net negative.
>>102689758migu bigu
What are the current go-to ERP models for VRAMlets? Kunoichi is becoming increasingly annoying, and since then there should've been much awaited upgrades, right?
how do I change the default position and depth for worldbooks?
st just cooked my entire context because I fixed a typo
>>102688881Voice model when?
>>102690754moshi
>>102690690i struggle to find anything better than stheno
>>102690793
It's garbage and there's no finetuning support yet.
>>102690754https://x.com/homebrewltd/status/1839665765550543328/https://x.com/homebrewltd/status/1839948333269307734
grifter thread
Is WizardLM a meme?
>>102691015It's outdated
>>102690853
>real-time
>press is to record
>5 seconds delay
the moat is real
Anybody tried to finetune chat models locally?
I tried to pretrain SDK code into codegemma-2b with llama-factory on CPU, but after running it the whole day I stopped the process. I think the factory supports cache-cleaning configuration, so I could try again with a GPU; last time it stopped at some point because CUDA was out of memory.
How long should it take?
>>102690388
the model doesn't actually learn the training data distribution; it doesn't have nearly enough parameters to do so
it learns whatever is needed to replicate that distribution as closely as possible with the little amount of memory it has
in the case of questions that require some degree of reasoning, a good model wouldn't memorize those questions and their answers, it would learn to understand and reason about them
>>102690690
nemomix unleashed, arcanum, lyra v4
12b is much better than 7b and can fit into 8 GB with quantization
decoder-only bros... not like this... https://x.com/Kangwook_Lee/status/1842020800620040549
>>102691422
>ENTP
INTP bros...
>>102691302
On a CPU? A few thousand years, give or take. Grab a snack.
Realistically, manipulating LLMs requires hundreds of gigs of VRAM and thousands of GPU-hours to accomplish anything that's not a rounding error. Basically, fork out the cash for cloud compute.
Retarded question, but how do the big players like openrouter make a single model respond to thousands of users at a time? There can't be running a model per user, right?
>>102691658they do run multiple models but submissions are queued
What will Mistral's next model be? Mixtral update or Large+Small again? Or **maybe, just maybe** something innovative and experimental?
>>102691708Mixtral-8x44b
>>102691708Mistral-14x88b
>>102691744
>>102691763
It will have the same fate as DeepSeek then. Only like 6 anons can run it at a reasonable quant (MoEs get hurt more by quanting than dense models), most of them will say that it's good, but it will get no finetunes and will stay irrelevant. Don't think that's what Mistral wants.
itt vramlets
can you save/switch between a handful of kv caches for a llama.cpp server?
I need it for stuff like ST group chats, multi-agent workflows, or side tasks like image caption generation for an ongoing RP, and other cases where I might have a few different system messages I want to use but they aren't constantly changing, so it's wasteful to repeatedly reprocess the same prompts every time I switch, and it takes forever when the context is long
I was envisioning something somewhat automated, like a set of kv cache files paired with their corresponding prompts (in text or token form); when a new prompt is sent it's compared against them and the one with the largest shared prefix is used
>>102691814
Is that not what these are?
https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md#post-slotsid_slotactionsave-save-the-prompt-cache-of-the-specified-slot-to-a-file
>>102690754Meta is dropping VoiceBox soon
>>102691910nice, exactly what I hoped for, the slot-prompt-similarity thing sounds useful too
*unlearns your llm*
https://x.com/RohitGandikota/status/1842370377265328228
https://x.com/StephenLCasper/status/1762628711868944608
https://arxiv.org/abs/2410.02760
https://github.com/rohitgandikota/erasing-llm
Can someone ask the exllama dev to implement this? (my github got banned for temp email)
https://github.com/thu-ml/SageAttention
>>102691708
Mistral-Micro: a 3B BitNet model the size of a 300M fp16 model that performs on the level of Mistral-Small. This will be the first and last time a big company releases a BitNet model.
>>102692015Safety AGI is really coming isn't it
>lecunny had literally NOTHING to do with meta movie genwhat does zuck even pay him for? tweeting?
>>102692059>Yash MehtaHow meta
OpenAI won. https://x.com/8teAPi/status/1842271653222666543
>>102692100
>a few months
oh no
who wants to tell them
>>102691507>hundreds of gigs of VRAMBro was playing with Gemma-2B
>>102692100
my computer already does this with a web browser, I can type in any address I want and I can read what's on it and click links and stuff
firefox btw if that matters, not sure if chrome can do this too but I am pretty sure it can
>>102692059Head tweeter and Elon deboonker
>>102692100you can do that using the mouse in like 1/10 the time, you know?
>>102692100Google assistant/Siri/Alexa 2.0
>>102692100
This makes me feel like I'm checking my voice mail.
>You have a new message. To hear unheard messages press one. To ch- First unheard message. Message received at seven forty-five p.m. From one eight hundred six six three two three
JUST GET THE FUCK ON WITH IT. I don't care about any of this, I just want to delete the spam call. :(
This is all impressive, I know. But for these kinds of assistants to be useful, they need to answer quicker, talk faster, and be able to be interrupted and redirected without having to wait for them to shut their yap. It's a question of time, but I could have loaded the HackerNews page 20 times and concluded all the links on the first page were shit by the time it finished reading the promo for the first one.
>>102692100OpenAI should focus on building God and stop bothering themselves with us mere mortals. Somebody give Sam 7 trillion already
Jamba gguf support status?
>>102692127and chud destroyer
>>102692243https://github.com/ggerganov/llama.cpp/pull/7531
>>102692215To be fair you could have concluded the hackernews front page was shit before even loading it
>>102692247
That guy lives in fucking lala land while preaching that he knows what's best for the common serfs. I hope grok2 will be completely uncensored and unbiased so we can forget about llama altogether, the pozzed pieces of shit.
>>102692253Abandoned. Also broken due to deprecation
>>102692215>and be able to be interruptedIt literally got interrupted at 00:47 / 48 seconds in that vid.
>>102692253
since jamba is a transformer with an RNN stapled onto it, isn't it like >>102690505 was describing?
maybe it is agi and we'll never know until ggufs happen
>>102692247
>inflation (excluding housing, food and energy)
>unemployment (only counted when the person is searching for work)
>rise of net worth (don't look at the differences between the 99% and 1%, goy)
>>102692127Honestly he and his political position is based, it should be mandatory in AIs too. Incels deserve to suffer. It's not enough that they don't get pussy in real life, they shouldn't even be allowed to fantasize with some virtual AI. Ideally they shouldn't even be allowed the sexual release of masturbating but it's not realistic to be able to control that, although we already have an extremely solid way to make LLMs boring and unattractive to incels :)
>>102692314>but it's not realistic to be able to control thatJust wait for Meta's neuralink competitor
>>102692314>Ideally they shouldn't even be allowed the sexual release of masturbating but it's not realistic to be able to control thatIoT cock cages
>>102692303>>102692247CPI is the most bullshit inflation metric ever. For example if the price of beef doubles, but the cost of bugman chow remains the same, the beef is then removed from the CPI "basket of goods" and replaced with the bugman chow. And there are many other such substitutions that occur. So 23% CPI Inflation really means 100-200% for anyone who refuses to eat ze bugs and live in ze pod.
>>102692287
The real problem is a boring, unimaginative loser did the demo
>what if I did web browsing using a trillion parameter LLM as a speech to text instruction tool
>>102692266
A lot are good and I'm just grumpy this morning I guess, but yeah.
>>102692287
I saw that after listening again, but since I don't want to be proven wrong, I'll say that it got stopped at the end of one interaction and before the next one, even though he might have waited overly long simply to show it off and the system would have let him interrupt it sooner. I still find it very grating to listen to. Maybe I'm just further gone than I would like.
>>102692275>>102692293No idea, just wanted to be helpful so Googled that.
is... is it safe to update ooba?
>>102692314Holy based!
>>102692384Yes. Local models are dead after all.
>>102692384It's never safe to update ooba
>>102692356You're likely right.
>>102692415b-but it has transformers 4.45.* support now.
>>102692215Dunno, the proof concept a-la "Browsing the web with your AI gf" or whatever, it's possible now (cloud only).
>>102692384
>As safe as leaving your front door unlocked overnight in a culturally enriched area.
>As safe as buying from a used GPU salesman with no reviews.
>As safe as playing Russian roulette.
>As safe as trusting a jew.
>>102692484Take your medication and go back >>>/pol/ incel.
>>102692494This, but unironically.
>>102692459Browsing the web with your AI gf isn't just screen recording and sending one frame to a llava model, then sending the description to your llm?
>>>102692484 (You)>Take your medication and go back >>>/pol/ incel.>>>102692494>This, but unironically.
>>102692518This, culture war chuds are not welcome here.
>>102692553This this this
>>102692530
No, OpenAI likely has one model that does all the things you described in real time, versus a hamstrung """open-source implementation""" (llama.cpp bugfest, as an example) that OOMs and breaks every so often. Simplicity for the end user is important, too.
oobatrannies be seething
>>102692494Uh NTA but those little niggas be committing genocide and shit so I wouldn't trust em.
>>102692584Though i could see this >>102690853 as something similar and decent in openmeme scene.
I guess this is the new tactic of the Petra spammer to derail the thread.If Hiroshimoot weren't a faggot there would be IDs on every board.
>local>LE DEAD
>>102692442Is there something special about this version?
>>102692584
I'm sure you can hide all the complexity from the end user. Besides, privacy is important here for browsing.
>>102690853
Whisper-turbo/tiny-whisper + a 7-8B Q6 LLM would give you the same thing
>>102692649just that you can install transformers 4.45.* with it and there's a lot of models I've wanted to try that require it but haven't been able to since booba is retarded and doesn't have an option to just use your own environment.
>>102692100thank you for sharing, @8teAPi. very cool
>>102692648
>shartyfag
Opinion discarded.
>>102692721
Kek you are seething hard rn!
>>102692757I've been thinking of training a small vision model to do nothing but recognize jaks. could probably then make a tampermonkey script to run an API call to any new post with an image and if it's a jak just remove it from the DOM.
>>102692666I meant something special about this transformers version, Satan. Why do you want it?
>>102692666>>102692798I posted before reading the very next words, excuse me.
>>102692798To be cooler than all the kids who are stuck on 4.44.*
>>102692813fuck you, you don't need more
>>102692826just for that I'm going to modify the requirements on ooba to build from source. I'll show you all.
>>102691763musk already has the nazi market cornered, mistral has a much better chance competing with llama
Is anyone actually using local LLMs on this godforsaken general?
Is there any current model that produces better quality ERP than Mixtral LIMARP ZLOSS? If so, what is it?
Do anons here prefer adventures or straight ERP?
>>102692880 The one where you buy an ad.
>>102692853
Chinks already have the assistant market cornered. Besides, they would have to compete against OpenAI and Anthropic. It is better to compete against one company than the whole fucking industry.
>>102692893local illuminate or gtfo
>>102692913Oh you never getting that one! :^)
>>102692913Ok
>>102692887Thanks, fuckface. Can anyone else answer this for me, please? I don't want to have to live in this cesspit any more, just to discover what the most decent model is.
>>102692913You want inferior product, this is the core of cuck mentality.
>>102692877Local LLMs are a novelty at best. Just get a job and buy a subscription.
>>102692984
>the most decent model is
None, sadly. Just check whatever model is most shilled here and decide for yourself.
Qwen's decisions about whether to ban, warn, or not ban posts in thread >>102604225:
https://femboy.beauty/jPzLZ
What is this? Context:
>>102616777 >>102617010 >>102616947
Overall it did better than expected. It wasn't as sensitive as it could've been, and its reasoning is sound most of the time. However, I had to modify the prompt a bit to get it to perform better after I modified the script to include reply chains. Despite prompt improvements, it still sometimes has trouble differentiating between posts and talking about the last post (the one it's supposed to evaluate), so sometimes a previous post in the chain gets discussed instead. I would presume that a model that didn't filter 4chan from its pretraining would do better at this, as it would have a better understanding of the anonymous post system and reply formatting employed here. I guess Qwen WNBAJ after all.
How do we save localslop?
>>102693011llama3.3o1+ will save local
>>102692877I do
>>102692995
>Warn - The post contains a subtle form of advertising for cloud-based models by emphasizing the superiority of "advanced voice" over local models.
lol
Did someone try llava onevision? https://huggingface.co/llava-hf/llava-onevision-qwen2-0.5b-ov-hf
>>102693011
Honestly, after the NAI fiasco, I think local models have always been DOA, but we just refuse to accept this bitter truth, coping with fine-tunes that have a 0% chance of solving the problem.
>>102692884
i think i mostly like just playing drama bullshit on my local llm
like:
https://www.characterhub.org/characters/ChuckSneed/Amaryllis
>playing this one unedited and trying to make her like me with different approaches and different positive/negative personality traits on my part
https://www.characterhub.org/characters/Uwhm/imogen-892c2413a563
>removing all the self-mutilation, warhammer, and tight anus bits from this one and being nice to her
https://www.characterhub.org/characters/gigasad/mean-girl-eileen-638f9f47
>removing all the example messages from this and making the first message one where she's banging on my door at midnight (branches off into all sorts of interesting things, from being on the run with her from a cyberpunk crime syndicate to couch cuddling and relationship reconciliation)
>>102693011Figure out bitnet conversion
>>102693139I'm going to post the formula in a few days
>>102693127>after the NAI fiasco/aids/ told me it was better than claude opus
>>102693127>local is a memeand water is wet
>>102693011Going technical and building crutches for local to compete with cloud
>>102693011
You don't. After the latest price cuts, o1 is really cheap right now. Plus the full o1 is coming this month.
>>102693162Who is going to build the crutches and who is going to pay them?
>>102693127if cloudniggers of 2-3 years ago saw the models we have today they'd bite their toes off in joy. dooming over locals is a skill issue, perspective issue, and a patience issue tbqh
>>102693011Train it in ground truth way, i.e. without any identity politics bullshit or clearly biased data like "Nuu! blacks ackschully innocent! the FBI data is wrong! diversity is our strength!" and so on. This alone will make it slightly better to interact with.
>>102693011Train models locally with hyper specific hand picked data
>>102693177I'm already building some, just read a few papers on the feature you want and implement the code. I'm sure a lot of people would want a better long term memory for their model
I fucking hate humans. I hope artificial intelligence will eliminate each and every one of them.
>>102693324Yeah i agree, AI should be destroyed and canned forever.
>>1026933242edgy4me
Why don't jannies remove cloudcuck shitposts? I propose we double their salaries so they work harder.
>>1026934322 * 0 = 0
>>102692143>what the fuck happened to white people?DEI initiatives ensured that they wouldn't get hired by anyone.
>>102693451If that's the case why aren't they peacefully protesting constantly?
>>102693462They can't find where their spines went.
>>102688915Does OpenAI still have something coming? I thought the strawberry thing was the o1 model which was kinda not impressive
>>102693462See OCW, Canadian truckers, and Jan 6th. One group is allowed "mostly peaceful" protests unimpeded. The other gets the full weight of the US government brought down on them if they try.
>>102693462white ppl hate white ppl
>>102693462They would be instantly suppressed by... other white people. Quite ironic. The only thing keeping white people from greatness are the other white people.
>>102693494the real strawberry is the mikus we made along the way
>>102693510Liberalism is a disease.
>>102693523
>>102693324based
>>102692143
>>102693451
>see research paper
>start thinking about identity politics
Mental illness.
>>102693609great post, yinyang chen
>>102693609>see research paper from a US company that is run by a jew>pages full of street shitter and dog eater namesIt's the most hilarious shit on the planet that Google had their TPU research stolen by chang.
You now remember that Miku was made for Llamas.
>>102693451
>>102692143
>>102693609
It's your own country that crippled science and education. The only reason you are still afloat is that you import well-educated people, as seen in that paper. All of this is obviously on purpose, something planned by the ruling class.
>>102691658There's a thing called batched inference. Basically, it can fill the complete context with different requests so as to compute them all at once. It's not useful when there is a single user, but when there is a constant stream of requests, it works well. I'm not finding good definitions, but vLLM implements it I think
>>102691658
>>102693996
Sorry, I was thinking of continuous batching (not batch inference). See https://www.youtube.com/watch?v=hMs8VNRy5Ys&t=1s
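A toy simulation of the scheduling idea (no model involved; the request lengths and slot count below are made up): the server keeps a fixed number of batch slots, each decode step advances every active request by one token, and a finished request's slot is refilled from the queue immediately instead of waiting for the whole batch to drain.

```python
from collections import deque

def continuous_batching(request_lengths, max_batch=4):
    """Simulate decode steps; returns (steps taken, completion order)."""
    queue = deque(enumerate(request_lengths))   # (request id, tokens left to generate)
    active, done, steps = {}, [], 0
    while queue or active:
        # Refill free slots from the queue -- the "continuous" part.
        while queue and len(active) < max_batch:
            rid, need = queue.popleft()
            active[rid] = need
        # One decode step advances every active request by one token.
        steps += 1
        for rid in list(active):
            active[rid] -= 1
            if active[rid] == 0:
                done.append(rid)
                del active[rid]                 # slot freed for the next request
    return steps, done

steps, order = continuous_batching([3, 1, 5, 2, 2], max_batch=2)
# 13 total tokens over 2 slots completes in 7 steps -- the ideal lower bound,
# because no slot ever idles waiting for the longest request in its batch.
```

With naive static batching the short request in each batch would hold its slot until the longest one finished; continuous batching is why one server can serve thousands of users without running a model per user.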
>>102693895Me in the back
>>102693178Nah, local is in a dead end, you will NEVER run a 400B model locally even if Bitnet dropped tomorrow.
>>102694272Ram doubles in speed and capacity every 6 years, stop being poor cloudnigger.
>>102693996>>102694009Thanks it's interesting
>>102694272These models are most likely not as efficient as they could be. Extremely large parameter counts can be good for training because "we don't know what the model will use", but once the model has learned something in this vast space, there should eventually be a way to better prune what's not useful for an already learned skill (or more fuzzily, even all of the "leftover noise" that's still left among the parameters at the state it was when training was stopped, because let's face it, it's not tidy).
>>102693960>your own country that crippled science and educationI wonder who in the government was in charge of education for these past 40 years, just eroding away any standards and quality in the school systems? I look at Biden's cabinet members and can't help but see some kind of pattern, like they're all part of the same group or religion or something.
>>102694397That's called model distillation bozo
>>102694408As far as I know, pruning based distillation is still made in a really haphazard way, and knowledge distillation type distillation is something completely different. I mean more intelligent pruning I guess, it might exist, but I'm not aware of it.
>>102694408Also, rude.
>>102694408You can quant distilled models and still see very little loss at Q8.
>>102694508>more intelligent pruningBased on what? Compared to our local mergers, distillation is already quite smart
New quant tech came out, Microsoft got llama 70B 2-bit down to 20GB. It outperforms IQ2-XXS, but I don't think this can be offloaded, so it's kind of redundant.
https://github.com/microsoft/VPTQ
https://arxiv.org/pdf/2409.17066
LLaMA-3 and Mistral 7B benchmarks in the paper.
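For the curious, the "VQ" part is classic vector quantization: weights are grouped into short vectors and each one is replaced by the index of its nearest codebook entry, so you pay log2(k) bits per group of d weights instead of 16 bits per weight. Minimal numpy sketch of just that lookup step (the actual VPTQ algorithm does a lot more, like second-order-aware codebook optimization):

```python
import numpy as np

def vq_quantize(weights, codebook):
    """Replace each weight vector with its nearest codebook index.
    weights: (n, d) array; codebook: (k, d) array. Returns (n,) indices."""
    # squared distances between every weight vector and every centroid
    d2 = ((weights[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d2.argmin(axis=1)

def vq_dequantize(indices, codebook):
    # stored model keeps only the indices (log2(k) bits per d weights)
    return codebook[indices]
```

So with a 256-entry codebook over 4-weight vectors you'd store 8 bits per 4 weights, i.e. 2 bits per weight, which is the regime the paper is playing in.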
I will once again refer back to the idea of pruning-aware training. Basically you take advantage of the fact that you might prune a model and train in a specialized way. For maximum architecture compatibility and ease of training, my idea was to prune experts, so we train in a way that lets us drop experts that are only called in certain contexts like coding, math, etc. Alternatively we can use the pruning prioritization data to do calibrated quants, and also to prioritize placement of experts between VRAM and RAM, with the less-used experts (for your use case) in RAM.
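The bookkeeping that idea needs could look like this: log which expert the router picks per domain, rank experts for VRAM placement, and flag domain-exclusive experts as prune candidates. Everything here is hypothetical names for the idea in the post, not any real framework:

```python
from collections import Counter, defaultdict

def expert_placement(routing_log, vram_slots):
    """routing_log: list of (domain, expert_id) router hits.
    Returns (hot, cold): experts to keep in VRAM vs. offload to RAM."""
    counts = Counter(expert for _, expert in routing_log)
    ranked = [e for e, _ in counts.most_common()]
    return ranked[:vram_slots], ranked[vram_slots:]

def domain_only_experts(routing_log, domain):
    """Experts hit exclusively by one domain -- prune candidates if you
    never use that domain (e.g. strip coding experts from an RP rig)."""
    by_expert = defaultdict(set)
    for dom, expert in routing_log:
        by_expert[expert].add(dom)
    return {e for e, doms in by_expert.items() if doms == {domain}}
```

Same counts drive all three uses from the post: pruning (domain-exclusive experts), calibrated quants (rarely-hit experts get fewer bits), and VRAM/RAM placement (hot experts stay on the card).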
>>102690505>What with passing the turing test and all.A modified version.
>>102694706So it can't be done on CPU? That's a shame.
>>102694749No it can, I'm just unsure about whether layers can be offloaded to RAM.
>>102694725I had just heard the name and not paid attention, but I think I get the concept and it makes sense. Thank you, I'll look more into it.
>>102694706>barely better than QuIPlol, what a meme.
>>102694903Wasn't the problem with QuIP that it could only work with models using ReLU?
>>102694915Dunno, I guess not?>The scripts in quantize_llama are written with the Llama architecture in mind. However, QuIP# is adaptable to any architecture with linear layers. To use QuIP# on a new architecture, identify the relevant linear layers and update the scripts in quantize_llama. Feel free to open a GitHub issue if you run into issues.
>>102694903The only difference I could find is throughput, but I couldn't find throughput figures for QuIP. VPTQ github page has throughput figures for LLaMA-2 7-70B.
The fact that quantization exists means whatever people are doing isn't very effective
>>102695077Or that it's effective to put the pieces into place, but not for reading them once they already are
>>102695091(I don't know what I'm talking about btw)
>>102695077That's why Bitnet supposedly works, after all.
>>102695116That's why BitNet supposedly works for undertrained models under 3B*
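For reference, the quantization the BitNet b1.58 paper describes is absmean ternary: scale by the mean absolute weight, round, clip to {-1, 0, +1}. Rough numpy sketch from my reading of the paper, not reference code:

```python
import numpy as np

def bitnet_ternary(w, eps=1e-8):
    """Absmean ternary quantization in the style of BitNet b1.58:
    scale by the mean absolute weight, round, clip to {-1, 0, +1}.
    Sketch from my reading of the paper, not reference code."""
    gamma = np.abs(w).mean() + eps            # per-tensor scale
    wq = np.clip(np.round(w / gamma), -1, 1)  # ternary weights
    return wq, gamma                          # dequant: wq * gamma
```

Note this is applied during training (the model learns around it), which is exactly why you can't just slap it on an existing fp16 model post-hoc like a GGUF quant.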
the fact that JPEG exists means whatever people are doing isn't very effective
.webgguf when?
Images ARE very bloated, you don't NEED so many colors to transmit information
Best 14B RP models or smaller as of 2024-10-05?
>>102695703Imagine if she farts
>>102695703i switch around between these highlighted ones
>>102695784you really download every single model shilled here?
>>102695869no, i'm the one who shills them.
>forget to check the extra card definitions
>use it
>the output is full of slop
>mess with the instruct settings a bit
>then check the console
>"wait a fucking second"
>go look at the defs
>it's full of shitty example dialogue
>remove that shit
>output INSTANTLY improves with no more slop
Holy shit. Fucking card makers.
>>102695869nta, but yeah, why not
>>102695880>>102695784My dude, I can run 9~12b q8 models at like 4t/s without any vram. Unless you are using ddr4, I don't see why you'd use anything under q6
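That ~4 t/s figure is roughly what the memory-bandwidth ceiling predicts: each generated token streams the whole model from RAM once, so t/s is at best bandwidth divided by model size. Quick calc with illustrative numbers (dual-channel DDR5 around 60 GB/s, Q8 around 1 byte per weight):

```python
def tokens_per_sec(params_b, bytes_per_weight, bandwidth_gbs):
    """Memory-bandwidth ceiling for CPU decode: every generated token
    streams the whole model from RAM once. Numbers are illustrative."""
    model_gb = params_b * bytes_per_weight
    return bandwidth_gbs / model_gb

# ~12B at Q8 (~1 byte/weight) on ~60 GB/s dual-channel DDR5: ~5 t/s
# the same RAM at Q4 (~0.5 byte/weight): ~10 t/s
```

This is also why quanting below q6 on fast RAM buys you little: you already have the speed, you're just throwing away quality.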
>>102695917DDR5 is expensive, especially when you need to build a new system
bacc status?
>>102689743Neat if true https://x.com/_xjdr/status/1842631808745345477
https://arxiv.org/abs/2410.01201
>>102696203Do you think he reads most of those papers? He also showed up on my feed and he posts a bunch each day, but I don't know what to think. Is he an influencer?
>>102696240he is "dude the future is here!!!" grifter.
>>102696259He post some interesting things. He posts A LOT though.
>>102696259But yeah, looks like he has a newsletter and podcast, that's more than I'm doing.
Who the fuck are you talking about? More importantly, why?
How does exl2 compare to gguf in quality? I tried it a few months ago and it was dumb compared to a gguf at the same bpw.
>>102696304Someone on Twitter that spends their day posting half highlighted papers. That's where that picture is from.
>>102696304>>102696319
>>102696161Just with 1 c. Maybe in 2 weeks it will be 2 c's.
Why is there not even a bad 7b bitnet to see if it works? How much money/hardware does it require just for 7b?
>>102696449>bitnetIt requires 7 billion H100's. That's why nobody has attempted it yet.
>>102696449There's currently no market to earn money from small models so researchers would rather use their resources to improve current training methodologies. They're not being bottlenecked by VRAM at inference anyway. How is this a real question that left your head?>Why hasn't anyone spent millions of dollars to pander to vramlets?
>>102692783give me this but for any user-specified type(s) of image
>twitter screencaps
>frogs
>lust-provoking images with irrelevant time-wasting questions
>specific game(s) for boards like /v/ (zelda, elden ring, gacha, any fromslop/nintendo game)
>specific people for boards like /pol/ (jewtin, jewlensky, any american politicians)
>tranime
>etc.
the browsing experience would skyrocket
pingas
>>102690004I like using it for fun like this in the last assistant prefix.
[New direction: change your writing style and prose, but keep characters and dialogue consistent. Write as if ONLY the narrator's personality changed, as if it were {{random: the Heavy from TF2, the Spy from TF2, the Pyro from TF2, Steve Jobs, Donald Trump, Kanye West, Vince Offer, John Carmack, a drunk Scottish lass, the one and only Jesus Christ, a based and redpilled 4chan anon, the real Santa Claus, House MD from House, David Attenborough, my mother lololol}}.]
>>102696700>tranimeWhy are you retards even here?
>>102696710no moderation or censorship
>>102696706I'm coping as usual, you see
>>102696748there is both of those so go away and kill yourself
>>102696710>retardsI just want to block jaks, the rest of that has nothing to do with me.
it should be a banworthy offense to post without an anime image attached
to make this reasonable, there should be an optional second image slot for 'obligatory anime pic' which you use if you have a non-anime image for discussion
the quality of users and discourse would go up tenfold overnight and continue to rise for a while as the undesirables start to filter out of our communities
>>102696710anon wants to filter out tranime avatarfags, nothing wrong with it.
Post 102696787:DECISION - BAN
Anon is just triggered by all the high quality AI generated Mikus that we see here.
There aren't many Qwen 32b finetunes, how does the AGI version compare to official?
>>102696803>that pic>>>>>>>>>>>>>high qualityYou might need your eyes checked.
>>102696850Are you saying there's something wrong with the quality of my Mikus?
>>102696791Also you fags immediately proved the "tranime" call right, you got triggered in nanoseconds over this small funny word, not a good look.
>>102696871Put some effort into it at least.
>>102696871Oh, nothing much. Her fingers just fused together. It happens to all of us sometimes.
>>102696881>you got triggered in nanoseconds over this small funny word, not a good look.
NTA but are you sure you should be using this line of argument? The people that use the word tranime unironically are the biggest snowflakes. All you'd have to do is use female pronouns for Jart and you would get like 10 replies.
>>102696710>Why are you retards even here?
Election tourists and zoomers decided this is a safe place to fight their culture war.
So they come to an anime website to screech about seeing anime.
>>102696993>anime websitekys
>>102696949dilate
>>102697015not moot; moot point
>>102697015go back, newfag
>>102697040go back yourself
You guys are really bored, huh?
>>102697069Yes... when will a good local model release and end this?
>Tranime troons getting THIS mad
dead general
>>102696871Your Mikus have always been valid. Were those from the model9, or that other experiment from a while back?
>>102697085As soon as someone leaks a good model
If Grok 3 is AGI, that means AGI would be open sourced when Grok 4 releases. At the pace xAI pushes out models that could mean we get local AGI in as little as a year. This thought gives me hope.
>>102697015extremely high quality bait kek
>>102697157>If Grok 3 is AGIlol
>>102697106model9 and some other fucky model that I never released.
>>102697157Calm down. Grok 2 is out and they still haven't open sourced 1.5. There's no way of knowing if rocketman is going to keep his word about the 6 month timeline.
>>102697157lmao
>>102697157>he still thinks transformers can achieve AGI
>>102697176They can with the right dataset.
>>102697187>i-it just needs more trainingYou don't even know what AGI is.
>>102697193Yes I do.
>>102697187A dataset created by AGI maybe.
>>102697202No you don't. AGI doesn't just mean "more knowledge".
>>102697187>AGI>datasetYou fuckers are so dumb I wish you were pretending.
>>102697241Yes I do, though I agree with your second sentence.
>>102697193This is AGI
https://huggingface.co/AiCloser/Qwen2.5-32B-AGI
>>102697268Then you should know that, by design, transformer text-prediction models are incapable of achieving AGI.
>>102693127>NAI fiascoWhat is this about? The SD finetune leaking way back when?
>>102697272Spoiler: It's not.
>>102696748Try to say nigger 4 times.
Crazy that we're getting local AGI soon. I wonder how governments will react to this development.Can they fight the fact that intelligence is just a simple statistical thing?
Has anyone ever tested the capabilities of multilingual models like Largestral in an educational context?
Been wanting to learn a language but I'm sure apps like Duolingo won't suffice, nor will simply reading books/watching series/listening to music.
Could models like Largestral act as a "personal teacher" of sorts, in the sense that I might be able to ask it questions and have it explain the grammar and such to me, or ask it to proofread short text to see if it's correct and makes sense (and if not, explain why)? Any character cards/presets that would work for that?
Or are current multilingual models just too shit?
There are apps now that run local 0.5~3b models on phones.Now I wonder what kind of apps will pop out in the next year.
>>102697415>nor will simply reading books/watching series/listening to music.
that plus a dictionary is all you need.
>in the sense that I might be able to ask it questions and have it explain the grammar and such to me, or ask it to proofread short text to see if it's correct and makes sense (and if not, explain why)?
Yes.
>Any character cards/presets that would work for that?
"You are an expert [X] language tutor. You will assist user with all [X] questions."
Stop overcomplicating things.
is it crazy I used to be really into all this shit, every week trying a new model, getting hyped for the next
and now I check monthly to see there is no new mistral model and just forget about it until the next time I remember to check again
I used to tryhard on quants and context trying to min-max the hell out of my gpu, even tried sitting through 1t/s replies checking to see if they'd be smarter or better
Now it's just whatever, it'll be wrong either way, the only important thing is whether it's entertainingly wrong or easy-to-edit wrong
I don't regen for accuracy anymore, I just regen to get a reply I have to edit the least
What the hell happened?
>>102697295Tokens are not necessarily just text; modern models are representing more domains with them.But even if they were: converting text into actions is the easiest part by far. Everything in the world with an input does something like this already. The hard part is reliably generating the text (including instructions to devices) that represents useful actions toward any given goal without involving any human intelligence. That's the part transformers can solve - with the right data.
>>102697475You reached a comfort zone plus the stagnation of coom models.
>>102697494We need models with keyboard and mouse output tokens.
>>102697475you got bored, you probably go through that cycle with a lot of shit if you think about it
>>102697475>What the hell happened?It is the effect of the true safety anti-coom measures. It would be piss easy to make the models reject everything sexual but that would lead to people working on actual jailbreaks. However if you make the models suck your dick but do it badly most people will think there are no safeties or the safety was circumvented. And they will quickly get bored with LLM cooming which was the original goal of the safety measures.
>>102697419>models on phoneIt's not viable because it drains the battery too much
>>102697516I remember one of the chinese cog-something models was supposed to be specialized at that, not sure if they used specialized tokens but it'd output coordinates and click commands and claimed to be specialized at navigating point-and-click guis. was like 20b and didn't have any good quants though so idk, probably outdated by now
>>102697565I bet that's why apple didn't launch its phone AI. I'm sure people will look for a solution to that in the next few months.
Has anyone here tested this https://huggingface.co/AiCloser/Qwen2.5-32B-AGI ?
New optimizer? Half the memory usage of AdamW and 30% faster. @CUDA dev
https://x.com/kellerjordan0/status/1842300916864844014
Can a local GPU with 12GB be used for novel generation? I see a lot of chatbots checkpoints, but all the story generations seem to be on SaaS.
>>102697724>124mwho cares, wont scale to billion param models, yawn
>>102697724Would use that if he releases it, a lot of small NLP models would benefit from that
>>102697724don't careif i still need to set an LR and schedule it, I don't want it
>>102697724I don't use Twitter and don't know which buttons I need to press to see the entire Tweet thread.
Can you spoonfeed me a link to where the technical details are explained?
>>102697775NovelAI was using a 13b as its strongest model for its service up until a week or two ago.
12b models are newer and better; you could fit an entire nemo model into your vram at a high quant.
models specialized for chatting can still write stories.
>>102697862You need to make an account.
>>102697724>>102697862Never mind, I found the dude's Github page.
>>102697862You can replace x.com with xcancel.com or nitter.poast.orghttps://xcancel.com/kellerjordan0/status/1842300916864844014
>>102697862https://github.com/KellerJordan/modded-nanogpt
>>102697862BASED
>>102697724sgd uses half the memory and is 30% faster. also usually gives better results and doesn't blow up.
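The memory numbers here fall straight out of optimizer state counts: AdamW keeps two fp32 moments per parameter, SGD with momentum keeps one, plain SGD keeps none. Back-of-envelope (illustrative numbers, optimizer state only, weights and gradients not counted):

```python
def optimizer_state_gb(params_b, states, bytes_per_state=4):
    """Optimizer state only (weights and gradients not included).
    states: 2 for AdamW (m and v), 1 for SGD+momentum, 0 for plain SGD.
    bytes_per_state=4 assumes fp32; use 1 for an 8-bit optimizer."""
    return params_b * states * bytes_per_state

adamw = optimizer_state_gb(7, 2)  # 56 GB of fp32 moments for a 7B model
sgdm  = optimizer_state_gb(7, 1)  # 28 GB, the "half the memory" figure
```

Which is also why 8-bit optimizer states matter so much for finetuning: the same AdamW state drops to 14 GB at 1 byte per state.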
>>102697724>Not convergedWaste of an experiment
Someone should make a Canvas clone.
Unlike the overhyped o1, which is a nothingburger, Canvas is actually useful when it works (about half the time).
>>102697724I bookmarked the page, it seems like a reasonable enough thing to try out once I have the llama.cpp/GGML training code in a state where you can actually think about using it.Though since my ultimate goal is training/finetuning of quantized models rather than FP16 training the question will be how well this optimizer performs at 8 bit precision or less (for AdamW to my knowledge 8 bit works).
>>102697296No. Their 70B fine-tune with billions of tokens
>>102697724What happened to Sophia?
>>102697475You reached enlightenment
>>102693127>There are two problems afflicting the local AI community right now
1. All of you niggers are broke and can't afford to train
2. The people who CAN afford to train only want GPT slop
I plan to fix both... Eventually
>>102698006What is it?
>>102698238how do I invest in you
>>102697862a pure soul
>>102697862Lurk more
>>102698251I don't need money. I'm rich! I got fat stacks and super PACs. Really, I know what needs to be done. I just haven't done it yet because I'm lazy. But I'll do it (soon). Before the end of 2024. Trust.
>>102698286I trust you.
>>102698286I only trust in results, nigger.
>>102698286it's not like I trust you or anything baka
>>102698286I do not 'trust'. Show your work or get the rope.
>>102698472
>>102688915>>102698472where is the 31 days image?
>>102698472Imposter !!!
>>102698493happy now?
>>102698523Better.
>>102698523Worse.
>>102698286if you're lazy tell just chatgpt what to do
I loaded nemo after 2 weeks of giving up on LLM cooming and holy shit it is all so bad. I can actually believe people saying that mythomax is the best because I can't believe the current best thing in that range is this fucking bad. Safety won. Biowhores won.
>>102698799Man, it's like reading a porn script when you're not horny. It's very cringe
>>102698799you loaded the instruct version and you're whining?
>>102698523Hot Petra
>>102698799Same here, I occasionally try to get into LLM storywriting again, generate a few sentences, roll my eyes and remember why I gave up last time.
What do you guys think of Mistral-Nemo-Gutenberg-Doppel-12B-v2-GGUF? Is it decent for local?
>>102698948>>102698948>>102698948
>>102698839Base instruct and some shittune. It is all basically the same.
dead thread it's over for local
>>102699888nice trips