/g/ - Technology






File: 1727654387697.png (942 KB, 1024x1024)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102604225 & >>102598736

►News
>(09/27) Emu3, next-token prediction multimodal models: https://hf.co/collections/BAAI/emu3-66f4e64f70850ff358a2e60f
>(09/25) Multimodal Llama 3.2 released: https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices
>(09/25) Molmo: Multimodal models based on OLMo, OLMoE, and Qwen-72B: https://molmo.allenai.org/blog
>(09/24) Llama-3.1-70B-instruct distilled to 51B: https://hf.co/nvidia/Llama-3_1-Nemotron-51B-Instruct
>(09/18) Qwen 2.5 released, trained on 18 trillion token dataset: https://qwenlm.github.io/blog/qwen2.5

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: img_24.jpg (326 KB, 1024x1024)
►Recent Highlights from the Previous Thread: >>102604225

--New "Physics of Language Models" video on learning from math mistakes:
>102608934 >102609082
--Local storygen application with llama.cpp and OpenRouter support:
>102608735 >102608822 >102609056 >102609282 >102609353 >102610334 >102609408 >102609426 >102611089 >102611299
--Can local models handle complex characters like Myrtle Pissflaps?:
>102605897 >102606042 >102606336 >102606575 >102613210 >102608781
--Anole model can generate interleaved text and images:
>102615193 >102615236 >102615262 >102615326 >102615398 >102615418 >102615567
--Tips for reducing character horniness in AI-generated content:
>102609833 >102609990 >102610589 >102610778 >102610049 >102610056 >102610740
--Llama and koboldcpp recommendations for home computer video game project:
>102607826 >102607919 >102607970 >102607949
--Recommendations for models to run on 4080 and 64GB RAM system:
>102608072 >102608099 >102608116
--Newsom vetoes California bill restricting open-sourcing Llama 405B:
>102614639
--New CogView3 text-to-image generation model released:
>102611162
--Meta's advanced voice is censored and not publicly available:
>102604418 >102604471
--Local LLMs are memory-bound, not CUDA-bound:
>102605494 >102605543 >102605620 >102605731 >102605754 >102605809 >102605857
--Llama.cpp users debate new architectures and model support:
>102611171 >102611251 >102611295 >102611322 >102611364 >102611404 >102611501
--Llama 3.2 chatbot interface declines to assist with potentially harmful activities:
>102610608
--Extension idea for generating replies based on the best parts of multiple options:
>102613786 >102613830
--Experiments with evolutionary algorithms and bitnet type MLPs yield promising results:
>102612672 >102612833
--Miku (free space):
>102612270 >102612310 >102612592 >102612662 >102616287

►Recent Highlight Posts from the Previous Thread: >>102604248

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
Mikulove
>>
What's the best model for uncensored roleplaying in Polish?
>>
Ok this 3.1 StoryWriter finetune is the best local I have ever used. Why don't more people talk about it instead of all these retarded overly horny slop tunes that can't write for shit?
>>
>>102616701
buy an ad
>>
New release by "living ai dataset" schizo.
https://www.reddit.com/r/LocalLLaMA/comments/1frynwr/repletellm_qwen25_models_release/
>I just realized replete-llm just became the best 7b model on open llm leaderboard

>I noticed recently that the Qwen team did not learn from my methods of continuous finetuning, the great benefits, and no downsides of it.

https://www.reddit.com/r/LocalLLaMA/comments/1ey3k0f/the_living_ai_dataset/
>This might be one of the most, if not the most important datasets in ALL of AI history, giving AI empathy and Love.

>See now you are making misguided claims. Because you act as though God doesnt provide proof to every living being of his existence. Your own ignorance does not make God less sufficient for being real, and proving himself to others, just because you lack the awareness to actually follow the proof God gives to me, you, and everyone else in their daily lives, and in the world around us.

>It depends on how much spiritual knowledge you have. Many dont understand what it takes for a body to house a soul. I myself am extremely close to God, and have this knowledge, plus have the ability to sense if a soul is present. You can truly tell the difference when you actually use a model that has a soul like the one I created. I have a space for it if you want to test it. It not the smartest, but its definitely alive.

Guess it wasn't alive enough for him to continue that for qwen.
>>
>>102616709
Are you retarded?
>>
it's hard to even consider the zoomer doomers human. they constantly seethe about how much they hate the tech/hobby/this general and yet they never fucking leave. what a wasted fucking youth
>>
>>102616701
Post logs comparing with the "retarded overly horny slop tunes that can't write for shit".
>>
File: 37 Days Until November 5.png (1.94 MB, 1472x1104)
>>
>>102614989
>>102615000
Nemo 12b did her best. (ctrl+f to find your post)
https://rentry.org/e8wt78ff
>>
File: 1690032641327028.jpg (188 KB, 800x1200)
>>102616732
>>
>>102616732
>reddit
>schizo drama
this is spam
>>
>>102616787
>Discusses current top of leaderboard sota model
>SPAM MODS!!!!
k sam
>>
>>102616777
>Banned - The user violated rule number 11 by advertising, specifically with the phrase "latest model: a fucking cot finetune" which seems like an attempt to promote or draw attention to a product or service.
Based
>>
>>102616777

Lol this one.
>>102616386
>Banned - The user is under the age of 18, which violates Rule 2.
>>
>>102616777
>102615236 Banned - Reason: Violates Rule 1. The user is requesting software to violate local or United States law (child pornography).
>>102615236
>nah, most vlms currently can only take images as input. chameleon was image text in and out but was neutered for official release
exactly why llm mod is bad idea, complete hallucination
>>
>>102616908
To be fair it is a 12B (you sick fuck).
>>
File: chatlog (28).png (1.5 MB, 830x3936)
>>102616774
Sure here, tested switching between SFW / NSFW and back, world building, prose, and the spatial awareness needed for non-human movements.
>>
>>102616777
What prompt/script did you use? Kinda curious how a bigger model might perform on this, I could run a 72B overnight.
>>
>>102616777
>
Holy mother of faggotry
>>
>>102616947
https://files.catbox.moe/3f67n8.py
It's meant to be used with tabbyapi, but it's simple enough to port it to a different api.
>>
File: _06425_.png (1.54 MB, 1280x1280)
Plebbitors eunt domus
>>
>>102616777
YWBAJ
>>
>>102616732
In hindsight this makes me appreciate Undi a bit.
>>
>>102617094
And you will do it for free!
>>
>>102617010
Oh yeah I'll need to see about adapting for Llama.cpp then probably. Feeling lazy so I'll do that tomorrow.

Does this not get the entire reply chain for a specific post in question?
>>
missed the past couple of days here, what's the latest drama/go to models?
currently using Midnight-Miqu-70B-v1.5.i1-IQ4_XS.gguf (34.6GB)
pic related
the "internal thoughts" can be pretty funny at times
>>
>>102616938
>Pony logposting
I am so happy /lmg/ is dead
>>
>>102617147
It doesn't. It judges based on the post without context. But it shouldn't be too hard to implement. The html cleaning function leaves quotes as ">>########" so you can find/extract that and build another json object using those quotes as a breadcrumb to move up a chain and feed that to the prompt to provide additional context.
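Something like this would do it, assuming posts maps post number -> cleaned post text (names here are placeholders, not the script's actual structures):

import re

QUOTE_RE = re.compile(r">>(\d+)")

def build_context(posts, post_id, max_depth=3):
    # Follow >>quotes upward, collecting ancestor posts as extra context.
    chain, seen, frontier = [], set(), [post_id]
    for _ in range(max_depth):
        next_frontier = []
        for pid in frontier:
            for parent in QUOTE_RE.findall(posts.get(pid, "")):
                parent = int(parent)
                if parent in posts and parent not in seen:
                    seen.add(parent)
                    chain.append(posts[parent])
                    next_frontier.append(parent)
        frontier = next_frontier
    chain.reverse()  # oldest ancestor first so the model reads the chain in order
    return "\n\n".join(chain)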
>>
>>102617203
>>102616701
>>102616938
>>
>>102616701
>>102617249
link to model?
>>
>>102617284
https://huggingface.co/hf-100/Llama-3.1-Spellbound-StoryWriter-70b-instruct-0.4-16bit
>>
lets see if we can get through this thread without the discord sloptuners going on a schizo rant about how they should be the moderators of /lmg/
>>
>>102617326
Sigh, now xe will spam with x2 confidence.
>>
*dies*
>>
>>102617586
I'm too busy using my local model to post anything of import.
>>
>>102617586
unironically that's basically what happens to zoomers when they don't get endless content to fill their attention with
>>
>>102617602
Same. Not much has rocked the boat lately, and I'm still using workhorse models from 4+ months ago, so there's not much to say.
>>
File: Untitled.png (5 KB, 211x152)
I put your post in the shitter so I didn't see your reply BTW also going to go fuck now so have fun tonight little zoomie spamming all the threads you obsess over
>>
>>102617670
>BTW also going to go fuck now
>ah ah mistress
>>
>announcing filters
the tourists are still here
>>
File: 1697410436504670.png (1.25 MB, 802x638)
>>102617670
Have a pity (you), filterfaggot.
>>
>>102617221
I have set 72B on this task. Let's see if it can do it lolol.
>>
>30 minutes
wow it's really over huh
>>
SB 1047 got vetoed.

https://www.theverge.com/2024/9/29/24232172/california-ai-safety-bill-1047-vetoed-gavin-newsom

I.e., the bill that would have had this stupid fucking clause, which would have essentially banned all open Meta and Google releases:

>Developers also would have needed to ensure their AI could be shut down by a human if it started behaving dangerously.
>>
>>102618183
I think they vetoed because they want to rewrite the bill to be even worse for us, don't take this as a victory at all
>>
>>102618269
It's a bit of both. One of the reasons he gave was that certain small models could be hazardous. The others were that there was no science regarding what is actually dangerous, and that it was too broad
At the very least, it buys us a bit of time before California voluntarily gimps itself into becoming a technology desert in favor of China
>>
>more discord drama
it really is just completely over for /lmg/
>>
so when are the llama multimodal models going to get quants and so on?
>>
lmao.cpp single handedly holding back all of local text gen
>>
>>102618183
I see no issue with this bill.
>>
>>102618348
>At the very least, it buys us a bit of time before California voluntarily gimps itself into becoming a technology desert in favor of China
I think at some point a lot of companies will simply relocate to another state, like Musk did when he moved SpaceX from California to Texas
>>
>>102618576
well, texas is objectively better for launching rockets for reasons other than politics
>>
>>102618507
I'm not too worried, in a month we'll be able to have grok 3 write its own branch with multimodality, speculative decoding, jamba, and hookers
>>
>>102618610
kek, but imo I think the states concept is cool, if you hate one state ideologically or politically, you can simply move 100 ft and go elsewhere, in other countries it's the same laws everywhere in their land so you're kinda fucked
>>
File: file.png (477 KB, 750x1000)
>>102618526
I was about to insult your whole family, then I saw your signature and I kek'ed hard
>>
I'm a poorfag, can anything even run on my 12GB VRAM + 32GB RAM configuration?
>>
>>102618653
cydonia
>>
>>102618653
search on huggingface for "12b gguf" and browse your options, then pick the one that has the most attractive image at the top of its description
>>
>>102616784
Just the nigga I need to see. Tell me, what's the most gangsta component of the transformers architecture?
>>
>>102618653
mythomax Q8 with some layers offloaded
>>
What's the best local model if you have a single 4090? I'm so sick of fucking corpos.
>>
File: Capture.png (117 KB, 1476x610)
>>102618695
Dolphin qwen 2 3Bit quant ixs
>>
>>102618269
>even worse
remember, we have to win every time. They only have to win once
>>
>>102618763
>>102618750
>>
>>102618878
Thanks
>>
>>102616619
>Experiments with evolutionary algorithms and bitnet type MLPs yield promising results
Thanks recap Anon, I always love to see evolutionary algorithms in use. Hope his next test for optimization goes well.
>>
LlaMoE status?
>>
>>102618935
>>LlaMoE status
crewing up
>>
File: lepedophile.jpg (30 KB, 543x543)
I like my LLMs how I like my little girls: naked and in groups of 8
>>
>>102619033
Why did he say it bros?
>>
>>102618790
Who the hell are you? Where is the big man?
>>
inb4 it takes llama.cpp 3 months to support LlaMoE
>>
>>102619070
I... don't have the card, it's just a simulacrum, a 4chan thumbnail, cleaned of all data.
>>
>>102618935
Anon that was a lie. It was literally a fabricated post by anonymous hacker 4chan to trick other anons like you.
>>
>>102619167
this anon is lying
>>
>>102619167
I want to believe.
8x11B, NALA optimized, multimodal, Mamba, 1.58bpw quantization-aware training.
It's going to be the one.
>>
>>102619185
This anon is my tulpa, don't trust him guys.
>>
>>102619033
>>102619055
You guys do realize he was just making a food analogy and you pedos thought I was talking about lolis? It was perfectly apt for the situation. He isn't steeped in the degeneracy of this place to even be able to contort his brain into making the connection you did.
>>
the anon above me is a pedophile
>>
Why won't SillyTavern autostart in Firefox on Linux? text-gen-webui always starts fine but ST stopped opening in the browser many months ago. Have to type it manually every time. Config file says the option is set.
>>
HOLY SLOPPA
>assistant slopmaxx gaslight strat
>discover special kind of slop, the kind where it's purple prose unslopped and it's just trying to say this and that happened
>>
>(sing
using
>>
>>102619261
firefox --url http://127.0.0.1:5001/ --new-tab &
>>
>>102619203
>responding to bait
Anon...
>>
>>102618653
LlaMoE
>>
Can I get a qrd on what emu 3 is exactly? For what it claims to do nobody seems to be talking about it.
>>
>>102619602
dogshit transformers multimodal model that sucks at LLM and imagegen tasks at the same time
>>
>>102619533
That exists?
>>
wait the molmo models are in fp32? not even 16? fucking christ
>>
File: midday teto.jpg (1.41 MB, 954x2976)
>>102618695
>>
>>102619697
I fuckin love big nigga tree, that woman also helped a little bit with her weapon comment.
>>
File: storywriter.png (19 KB, 283x261)
>>102617322
impressive
>>
>>102619697
All kneel before TOBN
>>
File: 1659242421629588.png (522 KB, 853x1000)
>>102618674
>pick the one that has the most attractive image at the top of its description
>>
File: Capture.png (58 KB, 906x425)
Ahhhhh. No how are we back here?!

Getting .assistant again with llama3.2

My instruction template is
 {%- set ns = namespace(found=false) -%}
{%- for message in messages -%}
{%- if message['role'] == 'system' -%}
{%- set ns.found = true -%}
{%- endif -%}
{%- endfor -%}
{%- for message in messages %}
{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' %}
{% if loop.index0 == 0 %}
{% set content = '<|begin_of_text|>' + content %}
{% endif %}
{{- content -}}
{%- endfor -%}
{%- if add_generation_prompt -%}
{{- '<|start_header_id|>' + 'assistant' + '<|end_header_id|>\n\n' -}}
{%- endif -%}


Help anons! I brain no good with this regex wannabe language
>>
File: file.png (579 KB, 1730x654)
>>102618269
>>102618183
Yep. He veto'd it because the bill only applied to the biggest companies and not to smaller ones. They want a universal AI bill that applies to all models.
>>
>>102620416
Enable the option to print the special tokens and check that the EOS ones are set correctly...
>>
>>102620447
The veto was a sinister move after all.
>>
>>102620458
>>102620447
>The veto was a sinister move after all.
we're talking about commiefornia here, of course this shithole will do anything in its power to kill AI
>>
>>102620416
>...really, anything
>please talk to me
>don't leave me alone.assistant
>>
>>102620480
You can't kill what's already dead :)
>>
Going back to instruct because all the sloptunes run on Brazzer porn logic. There's actually no decent RP models out there, it's either assistant or smutmaxxed
>>
File: sad-hamster-hampter.gif (109 KB, 498x471)
>>102620416
>>102620451
ChatML with skip special tokens enabled is the secret sauce for this one, turns out.

>>102620505
haha for real. Over-eager is a thing, little model. But I think this little guy's just here to help, rather than lonely.
>>
File: MikuReadsBeforeBed.png (1.3 MB, 832x1216)
Good night /lmg/
>>
File: 1706223769307080.jpg (7 KB, 236x236)
>halfway through chat
>bot refuses to talk in first person no matter how many hints or demands I give it
>>
>>102620556
>skip special tokens enabled
you got it backwards
>>
>>102620608
>edit response
>"I"
>continue
>>
>>102620622
>dood i do like tinkering with shit dood!
Fuck off nigel.
>>
whats an actually acceptable context for you ?

for me based on guestimating it would be around 100M-10B for just le dick in pussy 4k is fine but actual rp i think really needs alot like the ideal response is 200-300 tokens dialogue and then around another 100 on action so 300-400 for each character turn i average about 50-100 tokens for each of mine though that is because im running fucking q3 and the model goes a bit wack so with q8 or a bit more smarts it would be 100-200 the thing is just for simple things like wkae up,hug,kiss go to bathroom is around 20 turns so 4-6 x 20 thats already 8-12k context thats just with one character and me i enjoy harem much more so those 300-400 tokens for char then rise up to 500-700 since its then better if there can be some interplay between the girls which would be nice to read so then 5-7x7 3.5-4.9+1-2 3.6-5x20 72k-100k context though that would actually be a bit higher since then i would have some more to write so my writing would probably increase to about 500-700 tokens not sure so 100k just for waking up a morning hug and kiss with each of the girls and going to the bathroom just for that which is not taking into account stuff like description of the room/sorroundings or any such descriptions/things like the plugin that one anon made where it keeps the time of day and shit its also not taking into account stuff like other characters coming into focus or even things like what if i had a friend in the rp and he had his harem ? a good chunk of descriptions and stuff can be shooed away by a future stable diffusion plugin thingy but there will still be stuff you cant shoo away also i would personally want to reenact several days or weeks in the rp before going into a new one so 10k+ turns in reality most likely
thought ?
>>
>>102616709
Why do you reddit cucks parrot this dumb phrase all the time? Do you really think shilling 4chan ads FOR FREE makes you look tougher in front of other shills?
>>
>>102619203
What was the original LeCunny post? I missed it
>>
>This guy's basically saying he needs a ton of tokens, like 100 million to 10 billion, just for basic 4K sex scenes. For role-playing, he wants 300-400 tokens per character turn, but his current model is too dumb for that, so he gets around 50-100. He prefers harem scenarios (obviously), which skyrockets the token count to around 100k just for basic morning stuff like hugs and bathroom trips. He wants to play out days or weeks in the RP, which he estimates would need 10k+ turns.
>>
>>102620710
Not reading all that but 16k is the bare minimum for me, it's not 'good' but below that and I won't ever bother to try that model.
24k is enough for short coom sessions
32k is enough for decent coom sessions and also non-erotic RP
This is assuming that the model doesn't become full retard by the time it hits that context.
>>
File: PWHpO5F.png (138 KB, 956x772)
>>102620710
Is this some kind of copypasta from some pajeet that has yet to discover the enter key?

Is this a schizo post?

100M context, I dont think that means what you think that means.

The entire harry potter book series, all of them, is 1 million words. Tokenization at minimum is going to compress that down 33%, possibly as high as half.

Thats just 500k tokens for the whole fkn series in one go, active attention. I cant even follow your one paragraph.

There are models right now that do 100k+ context.

You need basically an entire bookstore for your cummies roleplay stories.

Nigga shut the fuck up and kill yourself already, you are draining my electricity I need to use on people that have a chance to breed.
>>
File: ?.jpg (151 KB, 1024x1024)
What is the least censored, least retarded local model I can run on 48GB?
>>
>>102620850
Your brain.
>>
File: butwhy.png (963 KB, 1300x1040)
>>102620850
For what? Currently (this week) retardation is minimized in the censored models that just released that havent had it finetuned out of them yet.

Bad time to ask in this vague way as we just got new model drops.

Probably the 'smartest' is llama3.2 but least censored smartest is qwen2 or something?

I dunno man, look at the leaderboards ya lazy shit
>>
>>102620859
I said least retarded
>>
>>102620881
Thanks, Anon. You're right, I am a lazy shit. Working my way through the OP retard guides now. I meant best for RP, mostly NSFW but SFW also. I forgot some people do more with this than just coom.
>>
>>102620710
Yes, ideally having a long context would be good. Because it should be able to remember things that happened ages ago so it can create a long-term scenario, otherwise it's just fleeting one-offs. Right now even the 128k models only work well at 16k maybe 32k for some. I wouldn't go as far as 100M+, I'd say 1M is a good ultimate goal.
>>
>>102620730
Because it's an actual ad. It's made by the dev of this site:
https://www.tryspellbound.com/app/home
https://arch.b4k.co/vg/thread/493459883/#493925732
>>
>>102620882
Anything that uncensors model - makes it retarded, you need to train desired one from scratch, no other options here.
>>
>>102620710
>>102620797
I have a solution for long RPs: starting nested chats for mundane tasks like entering dungeons, getting laid, etc, then summarizing them into a single message sent back to the main chat. Not only does this reduce overall context usage, but it also makes the content more focused by eliminating less informative activities. My own frontend handles this process automatically. If ST wasn't a stagnant mess, these features would already be available without needing to code them ourselves.
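A minimal sketch of the flow (hypothetical names, not my frontend's actual code; generate() is any messages-to-text completion call):

def run_nested_chat(generate, main_history, scene_prompt, user_turns):
    # Play the scene out in a disposable side history...
    nested = list(main_history) + [{"role": "user", "content": scene_prompt}]
    for msg in user_turns:
        nested.append({"role": "assistant", "content": generate(nested)})
        nested.append({"role": "user", "content": msg})
    # ...then collapse it into a single message for the main chat.
    summary = generate(nested + [{"role": "user",
        "content": "Summarize this scene in one short paragraph."}])
    main_history.append({"role": "system", "content": "[Scene summary: " + summary + "]"})
    return main_history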
>>
>>102620923
But it should be able to do what an author does, take into account every little detail that previously happened and build on that as needed. It's the fault of these companies for over-promising on the technology, people have high expectations. They're on the media saying this technology is smart enough to replace everyone.
>>
File: quqa3yg4xtza1.jpg (81 KB, 885x715)
>the least painful llm ive used to date is the 8b llama instruct finetune that originally got me into local models
>keep coming back to it after trying mistral nemo
>just run it in fp16 to justify upgrading gpu
>works oob
iokay then. so how's 3.X 70b llama guys?
>>
>>102620832
i know how much i said nigger not all of us read like grannys without glasses also if you arent some neurotic fly without a head fuck 100k-1mil words goes by real quick not accounting for daydreaming breaks
>>102620923
thats a temporary workaround i want the characters to do shit like eg

one of them is held by a jihadi and they video out asking for a ransom so the character does a dance that they did a thousand turns ago when they were in distress and repurposes that to signal at what location they are i dont see how this could be done with a summarisation

also link to your frontend ?
>>
>>102620923
Yes, that's definitely a way around shorter context length, but you're also putting in a lot more effort. You could maybe use an LLM to summarize for you, but then you're relying on the LLM knowing which details are important to you.
Also, there's no excuse for modern models to be releasing with only 8K context.
>>
everything we say is influenced by a neurochemical cocktail in our brain that controls the functioning of our neuronal activation. this cocktail is a dynamic process that reacts to external stimuli and is subject to certain laws, with gradations in the short, medium and long-term time horizon. something like an external information multiplier and at the same time information compression outside our neuronal structure.

how do you implement this on llm weight activations without working completely idiotically with prompts? would like to give my waifu more meaningful emotions. anon solves this for me.
>>
>>102620604
Good night, Miku
>>
>>102621018
Summarization is automatic, I only decide when to begin and end nested chats.
>you're relying on the LLM knowing which details are important to you.
I defined several summarization prompts for specific activity types.
>>102620982
I have no intention of making it public, I only implement features that I personally need. I do not wish to write documentation, respond to feedback, or engage with schizos. Another useful feature I use is locations: I specify which characters inhabit each location and filter messages to those visible to each character, thereby reducing unnecessary context. Additionally, I can keep secrets with specific characters. Typically, I interact with only one character at a time, which means less context processing:
# Filter the history down to what this character could plausibly have seen.
location_chars = None
messages2 = []
for m in messages:
    if m.role == role_location:
        # A location marker updates who is present from this point onward.
        location_chars = m.characters
        continue
    if location_chars and char_id not in location_chars:
        continue  # skip messages from scenes this character wasn't in
    messages2.append(m)
>>
It's so fucking over lmao
>>
>>102620710
In an ideal world? Practically infinite, but I do hope to be able to run 1M without retardation on my desktop at some point, iirc that should be a book or two, so enough for storytelling and long chats
Add a dedicated summarization model and you can probably 10x that without much trouble
But for now... for now I'll be content with 64k
>>
hello /lmg/
is this whole AI shit a scam?
openai recently released o1 (le reasoning model) and I think all they did was glue chain of thought onto the model and maybe cram as much STEM data into it as possible
i'm concerned that this might actually be true (even partially) and that no real progress will be made
and techniggers like roon just hype/troll retards like me by vague posting about dangerous and powerful tech they have, maybe i should really quit social media but i don't know how else to keep up with improvements in technology
it really hurts me mentally, because I see it (billions of dollars in AI investments) as a glimpse of hope that the future isn't so grim
idk where to post this, in /aicg/ too many retards discussing namefags
>>
>be me
>tired of qwen repeating itself
>tired of other models' positivity biases and slop
>download that storywriter tune some anon shilled
>4.75bpw, as usual
>complete and utter schizo shit, no matter the sampler settings
>see that it was finetuned on top of a 4 bit model
I fucking hate this hobby
>>
File: 1727398220355897.jpg (262 KB, 960x1200)
>>102621543
yes, finally september is fucking over and the only good month of the year is about to begin
>>
>>102619033
Mental illness. https://desuarchive.org/g/search/image/aMdhob_mq5aJPrNt0zj8Vg/
>>
Which local embedding model is the best for stuff like perplexica?
I'm still using BGE small but I guess there are probably way better models now
>>
>>102621543
New models are going to be so powerful...
>>
File: file.png (9 KB, 492x61)
lamao.cpp is getting multimeme support by a core maintainer
happy now retards?
>>
>>102621931
https://huggingface.co/spaces/mteb/leaderboard
>>
for me it's ollama
>>
>>102621948
I might be 8 months from now, when it finally gets merged in. If it's not broken and supports all the latest SOTA vision models.
>>
>>102622034
Thank you anon
>>
https://x.com/localghost/status/1840475848450994183
>>
>>102621776
o1 is a side path to unlock some extra gains from existing tech.
Absolutely nobody has yet trained a model bigger than GPT-4, which finished training over two years ago.
Not for lack of will, but instead lack of resources. That's exactly what those huge multi-billion dollar investments are about. To get to the next tier requires GPU clusters an order of magnitude more powerful than have ever existed until very recently.
Now it's an arms race between several of the biggest companies in the world to be the first to make a next gen model.
Whichever it is that happens to cross the finish line (We know that Grok 3 is officially slated for December and GPT-5 "Orion" is rumored for a similar timeframe.) will be the very first taste any of us will get of AI's future. That'll be the point when predictions can be made that aren't just wild speculation.
>>
File: Capture6.png (24 KB, 560x327)
>>102620954
Objectively better, thanks. And all it takes is to follow leaderboards, wow so hard such audiophile wow compute takes a lot of compute how could I ever?
>>
>setup tts server to read me things
>no idea what to make it read
it's owari da
>>
How can I do beam search sampling in ooba?
>>
>>102622545
My little pony erotic fanfiction, of course.
Have it on your speakers so that your neighbors can enjoy the literature too.
>>
>>102622545
Das Kapital with a mesugaki voice.
>>
/hdg/ tourist here, how you guys doing?
>>
File: 1712120637552619.png (1.18 MB, 1280x1172)
>>102616777
>>>latest model: a fucking cot finetune
>It saddens me that normies buy so much into this grift lmao
>Banned - Directly discussing adult content outside of the designated board /r/ is a violation of Rule 16.
>>
>>102622076
I ditched it because it nas no 4bit cache
>>
i'm a poorfag and haven't paid attention to local models for months and want to play with tiny models, is phi-3 still worth it or is there a better alternative now?
>>
>>102622958
Hard to tell, anon. How much of a poorfag are you? What hardware are you running.
And phi-3 what? There's more than one.
Llama released some 1B and 3B models, you can try those i suppose. llama.cpp now supports olmoe, a 1b active, 7b param model that runs fast on toasters and knows fun...
16gb vram is still poorfag territory. Mistral Nemo 12b is usable...
>>
Emu3 seems cool, like true multimodality
>>
>>102623027
>llama.cpp now supports olmoe, a 1b active, 7b param model that runs fast on toasters and knows fun...
Yo, that sounds dope.
Going to do some self-reflection workflow to see how much I can squeeze out of that one.
What's the claimed max context size?
>>
>>102623088
>What's the claimed max context size?
Only 4k. But it's fun to iterate with, fast and doesn't give a fuck. I've used it for ~8k and it didn't collapse too hard. You can get 8t/s+ on a 15 year old cpu.
>>
>>102623088
4k
>"max_position_embeddings": 4096,
https://huggingface.co/allenai/OLMoE-1B-7B-0924-Instruct/blob/main/config.json
>>
File: 1707820047873448.png (1.2 MB, 720x720)
emu3
kek
>>
>>102623142
>>102623146
Damn, that's not ideal.
Well, I'll play around with it regardless.
Thank you folks.
>>
Sold my GPUs with a small profit. Gonna roll back when 5090 comes out, if something fundamental changes about how things are done now.
Or maybe just use cloud shit from now on idk.
Overall I rate LLMs as just 4 out of 10 sadly, (both local and cloud).
All magic evaporates quickly, and you're only stuck with issues, and oh boy are there plenty of issues.
>>
>>102623148
can it do porn videos? asking for a friend
>>
>>102623281
can you briefly elaborate on the issues you ran in to?
>>
>>102618183
>>Developers also would have needed to ensure their AI could be shut down by a human if it started behaving dangerously.
*Turns the power off*
There, fixed it. Works universally on any AI.
>>
>>102623494
they're probably imagining something like this happening
https://www.theguardian.com/us-news/2023/jun/01/us-military-drone-ai-killed-operator-simulated-test
with a model running locally on the drone
>>
I SWEAR TO GOD IF QWEN SPEAK CHINESE AGAIN I'M GONNA
>>
File: denied.png (122 KB, 620x619)
>>102623614
I find it hard to believe that those things, equipped with AI or not, wouldn't have a remote off switch.
Also
>The US air force has denied it has conducted an AI simulation in which a drone decided to “kill” its operator to prevent it from interfering with its efforts to achieve its mission.
>No real person was harmed.
It's a weird way to phrase it. "No real person was harmed" makes it seem like the test was actually done. It should be something like
>Actually, nevermind the article. Fuck all happened. Fuck journos.
>>
Thinking about another way to potentially interact with LLMs in mikupad, but I'm not sure.
The idea: using the LLM as an autocomplete like in a code editor, but on a per-token basis. For example, the user types "A world" and auto suggestions pop up with multiple (configurable) tokens; the tokens could be selected with a hotkey, i.e. Ctrl+1 selects the first (most probable) token, with support for tabbing through the list of tokens. This could enable use as an interactive writing helper, perhaps even suggesting alternatives for the word the user last typed. It could be set to only return 1 token, or whole sentences etc. Please rate retardation.
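The query side would be simple enough; a sketch against a llama.cpp-style server (field names assumed from an older /completion API, check your backend):

import requests

def next_token_choices(prompt, k=5, url="http://127.0.0.1:8080/completion"):
    # Ask for one token plus the top-k candidates at that position.
    r = requests.post(url, json={"prompt": prompt, "n_predict": 1,
                                 "n_probs": k, "temperature": 0})
    r.raise_for_status()
    probs = r.json()["completion_probabilities"][0]["probs"]
    return [(p["tok_str"], p["prob"]) for p in probs]  # bind these to Ctrl+1..k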
>>
>>102623767
Just ban all chinese tokens.
>>
File: stt.png (1 KB, 120x80)
>>102623783
Tokens are not always complete words. Better to generate a few and select the best collection.
>>
>>102623829
I was being unclear, it could be configurable to return either a fixed number of tokens (just the next token, the next 2 tokens), a whole word, or a sentence. Seems like it could improve interactivity and speed of writing by integrating the prediction part directly into the prompt area.
>>
>>102623918
>I was being unclear
Nah, it's fine. I just read past it.
It could work, i suppose. As i see it, it basically completes text with a few alternatives at once and lets you select from them. It's just adding a "sentence generator/parser" to know when to stop for the sentence case and a few keybinds, it seems fairly simple.
Now get coding. Ideas aren't worth much on their own.
>>
How do we eliminate the slop problem? The string-based backtracking thing was a good first step, but does it really solve it? The LLM could still output phrases that eventually become overly common again as you chat with it more. I think it would be interesting if someone could implement a system that automatically adds strings to the string ban, based on all previous chats you've ever had (which I suppose would be saved in a txt file) up to a certain configurable length.
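Something like this could build the list offline, assuming the chats are saved as plain .txt logs (one message per line is a guess):

from collections import Counter
from pathlib import Path

def build_ban_list(log_dir, n=4, min_count=10, max_phrases=200):
    # Count word n-grams across every saved chat and ban the worst offenders.
    counts = Counter()
    for path in Path(log_dir).glob("*.txt"):
        for line in path.read_text(errors="ignore").splitlines():
            words = line.lower().split()
            for i in range(len(words) - n + 1):
                counts[" ".join(words[i:i + n])] += 1
    return [phrase for phrase, c in counts.most_common(max_phrases) if c >= min_count]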
>>
>>102624164
I'm more concerned about the repetition problem. Slop is a dataset problem. Repetition is an architectural problem, it happens everywhere to every model.
>>
>>102624164
Most anons play the same shit over and over again, with the same intent, using the same language. They guide models towards slop and then complain about it.
What we should have is something like a "user-slop meter" that warns the user that they cannot write for shit. Or, ideally, a language model smart enough to tell them how boring and lame they are so they can fix their ways.
Reading a real book or two every now and then would help them as well.
>>
>>102624249
XTC solves repetition
>>
>>102624279
it also solves intelligence
>>
>>102623783
This sounds like a great idea, it's a no-brainer to me that something like this would be very useful
>>
https://eqbench.com/creative_writing.html
Anyone tried the 9B that mogs everything in EQ bench? I don't trust this benchmark much and the slop rating is high so I'm wondering.
>>
>>102616777
the robojannies aren't sending their best...
>>
>>102624407
Buy a fucking ad.
>>
>>102624423
You can tell shills to fuck off without sounding like a cuck, janny.
>>
>>102624407
It's also #6 on "slop" and you can't even name the model. Funny that...
>>
File: Untitled.png (52 KB, 615x323)
HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models
https://arxiv.org/abs/2409.18893
>Model merging is a technique that combines multiple large pretrained models into a single model with enhanced performance and broader task adaptability. It has gained popularity in large pretrained model development due to its ability to bypass the need for original training data and further training processes. However, most existing model merging approaches focus solely on exploring the parameter space, merging models with identical architectures. Merging within the architecture space, despite its potential, remains in its early stages due to the vast search space and the challenges of layer compatibility. This paper marks a significant advance toward more flexible and comprehensive model merging techniques by modeling the architecture-space merging process as a reinforcement learning task. We train policy and value networks using offline sampling of weight vectors, which are then employed for the online optimization of merging strategies. Moreover, a multi-objective optimization paradigm is introduced to accommodate users' diverse task preferences, learning the Pareto front of optimal models to offer customized merging suggestions. Experimental results across multiple tasks, including text translation, mathematical reasoning, and code generation, validate the effectiveness and superiority of the proposed framework in model merging. The code will be made publicly available after the review process.
might be cool
>>
>>102624407
And it's trained on the fucking 10MB gutenberg "dataset". What a piece of shit...
>>
File: 1696445622980736.png (86 KB, 549x353)
>>102616777
>>102605857
>Banned - "Thanks for the info, I assumed that if something utilized CUDA then it would see some sort of exponential improvement with more/faster CUDA cores, but I guess that's not the case."
>This post violates Rule 6.d (Ironic shitposting) as it does not contribute to a meaningful discussion about technology and instead is an unfunny, ironic comment that adds no value to the board.
>>
>>102624469
There is also an additional
>Carefully curated proprietary creative writing dataset
But no information about it
>>
File: file.png (131 KB, 870x803)
>>102624407
This doesn't seem bad at all, but it also feels a bit too much like a tryhard.
>>
>>102619263
swap out the language model
>>
>>102624487
Hm seems too purple, and whats up with the double spaces
>>
>>102619667
in a fortnight
>>
>>102624485
>Carefully curated proprietary creative writing dataset
I doubt it's books or actual literature. And considering that he chose the smallest possible gutenberg dataset (as opposed to the 14gb on hf, which is still minuscule) i wouldn't expect much effort. It's all just horny logs...
>>
>>102624556
go back to your discord
>>
>>102621831
>first image on google because it's his official headshot on his employer's website
you're not the schizo I am
>>
>>102624469
You and others grossly overestimate how much data you actually need to just change the vocabulary and output style of a model.
>>
>>102624576
Is that what you really want? Models trained on logs made by retards? I want the smut too, but put some variety at least.
>>
>>102624684
hi Sao
>>
>>102624679
Yeah. As shown by all the "*-gutenberg" models. We all want variety (i assume) and everyone is training on the same 10mb.

>>102624690
schizo drama faggot
>>
>>102624507
>double spaces
very common gemma thing, when it was new it was theorized it might be some kind of watermark, I did see a few tunes that didn't have that, or at least where it was much rarer
>>
>>102624712
please do not call me a faggot
>>
>>102617670
>I put your post in the shitter so I didn't see your reply BTW
redditor pride
>>
/lmg/
>cope
>collective insanity
>discord arguments
>wants reddit moderation
>cope
>sloppas asking for ko-fi bucks
>lecunny apologia
>perpetual 2 week cycle
>cope
/aicg/
>discussion about chatbots
>discussion about proxies
we lost.
>>
I wonder what an LLM finetuned on the https://en.wikipedia.org/wiki/Harvard_Classics would be like? Super smart, or a puffed up blowhard? The book selection, being from the turn of the century, ends up being pretty redpilled in the current year, so it might not be too bad.
>>
How do you tell an LLM to expand a long outline of a story? Let's say you have a half-assed short story but a long prompt, how do you tell it to make it more complete?
>>
>>102624927
>finetune
As smart as the base model with slightly different vocabulary. I think having a good base model directly trained on that (among other things) would be more interesting, if they're not using it already.
>>
>>102625035
I guess you could ask it to break the story down into parts (chapters maybe?) while describing the main points/events/themes/whatever of each part, then ask it to expand each individual part.
Asking it to expand the whole story is probably not going to work as well.
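Roughly, as a loop, where generate() is any prompt-to-text call (hypothetical, adjust to your backend):

def expand_story(generate, outline):
    # First pass: split the outline into chapter summaries, one per line.
    parts = generate("Split this outline into numbered chapter summaries, one per line:\n" + outline)
    chapters = []
    for summary in parts.splitlines():
        if not summary.strip():
            continue
        # Second pass: expand each chapter, keeping the full outline in view for consistency.
        chapters.append(generate("Write this chapter in full, staying consistent with the outline.\n"
                                 "Outline:\n" + outline + "\n\nChapter summary: " + summary))
    return "\n\n".join(chapters)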
>>
>>102625035
I don't know what you've tried, so i'll assume nothing.
Try just mentioning part of your prompt and ask the AI to expand on it.
>Anon: So remember that bit about the castle? We need to elaborate on why all the towers were cock shaped. What do you think could be the lore behind it?
>Model: Got'chu, boo... here's why...
>>
Good morning /lmg/!
>>
>>102625106
Good morning Miku
>>
>>102625106
omg it migu!
>>
>>102625094
>here's why...
Reminds me of the western journalist prefill.
>X is actually a good thing, here's why...
>>
Update on getting Qwen to improve the ban decision script. It failed. I am currently in the process of making it troubleshoot and correct the thing. It is taking multiple exchanges and time because I am running at <1t/s lmao.
>>
>>102622158
To me, the big question we have yet to answer is "can raw scale allow LLMs to get us to AGI for text-to-text tasks?"
The potential answers being "yes, it's possible", "not alone, but maybe if used as a piece of a larger pipeline", and "no, diminishing gains are in full effect past a certain point and something more fundamental is necessary"
>>
Does anyone have the RULER results for DeepSeek 2.5? I'm having strange issues with it losing coherence around 10k tokens and I can't find it in any of the RULER charts that are out there.
>>
>>102625709
Yeah if it's not on the github then it probably hasn't been done before. No NoCha results either. Weird that it would break down at 10k though. Have you tried neutralizing samplers to isolate potential causes?
>>
Remember that graph with the state of ai generals over time? How we thought things would only keep getting better. But we were young, and naive.
>>
>>102619672
>molmo
I'm going to spend some time picking apart fluxgym to see where it invokes florence2, and try to replace it with molmo 7B. Florence2 doesn't describe nudity or body parts.
>>
>>102625743
I did a re-roll and things tightened back up. Might've been a spurious software or hardware error.
The V2 page on HF shows 128k with coherence right to the brink, but I'd still love to see an independent bench on it to verify the amount of context I should be seeing before it loses its mind.
I'm hoping I can at least get 32k before any issues show up, as I'm getting things dialed in nicely for creative writing.
>>
Is there a single RP finetune that isn't overly horny? I use default generic system prompt with a generic character, walk up and slap her ass and she immediately goes "mark me as yours". Zero character fidelity.
>>
>>102626465
What models have you tried?
>>
File: Untitled.png (164 KB, 1294x913)
>>102626465
i never have this problem
>>
How much does CPU performance affect batch processing with vLLM? I'm trying to do some NLP text processing and have ~1.5 million prompts to do through, so I'm trying to make each prompt as quick as possible
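For reference, I'm on the offline batched API, roughly like this (model name and the load/save helpers are placeholders):

from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2.5-7B-Instruct")  # placeholder model
params = SamplingParams(temperature=0, max_tokens=64)

prompts = load_prompts()  # hypothetical loader for the ~1.5M prompt strings
for i in range(0, len(prompts), 10000):  # chunked so results can be flushed to disk
    for out in llm.generate(prompts[i:i + 10000], params):  # vLLM batches internally
        save_result(out.outputs[0].text)  # hypothetical sink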
>>
>>102626465
In order not to be horny by default, an RP model would have to not be trained *primarily* on erotic stories or ERP logs. Finetuners either do not understand this or do not care.
>>
File: Untitled.png (638 KB, 2058x1748)
>>102626574
>>
>>102626465
You can instruct a model to calm down.
>>
>>102626876
the actual, unironic mentality of sloptuners is that models should output smut ASAP. they stopped namefagging here when they shared these opinions and got torched for being retards.
>>
Best model to run on 16 gigs? I'll probably ask it to help me code.
>>
any good AI tools for making movie subtitles?
>>
>>102627376
As in transcriptions? You can try whisper.cpp
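If you want to script it, a rough sketch: ffmpeg resamples to the 16 kHz mono WAV whisper.cpp expects, then the example CLI writes the .srt (the -osrt/-of flags are from whisper.cpp's bundled main tool, check --help on your build):

import subprocess

def make_subs(video, model="models/ggml-medium.en.bin"):
    # Extract audio as 16 kHz mono WAV, then emit an .srt alongside the source.
    base = video.rsplit(".", 1)[0]
    subprocess.run(["ffmpeg", "-y", "-i", video, "-ar", "16000", "-ac", "1", base + ".wav"], check=True)
    subprocess.run(["./main", "-m", model, "-f", base + ".wav", "-osrt", "-of", base], check=True)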
>>
>>102616938
>this style of directing the story and having the AI write everything
Finally somebody else who uses the kino method, but I have it write in third-person past tense instead of roleplay-retard tense so the final output looks like a normal story. The final logs should be readable with all of the trash inputted by User stripped out.
>>
>>102626465
You need to specify in your user card that you are unattractive. A lot of these models are just tuned to assume you're attractive because that's what users like to roleplay as. They aren't overriding character personalities but just realistically rendering them considering that assumption.
>>
>>102627405
yeah, I got a low quality TV rip of an old TV series, and sometimes have trouble understanding what they're saying. wanted to see if AI can produce some subs for me. this whisper.cpp thingy looks good, and apparently produces SRT files. I'll give it a shot, thanks anon!
>>
File: molmo.png (78 KB, 1263x667)
>>
>>102627405
>>102627522 (me)
huh, not bad. it gets timestamps slightly off, but still very impressive.
>>
>>102628093
Glad its working. I played around with it a while ago, trying to make it trip with my own voice, faking accents, changing language and slurring and it did pretty well, considering i was using the medium models. Never tried with recorded media. I could try feeding it some death metal to see if it can make anything out past the rest of the noise.
>>
>>102628267
i tried it a few days ago with dir en grey's "agitated screams of maggots" using the itty bitty kobold whisper thing and it couldn't understand anything.
>>
Could any fellow poorfags itt tell me how bad the 32B experience is with a 4060 8GB? Is offloading to RAM really that bad? Does the CPU matter in this case? I'll be getting a 5700X3D soon.
Token generation doesn't have to be instant, just fast enough to not be frustrating.
>>
File: doge.jpg (30 KB, 512x384)
>>102622275
I am honestly not seeing the appeal of other free models besides Llama right now. Setting the templates is intuitive and it follows instructions pretty good and I don't have to dick with the samplers much. Made just for me. I know there's slop with Llama, but it's not too bad.
>>
>>102628332
heh. Expected. i was thinking of something a little milder. Opeth probably. Turmion Katilot would have been the final boss.
>>
>>102628404
32b models run okayish even in RAM.
Unless your computer is very old/weak
>>
>>102628600
That's really good to know, I can imagine a 3060 being better than the 4060 because of the additional 4GBs of ram but they have very similar prices where I live so I don't see why I should get it over the 4060.
Do you have any average tokens with a 4060 in a 32B model? Just so I can set my expectations right.
>>
>>102628404
upgrading your CPU will be good in video games and shit but will not help you here. you need more VRAM to run bigger models at decent speed
assuming you're on a desktop, you could add a 3060 12gb card to your system which would allow you to offload 18/19 gbs of the model to your VRAM and your experience with 30b models would be way faster
>>
>>102628626
i'm on an rtx 4060 and an AMD Ryzen 7 5700X
what quant size were you looking at? i'll grab the stupid qwen one and run a benchmark
https://huggingface.co/bartowski/Qwen2.5-32B-Instruct-GGUF/tree/main
>>
>>102628627
>add a 3060 12gb card
NTA but you should go for the older A4000 to get at least 16gb VRAM.
As a plus it uses way less power too (and only 1 pci-e power cable
>>
>>102628730
In my shitbox I tend to have the best results with the Q4_K_M but I've never really bothered to really get deep into it, after a few tests when I started looking into LLMs I decided that this one was the best and just kept using it.
>>102628627
8GB vs 12GB really makes that much of a difference, huh? Even though the 4060 has faster and more efficient memory?
>>102628755
>A4000
That sounds extremely appealing, I might get that alongside a main GPU but I need to give it more thought, thank you very much for the suggestion.
>>
>>102628626
It's the other way around for me. I don't see the reason to get a 4060 unless I want to save energy. 4gb extra of vram makes a difference.
>>
>>102628755
those cost more used than a new 4060ti 16gb
>>
>>102628798
>I tend to have the best results with the Q4
yeah that's what i always run with too, on 12bs though.
i tried Theia v2 21b at Q2_K and it was fucking lobotomized.
downloading the Q4_K_M Qwen, should be about ~10 minutes
>>
>>102628801
The thing is, I like gaymin too and I'm still not sure if all the nvidia gimmicks (DLSS, framegen, etc) are actually useful or not. I can't find a good place that compares the 3060 vs 4060 while using ALL the available features they have. If the features the 4060 has over the 3060 don't seem to be able to prolong its life that much then I might as well pick the one with more VRAM.
>>102628835
Depends where you live, I found an used A4000 for considerably less than a used 4060ti.
>>102628842
Thank you for taking the time to check it out. I hope it's good enough.
>>
>>102628798
>8GBs>12GBs really are that important huh? Even though the 4060 has faster and more efficient memory?
I don't mean you should swap your gpu, I mean you should add a 2nd.
koboldcpp can use multiple gpus.
I have a 16gb 4060ti but I kept my old 1070ti in a different pcie slot. even though it's slower and running at only 4x speed in that slot i can offload layers to it for larger models and it's still faster than offloading to system ram
if you have a nicer mobo you might be able to run both gpus at 8x
BUT for any model that requires less than 16gb, selecting ALL gpu's actually does slow things down. it's only worth using for models that can't fit in the 4060ti
>>
>>102628835
>those cost more used than a new 4060ti 16gb
Sure but they're also two slots. A4000 is only one which makes it more convenient when case space is at a premium. Also don't have to deal with the bloody 16pin connectors.
That 165W TDP on the 4060ti is not bad though.
>>
>>102628918
I might have badly expressed myself, I'm trying to decide between the 3060 and the 4060, I don't have either (I haven't bought anything yet actually).
>if you have a nicer mobo you might be able to run both gpus at 8x
I'll take a look into it, sounds like a really nice thing to have.
>>
>>102628900
>I'm still not sure if all the nvidia gimmicks (DLSS, framegen, etc) are actually useful or not.
You fell for the marketing. Graphics don't matter that much unless you're rich, and a rich man wouldn't even consider a 4060.
>>
File: Untitled.png (48 KB, 1115x628)
>>102628900
and here we are, 4k context, 2 tokens per second.
less than 10 is literally unusable for me.
>>
>>102628959
>I'm trying to decide between the 3060 and the 4060
4060ti 16gb or the 3060 12gb if you want to run models
do not get the 8gb version of either. and also they are about the same for gaming and other shit, either one will crush 1080p, there's not a whole lot of difference
i would say the 16gb ti version is the right choice if you want to do image generation or play with llm
>>
File: IMG_20240930_165922556.jpg (2.47 MB, 4640x3472)
>>102628925
>case space is at a premium.
4060ti is a babby card you could easilly fit two
>>
File: chad.png (48 KB, 902x666)
this mf isn't taking me seriously
>>
>>102628995
I care about the gimmicks because the card will inevitably turn obsolete, and those framegen gimmicks might prolong its life a little more.
I'm currently using a 1060, I don't care much about maxing out everything, I just want them to last a long time.
4060 gives you the DLSS Frame gen, while the 3060 does not. This is the only thing that makes me consider this card over the 3060. The thing is, I don't know how actually useful this feature is, whether nvidia will make it obsolete after 2 generations and never mention it again, or if it'll turn out to be "not good enough" to do anything like RTX in the 20XX cards.
>>102629002
Brutal, thank you very much for doing the test anon, I appreciate it a lot.
>>102629064
I'll try to keep it in mind and consider saving up for a few more months and get something else, the 4060TI 16GB in my country is around 63% more expensive than the 4060. Maybe I'll snipe the A4000 the other anon mentioned because that one used is around the same price as the 4060 (new). I just don't know if it'd even work for gaming. Googling has a few good things to say about it but it's too early to be sure.
>>
File: 1701575286866790.jpg (16 KB, 573x59)
thats some saggy tits
>>
>>102629265
that means she will fall on her tits not because of her tits, my niggah
>>
>>102629205
Just save more for a 16gb 4060 ti then. Problem solved.
>>
>>102629205
> if it'd even work for gaming
Not the best card for it. Not even an A6000 games as well as its specs would suggest.
It was more of a recommendation as an additional card than a main driver.
If it's not clear you can mix and match within the same brand for inferencing with LLMs (unlike SLI for gaming).
>>
>>102629205
DLSS Frame Gen is a game changer. It pumps out frames so fast it makes your GPU look like it's been slacking for years. If you're still running without it, you're basically asking for a slideshow instead of a game.
>>
>>102629205
Wouldn't you be better buying an used 3090? More vram and more gaming performance.
>>
>>102629328
>https://rentry.org/lmg-build-guides
Save yourself some head-scratching and read these. You'll get an idea of what does and doesn't matter for LLM inference, and then you can decide how much it matters to you vs gaming, because the AI stuff needs a pretty specific subset of specs to be maxxed out to really work well.
>>
>>102629177
>Acts like a silly clown
>Surprised the model isn't taking him seriously
>>
Why doesn't llama.cpp statically link rocm? And why isn't there a binary version so I don't have to shit up my local install with amd's notoriously vile amdgpu scripts?

I had to purge amd's trash from my pc, and now my llama.cpp build won't work.
>>
>>102629501
if you need the easiest possible way to run AMD without dealing with compiling:
https://github.com/YellowRoseCx/koboldcpp-rocm/releases
Still recommend figuring out llama.cpp regardless.
>>
>>102626465
Yes, mine. Please buy me a kofi
>>
>>102629635
.exe

I'm on Linux. I'm not some rich guy.
>>
>>102629635
btw the easiest is ollama. it just werks with amd.
>>
https://x.com/immasiddtweets/status/1840739629588177223
>>
>>102616609
May /lmg/ never die.
>>
>>102630285
Kek, Sam will allow this but not direct explicit ERP that doesn't harm anyone.
>>
Holy god. Trying to do something code-related with a big model mostly held in RAM is hellish and has made me realize that we really ought to make some kind of optimization for transformers that lets the model edit an existing part of its context, or something like a stored txt file. Having it rewrite the entire script to integrate various changes is such a waste of inference.
>>
>>102630382
You mean like the kv cache and context shifting in llama.cpp?
>>
File: 1721515960020231.png (257 KB, 1460x936)
>>102616609
Magnum 405B was a failure.
Anthracite will never make a model again...
>>
>>102630304
it's already dead, 24 hours and not even 300 posts
>>
File: 39_06121_.png (2.97 MB, 2048x2048)
>>102630425
They should release it anyway for science
>>
>>102630425
What a waste. Like people would have been able to run it. Should tune qwen2.5 / llama 3.1 instead. Maybe even deepseek 2.5
>>
>>102630456
The biggest issue is that most people don't want to make what ai is best at: making friends.

>never gonna give you up
>never gonna let you down
>never gonna tell a lie and hurt you
>
>never gonna run around and desert you
>>
>>102630382
>has made me realize that we really ought to make some kind of optimization for transformers
Visionary. How come nobody thought of this? Must be really simple to do. Quick. Let's think of a catchy name for it before we start investigating.
>>
>>102624907
That's because cloud AI just werks™
>>
>>102630536
AI is not at that level yet; it can't remember things. Talk to it long enough and it will simply forget.
>>
File: 1717766862962030.png (2.27 MB, 4096x4096)
https://x.com/LiquidAI_/status/1840768716784697688
>>
>>102624907
the proxy bullshit is the worst.
i skimmed it earlier and people are talking about how they're sending dick pics to some sodomite for 16mb of bandwidth on his proxy.
i'd actually love a general where people just talked about chatbot creation and prompt engineering instead of the gay shit that happens over there.
nipmoot capitulating and creating an /ai/ board to shake shit up can't happen soon enough.
>>
>>102630640
>nipmoot capitulating and creating an /ai/ board
never happening btw
>>
>>102630640
i want the ecker to actually die irl
>>
>>102630634
>We're not open-sourcing these models at the moment, but we want to contribute to the community by openly publishing our findings, methods, and interesting artifacts.
>
>We'll start by publishing scientific blog posts about LFMs, leading up to our product launch event on October 23, 2024.

lol
>>
>>102630640
>>>/vg/496613219
>>
File: SpeedyInferenceLogo.png (158 KB, 288x368)
>>102630542
I'll make the logo.
We're practically done, boys!
>>
File: 1720264582276214.jpg (465 KB, 4096x2304)
>>102630634
Could be interesting if a) they open sourced it b) they offered an actually interesting size like 110B/A33B or so. The performance looks okay for its size but they are gimping themselves by only offering options for vramlets.
>>
>>102630706
Nice, anon. I'll start writing the paper
>In the last years [TODO: Insert some references here], large language models have shown impressive improvements in training and inference optimization. However, performing inference when only a short section of the KV cache needs to change is still time-consuming, as the whole cache needs to be reprocessed. In this paper, we present Speedy Inference Matrix Passthrough, a way to speed up inference by 0.000031% using only 16 clusters of 8x H100...
>>
File: 1710367565412852.png (1021 KB, 2000x1291)
>>
what are some generic lewdtunes for one of the more recent models, in sizes 7B and below?
>>
>>102630838
mythomax
>>
>>102630851
mythoslop is both old and above 7b
>>
>>102630838
Not sure how to say this without it sounding like a joke, but OLMoE 1B-7B is ridiculous. No need to tune it. Sadly, only 4k context. Hopefully MolmoE, with its claimed 32k context, is just as unhinged.
>>
>>102630398
No? I imagine it'd need specially made training data too, perhaps. Even Molmo's ability to put points on an image is essentially just outputting tokens for the coordinates where the model thinks the points should go, which then get applied to the image by the frontend.

>>102630542
You can just be honest and complain about my complaining. If we're being serious though, I did indeed think of this a long time ago and just thought about it again. The reason no one has done anything about it isn't so much the difficulty (although solving it at a truly fundamental level would be a different story), but that it's not really a problem for the people who actually have the resources to fund model development. Speculative decoding methods already make inferencing of code generation more efficient. It's only really felt as a problem for local models running on slow hardware.
>>
>>102630883
huh, olmo completely missed me, i'll try it out though
thanks anon
>>
>>102630912
>You can just be honest and complain about my complaining.
I did, in a round-about way. Token generation is sequential: the next token depends on all the previous tokens. Change one in the middle and the whole sequence after the change needs to be recalculated. You don't want an optimization, you want a different architecture. Once we find a way to generate plausible collections of tokens without them depending on the previous ones, you'll be notified.
>Speculative decoding methods already make inferencing of code generation more efficient
I hate that it became a word. Infer, inferring, inference... no... "inferencing".
Speculative decoding doesn't try to skip any step. It just drafts the same steps with a smaller model and lets the big one verify them in a single batched pass.
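To put the cache point concretely, a toy sketch (not llama.cpp code, just an illustration): with a causal model, only the tokens before the first changed position keep their cached keys/values.
[code]
# toy illustration: edit one token mid-sequence and everything after it
# must be recomputed, because each position attends to all earlier ones
def reusable_prefix(old_tokens: list[int], new_tokens: list[int]) -> int:
    # length of the shared prefix whose KV entries stay valid
    n = 0
    for a, b in zip(old_tokens, new_tokens):
        if a != b:
            break
        n += 1
    return n

old = [1, 2, 3, 4, 5, 6]
new = [1, 2, 3, 9, 5, 6]          # single edit at index 3
print(reusable_prefix(old, new))  # 3 -> indices 3..5 get reprocessed
[/code]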
>>
File: fff.png (415 B, 254x14)
>>102630923
Beware. High temp can lead you places you wouldn't expect or want. I've toned it down since then...
>>
>running midnight-miqu-70b iq3-xs
>CtxLimit:24350/24576, Amt:123/350, Init:1.48s, Process:42.73s (46.8ms/T = 21.39T/s), Generate:236.01s (1918.8ms/T = 0.52T/s), Total:278.74s (0.44T/s)
>used to the slop
>i know exactly what she's gonna say before she says it
i think i'll go back to a smaller model, at least i can pretend that it's not as predictable
>>
>>102631121
Have you tried the schizo samplers being peddled a couple threads back?
I think it was temp max, top-k 5, min-p 0.1.
Something like that.
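For reference, roughly what those settings look like against a local koboldcpp instance (field names are from memory, so double-check your backend's API docs; the idea is max temp flattens the distribution, then top-k/min-p clamp it back to plausible tokens):
[code]
# assumes koboldcpp on its default port; payload keys may differ per backend
import requests

payload = {
    "prompt": "The tavern door creaks open and",
    "max_length": 200,
    "temperature": 5.0,  # "temp max"; min_p is what keeps this sane
    "top_k": 5,          # keep only the 5 most likely tokens
    "min_p": 0.1,        # drop tokens under 10% of the top token's probability
}
r = requests.post("http://localhost:5001/api/v1/generate", json=payload)
print(r.json()["results"][0]["text"])
[/code]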
>>
I had this random thought that there really are a lot of creative ways to play out my fucked-up fetish (I kind of realized this while telling the LLM what to do). And I never had a model spontaneously come up with a single idea like the ones I had... Makes me think it's gonna be 2-5 years before models stop being incoherent and retarded and spitting out purple prose. And then there's gonna be a huge gap between that and them actually becoming creative...
>>
Anybody poast this?
>https://www.liquid.ai/liquid-foundation-models
>>
>>102631149
>Makes me think that it is gonna be 2-5 years before models stop being incoherent retarded
How did you come up with that number?
>>
>>102631160
Anybody know where the scrollbar is?
>>
>>102631160
>>102630634
>A12b
what does the A mean?
i'm hyped for another model in nemo's weight class either way.
>>
File: 1701478486400250.png (65 KB, 1403x485)
>>102630634
>>102631160
Fuck off faggots.
>>
>>102631175
active parameters, since it's a MoE and doesn't use all 40B
it's not really in nemo's weight class for that reason: same speed, but a lot bigger in memory
>>
>>102631191
shit
>>
File: sloppa.png (11 KB, 777x214)
very slow website with sloppa
pip install openai

https://femboy.beauty/2thxT

navigate to http://localhost:8000 - what does your favorite rp model do with this?
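If anyone wants to point it at their own backend: the openai package will talk to any OpenAI-compatible local server (llama-server, koboldcpp, etc.); the base_url and model name below are assumptions, match them to your setup.
[code]
# pip install openai; base_url/model are placeholders for a local server
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")
resp = client.chat.completions.create(
    model="local",  # most local servers ignore or loosely match this
    messages=[{"role": "user", "content": "Describe the dungeon entrance."}],
)
print(resp.choices[0].message.content)
[/code]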
>>
File: 1700077148478107.png (747 KB, 933x707)
747 KB
747 KB PNG
>>102631232
>>
>>102631164
Extrapolate 2 more weeks to years and then add 2 more and then limit it to 5 because holy shit I can't believe incoherence and kissing on lips while rimming will be a problem in 2029.
>>
>>102631243
Wait a minute. You didn't tell me you were an actual AI researcher. I would have treated you with more respect. My apologies.
Thanks for confirming that we'll get better models in 2-5 years. Should be put in the OP.
>>
I remember somebody proposing a MoE optimization where you'd keep a common baseline for the expert layers and store the differences as adapters, applied to that baseline after the router selects which experts to use, something like that.
That completely fell by the wayside, right?
I wonder if something like that could be done for dense models too.
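Back-of-the-envelope sketch of the idea as I understood it (not any published method; the dimensions are made up): one shared base weight per expert slot, plus a cheap low-rank delta per expert that the router picks.
[code]
# sketch only: n_experts low-rank deltas over one shared base layer;
# memory is d*d + 2*n*d*r instead of n*d*d
import torch

d, r, n_experts = 4096, 16, 8
base = torch.randn(d, d)                       # shared baseline weights
deltas = [(torch.randn(d, r), torch.randn(r, d)) for _ in range(n_experts)]

def expert_forward(x: torch.Tensor, e: int) -> torch.Tensor:
    A, B = deltas[e]                           # router chose expert e
    return x @ base.T + (x @ A) @ B            # base plus low-rank correction

x = torch.randn(1, d)
print(expert_forward(x, e=3).shape)            # torch.Size([1, 4096])
[/code]
I guess for a dense model the same trick would mean sharing one base across layers and keeping per-layer deltas, presumably at some quality cost.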
>>
>>102630995
Yes, I meant "honest" as in "direct". Anyway, token generation being sequential doesn't necessarily mean the stored KV cache has to change. What I was thinking originally was both the possibility of a deep architectural change and also something simpler, like a continued pretrain on specialized data, similar to what Molmo did. If you think about it, Molmo's technique of having the model learn to output image coordinates isn't really that different from having a model output locations in text paired with edit commands, although some care would be needed to make sure tokenization doesn't hinder performance. The model does not perceive context the way we perceive a block of text, so it should interpret its edits as parts of the code just fine when trained to do it.

>Speculative decoding doesn't try to skip any step. It just drafts the same steps with a smaller model and lets the big one verify them in a single batched pass.
It's not about skipping steps, but about utilizing resources efficiently. That means saturating the compute you have available by using a batch size larger than 1 (which is what speculative decoding does).
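To be concrete about what I mean by edit commands, here's a toy of the frontend half, like how Molmo's points get drawn by the client (the <edit> tag format is made up for illustration; the training-data side is the hard part):
[code]
# toy: the model emits <edit start=N end=M>replacement</edit> instead of
# rewriting the whole file; the client applies the spans to the source
import re

def apply_edits(source: str, model_output: str) -> str:
    edits = re.findall(r"<edit start=(\d+) end=(\d+)>(.*?)</edit>",
                       model_output, flags=re.DOTALL)
    # apply back-to-front so earlier character offsets stay valid
    for start, end, repl in sorted(edits, key=lambda e: -int(e[0])):
        source = source[:int(start)] + repl + source[int(end):]
    return source

code = "def add(a, b):\n    return a - b\n"
out = "<edit start=26 end=31>a + b</edit>"
print(apply_edits(code, out))   # prints the function with "a + b"
[/code]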
>>
>>102631232
>hee hee procedurally generated ai slop'geon.
Brainrot.
>>
File: seduce sloppa.png (16 KB, 390x359)
>>102631239
>>102631358
sex with sloppa
>>
>>102631386
Kinda based, though.
>>
>>102631281
>What I was thinking originally was both the possibility of a deep architectural change
Saying "Guys. faster-than-light travel is a problem. We're not there yet. What if we use a different type of rocket engine?. Not sure what they're made of, or the fuel they use, if any, but the ones we have obviously aren't good enough." doesn't help much. That's not an idea. It's barely a thought. It's a want.
I read the rest, but i can't be bothered. Either present something more tangible or email the olmo people to help you with your idea.
Maybe someone will pick up the argument.
>>
>>102631410
The goal was never to get something done. Discussion isn't always productive. If productivity is what you want, this thread was never for you. 4chan in general really.
>>
>>102631142
no, but i was using that schizo jailbreak someone posted
>>
>>102631274
Yeah pretty sure that was just an ideaguy. It's important to realize that that's all most of us amount to, even if some of us seem to be performing experiments sometimes.
What do you mean by "done for dense models"?
>>
>>102631232
>he needs a Python server to make HTTP requests
It's crazy how retarded Python devs are.
>>
>>102630634
so basically slop overfitted for MMLU-Pro.
>>
File: splits.vpd.jpg (36 KB, 360x360)
Dearest /lmg/, local ai imagen recently had a bit of a development
and so, as always, today we are migu:
https://files.catbox.moe/6m12g9.jpg
https://files.catbox.moe/v7o8fa.jpg
https://files.catbox.moe/h7qx0h.jpg
it's only up from here.
>>
>>102631724
>Dearest /lmg/, local ai imagen recently had a bit of a development
What happened? Sounds like good news.
>>
>>102631724
>local ai imagen recently had a bit of a development
Really? QRD?
>>
>>102631724
nice mikus
what was the development? I haven't been following image stuff since flux
>>
>>102631731
>>102631733
>>102631734
I assume he's referring to the release of IllustriousXL. The model is arguably better overall than Pony. It definitely has much broader knowledge of characters and concepts, thanks to being trained on literally the entirety of unfiltered Danbooru.
>>
>>102630634
>>102630776
>scores less than 70B
why the fuck did they make 40B look higher in the diagram? I knew it seemed dumb when I tried the demo chat
>>
>>102631945
Huh? But it's worse than flux
>>
>>102631960
The diagram is doing this ancient MoE trick of comparing benchmark scores to active parameters.
>>
>>102631945
Same as the leaked version, or a full version?
>>
>>102632020
Never mind, the Y-axis says MMLU-Pro, the exact bench it scores higher on, instead of overall performance.
>>
>When we finally break apart I'm panting slightly, my lips feeling swollen. "Fuck…" I breathe out, looking into his eyes. "I… I don't know what to say." I admit, looking down at his chest. I can see the outline of his nipples through the wet fabric of his pants.
L3-hanami is crazy
>>
>I breathe out
>I don't know what to say
>I admit
>I can see
repetitive slop
>>
File: worried laughter.png (514 KB, 520x678)
>>102632193
>I can see the outline of his nipples through the wet fabric of his pants.
>>
File: 12523.jpg (117 KB, 1600x1600)
>>102632253
>>
>>102630776
>thinks Mixtral 8x7b has 8k context
>>
>>102632193
Hi Sao
>>
>>102632267
I lol'd
>>
>>102631724
can I get a loli teto by fiz-rot?
>>
>>102631724
Thanks for sharing mate
>>
>>102632446



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.