llama.cpp CUDA dev !!OM2Fp6Fn93S
08/13/24(Tue)08:26:50 No.101867659 >>101867565
>I can learn what these tokens are for the model from its model card?
Assuming you have an instruct model, it has some instruct format like
USER: how 2 download car
ASSISTANT: You can't download cars.
You would set "USER:" as the stop string so that, when the model decides the text should be continued by more user input, the program returns control to you.
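A minimal sketch of how a frontend can do that check (the function and names here are just illustration, not the llama.cpp API):

#include <string>
#include <vector>

// Returns true if the generated text ends with any of the stop strings.
// A real frontend would also hold back tokens that could be the start
// of a stop string (partial suffix matches) before showing them.
static bool hit_stop_string(const std::string & output,
                            const std::vector<std::string> & stop_strings) {
    for (const std::string & s : stop_strings) {
        if (output.size() >= s.size() &&
            output.compare(output.size() - s.size(), s.size(), s) == 0) {
            return true;
        }
    }
    return false;
}

// In the generation loop, roughly:
//   while (!hit_stop_string(output, {"USER:"})) { output += next_token(); }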
>Kobold has something about instruct tag preset, which I think handles automatically for some models?
I don't know how koboldcpp specifically handles this, but the instruct format (if there is one) is saved in the model files.
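Concretely, GGUF models store it as a Jinja template under the metadata key tokenizer.chat_template. Roughly how you could read it with the llama.cpp C API (names as of recent versions, check the current llama.h):

#include <cstdio>
#include "llama.h"

int main(int argc, char ** argv) {
    if (argc < 2) { return 1; }
    llama_backend_init();

    llama_model_params mparams = llama_model_default_params();
    mparams.vocab_only = true; // no need to load the weights just for metadata
    llama_model * model = llama_load_model_from_file(argv[1], mparams);
    if (!model) { return 1; }

    char buf[8192];
    const int32_t n = llama_model_meta_val_str(model, "tokenizer.chat_template", buf, sizeof(buf));
    if (n >= 0) {
        printf("chat template:\n%s\n", buf);
    } else {
        printf("no chat template in the model file\n");
    }

    llama_free_model(model);
    llama_backend_free();
    return 0;
}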
>So I can fix that by increasing repetition penalty, correct?
"Fix" is, I would say, too strong a word.
As I said, it's an unsolved problem.
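For reference, what the knob actually does is rescale the logits of tokens that already appeared in the context. A sketch of the common CTRL-style variant (not llama.cpp's exact code):

#include <cstdint>
#include <unordered_set>
#include <vector>

// Penalize tokens that already occurred in the recent context.
// penalty > 1.0 makes repeats less likely; 1.0 disables the penalty.
static void apply_repetition_penalty(std::vector<float> & logits,
                                     const std::vector<int32_t> & last_tokens,
                                     float penalty) {
    const std::unordered_set<int32_t> seen(last_tokens.begin(), last_tokens.end());
    for (const int32_t tok : seen) {
        float & l = logits[tok];
        // Dividing a positive logit and multiplying a negative one
        // both push that token's probability down.
        l = l > 0.0f ? l / penalty : l * penalty;
    }
}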
>But wouldn't that also affect how the model responds in general, like increased propensity towards changing topics?
Yes, you basically have a tradeoff between precise and diverse model responses.
One major parameter to tune here is the so-called temperature: higher values mean more randomness in how the model picks the next token.
With temperature 0 the model always picks the most likely next token.
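Mechanically, temperature is just a division of the logits before the softmax. A sketch (not llama.cpp's actual sampler):

#include <algorithm>
#include <cmath>
#include <random>
#include <vector>

// Sample a token index from logits at the given temperature.
// temp -> 0 degenerates to argmax (always the most likely token);
// higher temp flattens the distribution, i.e. more randomness.
static int sample_with_temperature(const std::vector<float> & logits,
                                   float temp, std::mt19937 & rng) {
    if (temp <= 0.0f) {
        return (int) (std::max_element(logits.begin(), logits.end()) - logits.begin());
    }
    const float max_l = *std::max_element(logits.begin(), logits.end());
    std::vector<float> weights(logits.size());
    for (size_t i = 0; i < logits.size(); ++i) {
        // subtract the max for numerical stability; divide by temperature
        weights[i] = std::exp((logits[i] - max_l) / temp);
    }
    std::discrete_distribution<int> dist(weights.begin(), weights.end());
    return dist(rng);
}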