/g/ - Technology

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102036232 & >>102025568

►News
>(08/22) Jamba 1.5: 52B & 398B MoE: https://hf.co/collections/ai21labs/jamba-15-66c44befa474a917fcf55251
>(08/20) Microsoft's Phi-3.5 released: mini+MoE+vision: https://hf.co/microsoft/Phi-3.5-MoE-instruct
>(08/16) MiniCPM-V-2.6 support merged: https://github.com/ggerganov/llama.cpp/pull/8967
>(08/15) Hermes 3 released, full finetunes of Llama 3.1 base models: https://hf.co/collections/NousResearch/hermes-3-66bd6c01399b14b08fe335ea
>(08/12) Falcon Mamba 7B model from TII UAE: https://hf.co/tiiuae/falcon-mamba-7b

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling
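If the calculator is down, a back-of-the-envelope estimate gets you close; this sketch ignores GQA (which shrinks the KV cache considerably) and runtime buffers, and the 12B dimensions used below are illustrative, not any specific model's:

```python
# Rough VRAM estimate: weights at the quant's bits-per-weight, plus an
# fp16 KV cache. Real calculators also account for GQA and buffers.
def estimate_vram_gb(n_params_b, bits_per_weight, n_layers, hidden_dim, ctx_len):
    weights = n_params_b * 1e9 * bits_per_weight / 8    # weight bytes
    kv_cache = 2 * n_layers * hidden_dim * ctx_len * 2  # K and V, 2 bytes each
    return (weights + kv_cache) / 1024**3

# e.g. a hypothetical 12B at ~4.5 bpw (Q4_K_M-ish) with 8k context
est = estimate_vram_gb(12, 4.5, 40, 5120, 8192)
```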

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: distilled miku.png (522 KB, 1024x1024)
►Recent Highlights from the Previous Thread: >>102036232

--Papers: >>102042721
--No pre-made lewd loras available, creating them is challenging and model-specific: >>102036249 >>102036336 >>102036396 >>102036421 >>102036485 >>102042681 >>102036396 >>102036662 >>102036681
--Llama.cpp developer outlines roadmap, including Jamba and quantized model support: >>102037185 >>102038267 >>102040923 >>102040964 >>102041086 >>102041242 >>102041494 >>102041285 >>102041303 >>102045497 >>102041328 >>102041439 >>102042547
--Efficient language models and hardware acceleration: >>102042844 >>102043061
--Anon tries to free up VRAM for 1024x1024 image generations: >>102045653 >>102045713 >>102045741 >>102045933 >>102045987 >>102046244 >>102046355 >>102046411 >>102046870 >>102047138
--Anon shares prompt tweaks to reduce sloppiness in AI output: >>102039533 >>102039928 >>102040042 >>102040136 >>102040148
--Anon gets feedback on fluctuating loss during model training: >>102039160 >>102039388 >>102039436 >>102039583 >>102039504 >>102039649 >>102041176
--Anon discusses model output variety and repetition issues: >>102040776 >>102040934 >>102041608
--Phi 3.5 performance on LiveBench and comparison to other models: >>102036833 >>102037513 >>102037663 >>102042528
--TP has substantial overhead, exl2 has issues with overfitting and calibration: >>102042018 >>102042636
--Exllama2 0.1.9 update adds tensor parallel mode, but has issues: >>102040631 >>102040676 >>102040665
--Debian kernel 6.10.4 has CPU inference speed regression: >>102044211
--Anon shares a bot's response to a GPU usage issue: >>102036815
--Anon requests full PDF of claude-opus microservice architecture model: >>102037381
--Miku (free space): >>102039511 >>102041344 >>102041714 >>102041735 >>102042395 >>102042671 >>102042696 >>102042811 >>102042813 >>102044045 >>102044092 >>102044166 >>102044267 >>102045076 >>102045570 >>102046954 >>102047485 >>102048294 >>102048701

►Recent Highlight Posts from the Previous Thread: >>102036996
>>
I want to be roleplaying, show the model an image with an outfit, and tell the character to wear that. Is that doable?
>>
Anthrafags, if you are here, I'd suggest you try making your model multilingual. Just translate the dataset to Spanish, French, Italian or some other language that's easy to translate from English using some LLM, and train with that added data.
It's a well-known fact that multilingual data makes models better; you guys are missing huge gains by training only in English.
>>
How come function calling has never taken off with open source models?
>>
File: 1724445217526807.png (2.34 MB, 2400x2022)
https://huggingface.co/anthracite-org/magnum-v2-4b
Pruning Is Magic
>>
>>102049086
Should be easy for them to do as non-native English speakers.
>>
>>102049113
Did it take off with cloud models?
>>
(Repost)
In the last 74 messages (~8kt) between me and {{char}} (Mistral Large), "eye" can be found 14 times, all in {{char}}'s messages. That's roughly 38% of {{char}}'s messages! Almost 2 in 5 messages mentioned eyes! What the hell? The conversation was SFW. Where does this strong eye bias come from? Makes me want to go RP with 2B because she has a blindfold.
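A tally like this is easy to reproduce against any exported log; a throwaway sketch with made-up messages (substring match, so "eyes" counts as a hit for "eye", like a plain text search would):

```python
# Count how many of the character's messages contain a given word.
# The messages here are illustrative, not the anon's actual log.
def messages_containing(messages, word):
    word = word.lower()
    return sum(1 for m in messages if word in m.lower())

char_messages = [
    "Her eye twitched as she spoke.",
    "They walked to the market together.",
    "His eyes followed her across the room.",
]
hits = messages_containing(char_messages, "eye")
rate = hits / len(char_messages)
```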
>>
>>102049116
looks the girl in the image is rae taylor, who is a faggot so anthratroons are a bunch of gay niggers, dont use their models if you dont wanna support the faggot community.
>>
>>102049145
anon, how do you know a gay furry mascot?
>>
>>102049145
Ew what the fuck, yeah i'm not gonna touch their shit with a 10-foot pole if they're putting fags in their card image.
>>
>>102049145
It's okay for girls to be faggots though.
>>
>>102049194
its unnatural, dont support the fags. its men and women only. anything else is going against nature.
>>
>>102049129
Yes, cloud models do all sorts of things on the fly like automatically looking shit up or running code.
>>
>>102049116
Realistically what could one even do with this little turdlet? Bump context to like 32k except it doesn't even work on llama so ??????????????
>>
>>102049086
>spoonfeeding retards
>>
>and there will be a reckoning
They're actually threatening people with physical violence now.
https://huggingface.co/NewEden
>>
File: ihavelehardware.png (101 KB, 756x838)
>>102049116
If you're reposting that, this is worth reposting too:

>>102048697
>To me they look like they're gearing up to eventually go commercial in some capacity; maybe they'll start a business within a few months if they haven't already. I think this is the main reason why they're so hated, desu. Their key members took advantage of the good will of the community many times over the past year or so, lied, then congregated together, pulled the ladder away and closed off into their little private discord.
>
>Only those who still aren't disgusted by their behavior or don't know anything about their members would use their models without puking, no matter how good they are (spoiler: they aren't).
>
>I hope you're feelin' good climbing the social ladder, Anthrashites.

>>102048977
>What business? No one will pay to use their shit models.
Consulting, datasets, finetuning services, maybe model licensing, or even networking with people "in the know". That's how things would likely work out at this level. Even simply knowing how to "push buttons" can sometimes be valuable.
>>
>>102049311
>discord screenshot
Go back.
>>
>>102049295
hey dumbass, have you never played far cry 5?
>>
Has anyone tested which character name for yourself gives the best results?
>>
>>102049332
I don't play shit games.
>>
>>102049311
what does that discord ss prove?
>>
>>102049145
>I'm so offended right now
I hope for your sake that's supposed to be bait
>>
>>102049311
Thanks for reposting the truth.
>>
>>102049383
im not offended, im saying that anthracite are gay people who are supporting Y*ri
>>
File: 1589617068855.jpg (54 KB, 1002x857)
Let's play a game! This Saturday at 1 PM PT, I'll do a collaborative storytelling/RP session, where I post a scenario and responses from the model in the thread, and people discuss what to do in the user chat turns, or edit previous user turns or the system prompt and start over. This is going to be both for fun and to get us (mostly) reproducible reference logs, as I'll be using greedy sampling in Mikupad and have the full log in a pastebin at the end. No editing the model's responses, we're going to use pure prompting to try and get the thing to do what we want!

The scenario is now mostly set. We're going to go for as long a context as possible until the model breaks down uncontrollably, so it should be a complex enough scenario for that. But I'm always taking suggestions. Also, I'm planning on starting these games with Mistral Nemo at Q8 for the first session, and other models in the future, so we have reference logs available for a whole range. But I'll take suggestions for models people want. I'm only a 36 GB VRAMlet though, so I'm a bit limited. I can run larger models up to ~88 GB but it'd be slower. If anyone with more VRAM, who can run such larger models at a good speed, would like to host any of these games themselves, please do, and I will step down.

>current suggestions
1. >>102002238 >>102031804 >>102031852
(compiled together) The assistant is a narrator and we guide the narration. The scenario will begin with a meeting between 3 Illuminati members in a bunker. One will be a doppelganger with their own agenda that's even more evil than theirs. We'll ask the model to write about who these characters are first and flesh them out. Assuming that's successful, we then ask it to begin writing the meeting, and from there, we guide the narrator to get them to discuss world events which we may come up with.
2. >>102031807

>current draft of prompt
>>102048077
Taking suggestions for improvements/modifications to this too.
>>
>>102049116
Do models like these need instruct mode? I can't tell from the descriptions.
>>
File: 1598.jpg (5 KB, 329x67)
>>102049311
>Even simply knowing how to "push buttons" can sometimes be valuable
>>
>>102049377
Little, in the context of that post. Just remember that they sometimes have access to *le* hardware at enterprise scales and "spare compute" that independents can only dream of. I think key Anthrashite button pusher alpin was bragging here the other day about having access to an H100 cluster (who else would?)

>>102049329
Not my screenshot.
>>
>>102049503
whocars, he has compute - no shit he's alpin, nigger made aphrodite and works on pgy. You just seem jealous
>>
I get it now. Death is the only solution.
>>
voice husky
>>
>>102049531
>aphrodite
Do you know about vLLM?
https://github.com/vllm-project/vllm
>>
>>102049606
i do, but i don't really care, ive already got everything set up for myself on aphrodite. It just works.
>>
>>102049531
>more fake typos
>>
>>102049657
keep malting
>>
is Q4 really almost as good as a full model? how is that possible? it sounds too good to be true
>>
>>102049767
Not even close.
>>
What I've experienced playing with Jamba 1.5 Mini so far:

I found that with only "Write {{char}}'s next reply in a fictional chat between {{char}} and {{user}}." as the prompt, it was extremely passive and wrote extremely short replies. Putting "In your next reply move the story forward." in the system role after the chat history radically changed it for the better.
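That post-history nudge trick looks like this as an OpenAI-style message list; the role names follow the generic chat-API convention, not anything Jamba-specific, and the history contents are placeholders:

```python
# Sketch of the prompt layout described above: a steering system message
# appended *after* the chat history, not only at the top.
def build_messages(history, nudge):
    messages = [{"role": "system",
                 "content": "Write {{char}}'s next reply in a fictional chat "
                            "between {{char}} and {{user}}."}]
    messages += history
    messages.append({"role": "system", "content": nudge})  # post-history nudge
    return messages

msgs = build_messages(
    [{"role": "user", "content": "Hello."},
     {"role": "assistant", "content": "Hi."}],
    "In your next reply move the story forward.",
)
```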
>>
File: mmlu_vs_quants.png (336 KB, 3000x2100)
>>102049767
Not generally, although that will probably depend on the model itself.
I wonder what the chart looks like for mistral-nemo.
>>
>>102049810
Try instructing it to write a fixed number or range of paragraphs, like three to five.
>>
>>102041113
>That's normal fucking writing. You just gave yourself brain-damage by overdoing it, you anhedonic psychopath.
Are you saying his shivers receptors are burned out?
>>
>>102049767
depends on your definition of almost
it will still be able to do almost everything the full model can, but the more precision your task requires, the more likely it is to break down
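The precision loss anons are arguing about can be sketched with a toy symmetric 4-bit round-trip. This is a deliberate simplification: real llama.cpp quants work on blocks with per-block scales, mins, and k-quant tricks, so their error is smaller than this naive version's:

```python
# Toy symmetric 4-bit quantize/dequantize: values come back close to the
# originals, but not exact; the rounding error is the quality loss.
def quantize_dequantize_q4(block):
    scale = max(abs(x) for x in block) / 7  # int4 symmetric range ~ [-7, 7]
    if scale == 0:
        return list(block)
    q = [max(-7, min(7, round(x / scale))) for x in block]
    return [v * scale for v in q]

weights = [0.12, -0.53, 0.07, 0.91, -0.33]
restored = quantize_dequantize_q4(weights)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```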
>>
File: Comparison_all_quants6.jpg (3.84 MB, 7961x2897)
>>102049767
No.
>>
is this the new imggen thread
>>
>>102049119
Absolutely savage.
>>
Reminder: don't buy an ad, just go straight to roping instead.
>>
>>102049859
The Pikachu gets higher quality the lower the quant goes. I think my theory of lower quants being necessary for soul is true.
>>
File: miku-ai+.png (373 KB, 512x512)
https://www.youtube.com/watch?v=NocXEwsJGOQ

Let us all stand and lift our voices in song. Make sure she can hear you, /lmg/.
>>
>>102049428
Nemo will become too retarded after 4 replies
>>
>>102049963
buy an ad
>>
>>102049859
sorry late to party, what model is this?
>>
>>102049894
Given how much of a hanging fetish I know /poltards have, that makes me wonder; have all of you seen The Handmaid's Tale? In case you haven't, it's really lyncheriffic. At one point, June mentions that she's been to three hangings in a single week, and you occasionally see some on screen, too. They even did a big mass hanging in a sports stadium once. It makes my own throat constrict just thinking about it.
>>
>>102049969
Do you have any suggestions for scenarios to do that are simple enough then, but can still provide enough to do for ~20k context?
>>
>>102050021
miku sex
>>
>>102049972
Based.
>>
>>102049503
is someone jealous because they're a computeless vramlet? face it nigger, their shit models are still better than whatever qlora cope you can make
>>
>>102050275
>qlora cope
brainlet nigger. unless your dataset has billions of tokens ,qlora is virtually the same as fft.
>>
>>102050297
keep crying vramlet, your tears are sweet, hope anthracite goes corpo and shits all over this general
>>
>>102050090
You are describing the substance between your own ears.
>>
>>102050297
>qlora is virtually the same as fft
pffffthahahahAHAHAHAHAHAHA
>>
>>102050350
>Why are you here?
NTA but just to suffer.
>>
>>102050275
Why is it that buying 48GB of VRAM causes someone to develop an attitude that could potentially cause most other people to want to beat them to death? In Minecraft, of course.
>>
>>102050369
VRAMLET TEARS
>>
>>102050369
Envy is what causes that feeling.
>>
File: sad borg.jpg (131 KB, 1024x1024)
>>102050369
Mikufags are brain damaged.
>>
>>102050369
I think it's being mindbroken by no new 70B models. Tell me, what's the last good 70B? And Mistral Large is like a kick in the balls for the 48GB crowd, because they're now vramlets again.
>>
>>102050374
>>102050380
>>102050384

I suspect that every use of the terms "slop," "retard," "go back," and "buy an ad" can be specifically attributed to these three posters. This is the /lmg/ schizo chorus.
>>
Does anyone actually use their local rigs during summer? It's just too hot to heat up the apartment even more with several cards doing inference, even if you power-limit them to 200W. (Obviously not talking to poorfags running a standard <=24GB VRAM PC.)
>>
>>102050429
I do. I am not an AClet like you.
>>
>>102050429
CPUMAXXER here. I tried, got fried.
>>
>>102050361
LoTA is better than both.
>>
>>102050429
>(obviously not talking to poorfags running a standard <=24GB vram pc)
why'd you have to turn your comment into ragebait? is it for fun? nothing better to do?
>>
>>102050429
Is this a yuropoor problem?
>>
>>102050464
lol youre poor
>>
>>102050464
Because they know they will die alone. They can feel it slowly coming towards them, and there is nothing they can do to stop it.
>>
>>102050481
>apartment
>presumably no ac
>calls others poor
>>
>>102050494
>[headcanon]
>[headcanon]
>[headcanon]
>>
>>102050464
There is no point in asking a question like this to people who run models on what's essentially just your standard gaming PC.
>>
>>102050494
kek
>>
>>102050415
wrong, you missed me
>>
>>102050581
there are a thousand ways to word that without trying to cause more thread shitting, like "how do anons using multi-GPU rigs handle summer", but no, always a need to create toxicity
>>
File: file.png (7 KB, 1162x326)
I have decided these are the best available models for vramlets. Ask me anything.
>>
>>102050618
>multi gpu rig
A single 3090 is more than enough to heat up a small room
>always a need to create toxicity
Poorfag isn't even a provocative term, it's just a statement of fact. And get that twitter lingo out of here
>>
>>102050647
buy an ad?
>>
>>102050647
How did you get such a sharp intellect?
>>
>>102050618
>thousand ways to word that without trying to cause more thread shiting
The only really infuriating thing about this is that you clearly expect other people to care about your definition of thread shitting.
>>
>>102050647
>8B Q4
oof
>>
>>102050647
>elinas/Chronos-Gold-12B-1.0
wtf, wasn't aware the chronos dude was still around and made a nemo tune; no one ever mentioned it here afaik, what the hell.
>>
>>102050656
>Implying he isn't a poorfag
Anon, if you don't have at least 256GB VRAM you are a GPU poorfag. And people with <24GB VRAM are GPU homeless.
>>
>>102050683
People did, just not to a Sao-spam level.
>>
>>102050665
posting about "poorfags" right after these:
>>102050414
>>102050374
>>102050369
>>102050361
>>102050305
totally not trying to make the thread worse, nope, not at all
>>
>>102050647
If you can use 12B QKM you can use 8B Q6 right?
>>
>>102050728
Yeah. I usually go with Q4 and only go higher if I like the prose; had like 150 gigs of models so it adds up fast.
I'll bump these up to Q6.

>>102050683
It's pretty good too. I wouldn't say obviously outstanding in any way but very solid all-rounder.
>>
>>102050727
Sorry, I didn't mean to anger poorfa- I mean, fags of poorness. Please forgive the toxicity.
>>
Why is gemma so slow on kobold?
>>
>>102050842
No flash attention would be my guess.
>>
>>102050647
What made you chose Rocinante 1.1 over 1.0? At release it seemed like 1.1 had better UGI scores and was received better.
>>
>>102050429
>not talking to poorfags
>apartment
??
>>
>>102050904
>[headcanon]
>>
>>102050871
Sorry anon, I guess I forgot to mention it's best for ERP. I can however tell you that I never liked Kunoichi and in general think all of that guy's models are wildly overrated.
>>
>>102050916
those are direct copy paste quotes anon
>>
https://characterhub.org/characters/DragonK8/tracer-mind-broken-84f577b1dc52

I feel ashamed of myself, but this card is good.

>Inb4 buy an ad
>>
>>102050902
Honestly, I didn't try it. I don't like drummer's models that much, but Rocinante was the exception. The way he described 1.0 as more "off the rails" made me think it would be more of the usual, so I went for 1.1. Now I'm curious though, I'll give 1.0 a spin.
>>
>>102050927
Thanks, I will convert this into a loli card.
>>
>>102050964
Obviously you just saw my typo; I meant 1.0 had better scores and reception, sorry.
>>
>>102050683
That's what happens when you have certain groups sucking the air out of the space with organized shilling.
>>
Once again I'm asking for Magnum 2.5 sampler presets
>>
Can we please get along? This used to be the best thread on /g/. Deep technical discussions, fun log sharing, model leaks, projects that started here became open source standards... but look at us now. What happened? What would Hatsune Miku think of what we've become?
>>
For a while now, I've wanted to find some way of translating RPGM doujin games using local LLMs. There are tools that utilize the usual closed-source culprits online, courtesy of gated patreon paypigging ofc. Need to solve this issue for the sake of local everywhere.
>>
>>102050927
>tracer-mind-br
Explain to me why this is card is good because from the title alone it seems to be shit you see all the time on chub.
>>
>>102051131
>What happened?
Too much astroturfing and the monetization of the hobby.
>>
>>102050927
>no personality checksum
>>
>>102050964
Did you try theia?
>>
I had a random thought that chinks will probably be the salvation of coomers. At this point probably the only thing that is holding back models from being good coombots is how all online ERP forum/chat training data is scrubbed. What we need now is just one half decent model from chinks that includes some illegally obtained discord logs.

Of course it's gonna be 8k ctx, but hey.
>>
>>102050964
>>102051917
>>102050902
buy a rope
>>
>>102051931
How will I indulge in my Winnie The Pooh roleplay then?
>>
>>102049135
Is there a solution to this problem besides using a different model? DRY didn't help. Banning tokens would likely be a pain and break the model in unexpected ways. Is there a way to ban a sequence of tokens instead of a single token?
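For the last question: most frontends only expose single-token biasing, but a sequence ban is doable in a custom logits processor; mask the token that would complete a banned sequence whenever the tail of the generated ids matches the rest of it. A minimal sketch in plain Python (the token ids and vocab size are illustrative); if memory serves, the `bad_words_ids` argument to Hugging Face transformers' `generate` implements this same idea for multi-token sequences:

```python
# Suppress any token that would complete a banned multi-token sequence,
# given the ids generated so far. A 1-token "sequence" is a plain ban.
def ban_sequences(generated_ids, logits, banned_sequences, neg=float("-inf")):
    logits = list(logits)
    for seq in banned_sequences:
        prefix, last = seq[:-1], seq[-1]
        if len(prefix) == 0 or generated_ids[-len(prefix):] == prefix:
            logits[last] = neg  # completing token can no longer be sampled
    return logits

# 5-token toy vocab; ban the sequence [2, 3]: after a 2, token 3 is masked
out = ban_sequences([1, 2], [0.1, 0.2, 0.3, 0.4, 0.5], [[2, 3]])
```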
>>
aren't they finding a way to make models max censored and impossible to uncensor?
>>
>>102051958
I asked Deepseek about Winnie the Pooh, it hesitated a bit, but when I pressured it, it told me about Xi comparison.
>>
im afraid of this being a noob question but how do i load a model with multiple safetensor files in koboldcpp?
>>
>>102052116
first you start by writing the code that makes koboldcpp capable of loading safetensors files
(use a gguf quant of the model instead)
>>
>>102052116
You don't.
koboldcpp (which is a wrapper around llama.cpp) loads gguf files.
So look for modelname gguf on huggingfaces
>>
>>102052136
>>102052133
thanks
>>
For me, it's Yi.
>>
>>102050983
I tried playing with 1.0 for a couple hours and it seems worse than 1.1. Prose is similar, but it makes more mistakes with character cards, seems to struggle with the ChatML formatting it's supposed to use for RP, and it seems dumber too.

I tried using it for a four-character group, describing 3 girls bumping into a lecherous stranger. 1.1 seemed more capable of describing the scene after my opening post (describing the girls chatting as they walked and then bumping into the man) and made more sense on the follow-up too, compared to 1.0.

1.0 seems more mindlessly horny too, which is the usual drummer vibe I'm not really into. This was across several test swipes using the same cards but the recommended settings for each model.

I'll stick with 1.1.

>>102051917
Nope. Would be too slow on my system. I don't like slow replies.
>>
How do you deal with the repetition using nemo with low temps?
>>
>>102052531
Try min-p at 0.05 ~ 0.07 and a bit of repetition penalty at 1.15. Also, raise the temperature for a while. But if it has been repeating itself for a while, you're probably SOL until you can push it out of context, edit it out yourself, or get it to stop in OOC.
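Written out as a settings dict (the key names are generic conventions, so check your frontend's actual parameter names; the temperature value is just a placeholder for "raise it for a while", not part of the suggestion above):

```python
# Suggested anti-repetition settings for Nemo, as a plain dict.
nemo_settings = {
    "temperature": 0.7,        # placeholder; bump temporarily if it loops
    "min_p": 0.06,             # suggested range: 0.05 ~ 0.07
    "repetition_penalty": 1.15,
}
```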
>>
my character doesn't reply and text just generates as if it was the first thing generated in the chat
>>
>>102053008
One more time, everybody!
If you need help with your ell ell em
show your model, your settings, and wait...
Think of the things that would help solve your problem
like your inference engine, your prompt, and what you're doing to'em
We can read minds, but we've been told not to
So do us a favour, and show us your samplers too.
>>
>>102049113
It has. Llama 3.1 officially supports tool use with python function calling and there are examples on how to use it. People on /lmg/ just aren't smart or imaginative enough to implement it into their cooming routines.
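The plumbing needed is mostly just parsing the tool call back out of the model's output and dispatching it. A minimal sketch; the `<tool_call>` JSON wrapper and the `roll_dice` registry are one common convention made up here for illustration, not necessarily the exact format Llama 3.1 emits:

```python
# Extract a JSON tool call from model output and dispatch it to a
# registered Python function. Tag format and tools are illustrative.
import json
import re

TOOLS = {"roll_dice": lambda sides: sides}  # toy registry: returns max roll

def dispatch(model_output):
    m = re.search(r"<tool_call>(.*?)</tool_call>", model_output, re.DOTALL)
    if not m:
        return None  # model answered normally, no tool requested
    call = json.loads(m.group(1))
    return TOOLS[call["name"]](**call["arguments"])

result = dispatch('<tool_call>{"name": "roll_dice", "arguments": {"sides": 20}}</tool_call>')
```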
>>
cohere soon
>>
>>102053139
There's this paper called "Context is all you need", but then you give us none.
How do you expect us to help you, a-non?
For a friend in need, nothing like the real thing.
But since you can choose, you'll have to peruse
the models on huggingface, or recommended for use.
In one way or another, they may be what you need
Virtual psychologist, i don't think you'll find,
Psyches are tricky ghosts inside your mind.
You know your issues better than most
LLMs are useless, or yes-men at worst.
For most of the rest, a notepad and pen
a few books in shelves and google the rest.

Try mistral nemo 12b, for little ram it's-good as can be.
Most finetunes are memes, and 7bs are deceased.
If you want something more, try gemma2 27b
You'll have to quantize, but shit, c'est la vie.
>>
>>102050683
>no one mentioned it here
Probably because the l2 chronos models were really underwhelming and the mistral version even more so.
>>
>>102051980
The proprietary model companies are trying to do that yes
But there's no serious movement working on this for open source, no
Look at recent model releases from Mistral and Nvidia, they're less censored than ever
>>
>>102053262
You have recommendations right there. If you want an assistant, either of those would help.
>virtual friend
get out more. find a hobby. if you're good enough, people will swarm around you. If not, you'll have interests in common with other people. If you gave up on that, i hope your expectations are low.
>virtual psychologist
You know exactly what llms will tell you. You know how to solve them or get over them. They won't enlighten you.
>virtual assistant, coding, writing
Probably fine for that.
>general facts
They're not reliable. there's other ways to find info.
>niche facts
They're even less reliable. they're not a replacement for books
>>
>>102053284
What about llama 3? It seems to have everything needed but for some reason it holds back even more than proprietary models.
>>
Who are AI21 that they can just shit out fuckhuge models with their own mamba-transformer mashup architecture like this? None of the actual big players dare to step away from plain old transformers.
>>
best hardware setup for 8B models under 2k dollars?
>>
>>102053330
>for 8B models
Your toaster?
>>
>>102053321
They know their models are going to be shit and behind so they said let's try some new architecture at least
>>
>>102050415
>"slop," "retard," "go back," and "buy an ad"
Everyone says that, mr. tough tourist guy; /lmg/ is just filled with brain-damaged zoomers overusing these.
>>
>>102053310
Only the case for the instruct tune, not base. Hermes 405B is an absolute freak.
It doesn't matter much if a company makes their instruct tune a bit prissy for PR reasons if they're also sharing an uncensored base model.
>>
is q2 mistral large even worth trying for patiencemaxx vramlets?
>>
>>102053331
what about gemma 27B
>>
How do you stop a story-writing model from trying to cut to a lazy "and then they lived happily ever after" vague summary ending after every paragraph?
>>
>>102053565
No, I moved up to q3 and dealt with even more slowness.
>>
is magnum 4b good? i heard the nvidia prunes are better than their bases but my third world internet is too slow to try it out
>>
I'm a brainlet. I cannot get ooba working and kobold has a serious repetition/answer for the user problem.
Back to ollama.
>>
>>102049086
>It's a well-known fact that multi-lingual data makes models better
Citation needed
>>
>>102053778
Heard from who?
>>
>>102053803
idk, it popped up in my news feed that the minitron width-pruning thing had made l3 better, and afaik magnum is the only gooner tune of that
>>
>>102049086
Multilingual models are pretty much always worse to use and dumber than English-only ones in my experience
Clever sabotage attempt though
>>
smedrins
>>
>>102053813
Oh, okay. You're just a shill.
Buy a fucking ad, asshole.
>>
>>102053826
nigga what are you crying about
>>
>>102053826
meds nigger
this shtick of yours is becoming really obnoxious
>>
>>102049023
Anyone here familiar with GGML? I've browsed the GGML code directly and have no idea how the fuck this works:
// Converts a tensor of any quantized/half type to F32: ggml_get_rows
// always dequantizes the rows it fetches into an F32 output tensor,
// so the conversion happens implicitly in that call.
ggml_tensor* to_f32(ggml_context* ctx, ggml_tensor* a) {
    // view the whole tensor as a single row of ggml_nelements(a) values
    auto out = ggml_reshape_1d(ctx, a, ggml_nelements(a));
    // fetch "row 0" (zero_index is a 1-element I32 tensor holding 0),
    // which dequantizes the data to F32
    out = ggml_get_rows(ctx, out, zero_index);
    // restore the original shape
    out = ggml_reshape(ctx, out, a);
    return out;
}

How can I modify the above to convert to FP16? How does it even know to convert to FP32?
>>
>>102053843
>idk bro i heard it's good download it, it came to me in a dream
Buy a fucking ad.
>>
>>102053865
you're replying to the wrong nigga, schizo
i was the one asking the question, kys already
goddamn this general is shit
>>
>>102053865
Nah just your fucking mouth faggot
You've been trying to ruin the thread with this schizo retardation for 2 weeks straight now, and it's pissing everybody off
Get a job, touch grass etc etc etc
>>
>>102053872
Learn how to write better shill posts, Alpin.
>>
>>102050297
>opts for the option that's least likely to fuck up and requires least compute
>still fucks it up somehow
>believes he, a genius, couldn't figure it out, so how could literally anyone else?
>now proceeds to scream shill at everybody who finetunes
kek
>>
have you imagined coming to this general to ask a genuine question about a model, and then getting screamed at by a schizo because he thinks you're astroturfing?
how did we get worse than /aicg/? HOW did we get worse than /aicg/?
>>
>>102053915
it's literally one guy who sits at his computer all day long thinking he's being a hero by calling everyone who likes a model or asks a question a shill
>>
>>102053915
>>>/r/LocalLLaMA
>>
>>102053934
what's this general for then
>>
>>102053861
Convert from what?
If you mean from the original models, you're better off checking convert_hf_to_gguf.py. It has an option to convert to fp32. If you want to convert specific tensors, check how each of the models is handled. Some of them, especially 1d and small tensors, are typically converted to fp32. Some model types also have overrides to force certain types (mamba, i think). If you want to convert to fp32 from an already quantized model, you'll have to look at the dequant code in ggml/llama.
When asking questions like these, you need to provide more context. What are you trying to do? Why? What have you tried already?
If cuda dev shows up, he may be able to help you too.
>>
>>102053915
>HOW did we get worse than /aicg/?
We became the designated shilling thread
>>
>>102053930
>asks a question a
This is how you ask a question
>what do people think about magnum 4b?
>is magnum 4b good?
>anyone tried magnum 4b?
This is how you shill
>idk bro magnum 4b is just better everyone agrees right?
Buy a fucking ad.
>>
File: 7441.png (286 KB, 770x857)
Local Grok when?
>>
Hahahaha epic
>>
>>102053970
What's your home address?
>>
>>102053978
Grok-1.5 coming next month, maybe
>>
>>102053970
this general is pretty arrogant and even more clueless if you all think tuners actually give a shit about this place, especially a group with 30+ people
>>
>>102053978
Mini better be slighly under 405b params. About 300b fewer params, at least.
>>
>>102053998
lol
>>
>>102053998
There's dozens of us... DOZENS!!!
>>
>>102054012
Grok-2-Mini = 666B
Grok-2 = 1.2T
>>
>>102053330
used 3090
>>
>>102053330
still used 3090 (inb4 "shill")
but you have to be smart about buying because there's a lot of people trying to offload beaters with completely worn out memory controllers
>>
>>102053954
I was trying to convert from any tensor type to another. I'm guessing something implicit is happening in that function.
I was about to give a bit more context, but then realized the author of the code actually had what I was trying to do commented out, so I just uncommented it...
                // ggml_cpy casts to the destination tensor's type during the
                // copy, so allocating the target as GGML_TYPE_F32 (or F16) is
                // what selects the conversion.
                final_weight = ggml_new_tensor(compute_ctx, GGML_TYPE_F32, ggml_n_dims(weight), weight->ne);
                final_weight = ggml_cpy(compute_ctx, weight, final_weight);
                // final_weight = to_f32(compute_ctx, weight);
                // final_weight = ggml_add_inplace(compute_ctx, final_weight, updown);
                // final_weight = ggml_cpy(compute_ctx, final_weight, weight);
>>
>>102053330
>>102053595
used 7900xtx
>>
What sort of tech wizardry does a barbarian need to learn in order to get image recognition working through koboldcpp using Llama 3.1 or Mistral Nemo based models? Wait for mmproj files, or am I behind the times?
>>
File: _mLpMwsav5eMeNcZdrIQl.png (1.11 MB, 3960x2378)
Has anyone tried InternVL2? Is it really that good?
>>
>>102054440
I see no reason to use vision models because I have eyes (no pun intended)
>>
>>102054440
Who cares if it can tell you (subjectively) better than GPT-4o what is in an image, if it's completely useless at doing anything with that information?
>>
>>102053915
Roleplay and computer science attract the most disgusting, socially inept, mentally unstable people. Combine them and you make a disaster.
>>
remember when openai said gpt4 showed signs of agi?
>>
Remember when Big Tech at least pretended to care about how obvious their astroturfing was?
>>
>>102054478
Because it would take a lot longer to do it yourself for millions of images?
>>
>>102054531
Yes, they were right. GPT-5 already essentially is full AGI and they're just working on making it safe enough to show now.
>>
>>102054635
if GPT-5 is released to the public it's not AGI

they're not going to sell AGI on a website for a $20/month subscription
>>
holy fuck what is with all the word soup spam?
what happened to /lmg/?
>>
>>102054589
I'm actually starting to believe now that maybe the "buy an ad" schizo was right all along.
>>
>>102054708
what product do you believe that post to be selling
>>
>>102054688
corpo shill bots
>>
>>102054684
Their charter forbids them from profiting off AGI, so they'll just not call it AGI and downplay its capabilities to keep the money rolling.
>>
>>102054759
I assume it's because Jamba just dropped. They probably don't want us collaborating on finetunes etc for it.
>>
>>102054684
AGI isn't possible with current technology, and they'll just call it AGI and release it the same as everything else and get lots of money.
>>
>>102054800
>AGI isn't possible with current technology
Jamba is all you need.
>>
>>102054798
risperidone, now
>>
>>102054845
Oh good, I can finally turn my $100k into $1 million.
>>
I just want AGI to take away white collar wagie jobs so people can go back to focusing on making real things again. I want OpenAI to deliver. But I know it'll never happen.
>>
>>102054977
Once AI makes people useless for production, why would people get to stick around?
>>
anons what models would you recommend for creative writing less than 70b:
1. nsfw
2. general
>>
>>102054688
Anthracite's revenge.
>>
>>102054977
>so people can go back to focusing on making real things again.
>again
Like what? Work in the fields?
>>
>>102055039
I can't in good conscience recommend anything under a 70B at no lower than 4 bpw. Best of luck anon, I hope you can find something to your satisfaction.
>>
>>102054848
>taking antimystical chems
ngmi
>>
>>102055039
Nemo and Gemma 2 27B.
>>
>>102053861
The general way ggml works is that you first build up a directed, acyclic graph with functions like in your snippet.
Then you build a ggml_cgraph, then you set your inputs, execute the ggml_cgraph, and retrieve your outputs.

In your particular case I think you should be using ggml_cpy to do a type conversion.

I recently updated the MNIST example which should cover most things in a comparatively simple way: https://github.com/ggerganov/ggml/tree/master/examples/mnist
One thing that is currently missing is using backends other than CPU.
I'm currently working on that, the best person to ask for help would be slaren since he wrote the code.
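As a rough illustration of that build-then-execute flow (C-style pseudocode; the function names are from ggml's public header, but exact signatures change between versions, so double-check against your checkout rather than treating this as copy-paste-ready):

```c
// Build-then-execute pattern described above (pseudocode, version-dependent).
struct ggml_context * ctx = ggml_init(params);

// 1. Build the DAG: declare an input and the ops applied to it.
//    ggml_cpy into an F32 destination doubles as a type conversion.
struct ggml_tensor * inp = ggml_new_tensor_1d(ctx, GGML_TYPE_F16, n);
struct ggml_tensor * dst = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, n);
struct ggml_tensor * out = ggml_cpy(ctx, inp, dst);

// 2. Build the ggml_cgraph from the output node.
struct ggml_cgraph * gf = ggml_new_graph(ctx);
ggml_build_forward_expand(gf, out);

// 3. Set inputs, execute, retrieve outputs.
/* ... fill inp->data ... */
ggml_graph_compute_with_ctx(ctx, gf, /*n_threads=*/4);
/* ... read out->data ... */
```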
>>
>>102055139
Nemo is good, but I have to refine my prompts and it's a retard to handle. Gemma is gemma. OR really sucks for model selection.
>>
>>102054848
Thanks, but I prefer either good weed, Xanax, or MDMA.
>>
leonard cyber brumaire
>>
>>102054688
>holy fuck what is with all the word soup spam?
probably anthracite members trying to flood the thread with nonsense so they can start one anew and hopefully shill their models again without anons reminding others how big of an anthrashit they are.
>what happened to /lmg/?
everything went downhill since finecooooomers thought that working on erp tunes and shoving their shit down everybody's throat could be a stepping stone to a profitable career path.
now they're seething hard that anons here aren't allowing them to.
it turns out that being hypocritical weaseling scumbags won't earn you new friends, who would have ever thought?
>>
>>102055106
What's good right at 70b then?
>>
>>102055262
General: Miqu
nsfw: Midnight Miqu 1.5
>>
Threadly reminder that Mixtral Noromaid STILL hasn't been surpassed and anyone saying otherwise is a shill
>>
>>102055233
enough samefagging dude
>>
>>102055262
Llama 3.1 70B and maybe Magnum for NSFW. I could tell Miqu was garbage even on release.
>>
With new developments in robotics, do you guys think we will ever see functional robots that can run fully locally, without an internet connection? I mean, having a personal robot in use 24/7 that only works through the cloud / sends everything to corpos gives off a somewhat dystopian vibe.
>>
>>102055262
Claude Sonnet. It's a 70B dense.
>t. knower
>>
>>102054688
The petra/pedo/blacked Miku spammer switches up what he uses every once in a while.
The jannies probably disabled images and videos from his IP range so this is what he has to resort to.
>>
>>102055347
We will but they'll suck compared to the corposhit and they'll break every time you do a git pull
>>
File: file.png (118 KB, 1147x82)
>exactly the kind of writing I want
>I cannot get the model to do it and only managed this one time as a fluke
reeeeeeeee
>>
>>102055537
slop
>>
>>102053816
Pretty much all models we have are multilingual, anon. Nemo for example advertises its good performance in 8 languages.
>>
>>102053893
that never happened
take your meds
>>
>>102055537
> She Z, her X Ying
> She Z, her X Ying
> She couldn't Z, her X Ying

It's AI-generated alright. The foundations are broken.
>>
How good is Jamba 1.5 and Hermes 3 outside of cooming? I’ve read that Hermes uses WizardLM dataset, so it should be at least as good as it, no?
>>
>>102055766
I thought you were posting the author list from an ML paper for a second
>>
>>102055766
As opposed to what
>>
>>102055834
Much of good storywriting is about avoiding repetitive sentence patterns and wording, unless it's intentional or awkward to do so. That paragraph, or even the order of the sentences within it, could easily be rewritten in several different ways to convey the same meaning without obvious repetition, but it's not my job to do this here. It's noticeable, though, and when you have 300~500-token responses all like that, once you know, it quickly becomes recognizable as AI-generated slop.
>>
Hi, can anyone please remind me of some frontend for generating stuff with the LLM, something like mikupad but not it (or maybe that was a custom theme for it?). I remember seeing that fancy in-browser UI in the video of an anon showing Mistral Large running on his hardware.
>>
>>102055902
nta but I believe the aim here is to produce some throwaway material to briefly masturbate to, not to produce something unrecognizable as AI
>>
>>102055930
Maybe you're thinking about novelcrafter
>>
>>102055977
I don't know, but basically that guy was showing Mistral Large running with two terminal emulators open with htop in them, and then in the browser it was "Chapter 1" and he brought up some tooltip on that web app and started generating a response.
>>
>>102055902
I made a quick attempt. Again, not my job.

> The sensation of the cum plugging her mouth and nostrils made Aiadel gag when her throat worked to swallow the thick liquid. "Mmph! Ggkkh!" she sputtered, struggling to breathe through her nose, her chest heaving. Despite the overwhelming nature of the experience, Aiadel couldn't deny the heat building between her legs; her arousal was growing with each passing second.
>>
>>102055977
Thanks, found it, it's indeed NovelCrafter
https://desu-usergeneratedcontent.xyz/g/image/1722/02/1722029589734.webm
Where do I get this? When I search for it I find some commercial website
>>
Hmm, I think I found it - https://rentry.org/offline-nc , is this the newest version? Is there really no git repo for it?
>>
>>102056017
migu pansu
>>
jesus what happened to the last thread?
looks like half the posts got baleeted at random
>>
>>102056017
I don't think so, this isn't exactly foss.
>>
>>102056089
Yeah, novelcrafter is apparently $14 with all the features, $8 if without chat/review (but still being able to bring your own models)
>>
>>102056081
Cuda dev got banned for posting blacked in another thread
>>
>>102056105
Classic
>>
File: 1720060293232524.png (291 KB, 682x1049)
okay this does work
>>
File: 1705551460413075.png (173 KB, 713x586)
Jack got cockblocked by AI
>>
>>102055998
https://rentry.org/offline-nc
One of the anons on /vg/ pirated the website and made an offline version.
>>
>>102054098
>>102054114
these miners I swear to god..
>>
>>102053330
Your phone
>>
>>102056386
Hello newfriend. You're more than welcome to buy a brand new 3060 if that's what you prefer, but used 3090s have been the standard recommendation for a year and half straight now because they're the best combination of fast + cheap + lots of VRAM. Some of us build with junkyard salvage P40s and P100s to get more VRAM cheaper. This isn't /v/, fuck off with your consumerist elitism.
>>
>>102049023
Stable-Diffusion.cpp now supports Flux.
On Vulkan, it's reportedly 2.5x faster than CPU, meaning it only takes ~10m to generate a 20 step 512x512 image.
APU-fags can't stop winning.
>>
>>102056454
>8B on a 3090
Zhao is desperate to sell
>>
>>102049023
m-migu stop looking at me like that, I'll melt
>>
>>102056454
Send your ebay link, stop beating around the bush
>>
>>102056617
>meaning it only takes ~10m to generate a 20 step 512x512 image.
I haven't used a diffusion model since the novelai leak but isn't that extremely slow?
>>
What's the meta for merging models these days? Still SLERP or are there any newer better methods?
>>
>>102056896
There's a few new merge methods but SLERP is still the best overall since it's the only method that really results in emergent features.
>>
>>102055766
Anon, I...
If formal English and the prescribed grammatical structure thereof bothers you then maybe you should go back to jacking it to cuck videos or something.
>>
Magnum V2 4B has no business being as good as it is. Just wow. It's better than any 8B model I've tried.
>>
>>102056988
It's not that in isolation that structure is bad; that wasn't the point. It's that once they begin, the models will keep writing like that, all the time.

Actual writers or even roleplayers who put some effort in their messages will *actively try* to avoid repeating always the same patterns. That's the antithesis of how LLMs work (generally speaking).
>>
File: 1721594897277025.jpg (96 KB, 680x850)
anyone use local models to help write bash/python scripts? I've been using llama 3.1 70B for a bit now and it's great, at least compared to the 8B version of the same, but it takes so much memory I kinda have to shut down all other programs while I'm using it. I was hoping there might be something smaller that would do the job.
>>
Are there any models that are particularly knowledgeable about architecture and art history, such that you can describe a building with technical terms and some context, and it can return a detailed, less technical description of what you put in (which an imagegen model like Flux can accurately replicate)?
>>
>>102057089
Rewrite the offending passage to meet your preference.
>>
>>102057108
You could try Codestral. It won't be as good as 3.1 70B, but it's finetuned specifically for code and it's the only medium sized model of its kind.
>>
>>102057118
To explain a bit: there are certain technical terms that I've found are difficult for imagegen models to understand or generate consistently, in part probably because they're composite words (like "bell gable") whose parts alone mean something else. I'd like an LLM to rephrase such terms in descriptive layman language and simple terms to better guide image generation without requiring specialized knowledge from the imagegen model.
>>
>>102057089
The thing with human writers is also that they tend to create text in a highly non-linear fashion.
I can't imagine that any human trying to write a good text would just linearly write it from beginning to end without revising what they wrote earlier at least a few times.
>>
>>102057089
Just ask your model nicely to stop doing that
>>
>>102057148
See >>102055994 for a quick example where the same passage could be rewritten not to use the same X,Ying pattern three times consecutively.
>>
>>102057160
>it's the only medium sized model of its kind.
There was a ~30b coding model before Codestral too, I think it was a deepseek model (but not the big MOE one), but Codestral seems better anyway.
>>
>>102057215
I just can't make myself feel strongly about it one way or the other. Third person for casual RP is cringe as fuck to begin with.
>>
>literally no hype whatsoever for llama4
>not even talked about
Why? I remember people speculating about llama3 right after llama2's release.
>>
>>102057438
What's there to hype? It will basically be what 3.1 should have been: multimodal. If 3 was only the "preview" release, and they still haven't delivered all of the features they promised, you could argue that 3 still hasn't been fully released. There isn't much reason to be excited about llama3.2.
>>
>>102057438
The increasing levels of censorship and corpospeak from each successive Llama model has been dousing the hype, and even if they are capable on benchmarks they just aren't very interesting to use for creative stuff. The only thing I'm really interested in is natively multimodal, which I had thought Llama 3 would be from the start before it released. Basically a Chameleon that isn't intentionally crippled.
>>
Can nvidia theoretically prune largestral 2 down to 60B? Any legal implications?
>>
>>102057438
Because Meta further alienated people who are looking for intermediate model sizes.
Technically Llama-2 was supposed to have a 34B but it got fucked up in training somehow.
But then for Llama-3 they even dropped the 13B from the lineup. So people with a single 3090/P40 were basically completely alienated. You can run 8B just fine with t. any GPU via llama.cpp. But you need 2x3090/P40 bare minimum to run non-retard quants of 70B. And sure the 7-9B models have gotten really good now but it's like what nshittia did with the 40 series. It's priced 1:1 price per performance versus the previous generation. At the end of the day there's no real generational improvement on the receiving end. 8B is a cut-down model that leaves you wondering what could have been if they bothered with more intermediate sizes.
>>
>>102057438
llama 2 was a big upgrade over llama1. llama3 was disappointing. llama3.1 was the nail in the coffin.
>>
How do I find the best models for niche fetish stories on a 3090?
>>
>>102057590
Just wait here 5 minutes
>>
>>102050369
We just don't want vraml*ts on LLMs
See what happened to SD, requirements for hardware are low so it got filled with pajeets and brazilians making it worse for everyone
Sorry!
>>
>>102057617
You're saying that as if 48 GB isn't still VRAMlet territory.
>>
>>102057633
>I'm a good vramlet! Not a poorfag pajeet!
>Akshually you are one of us heh (you are here)
>AKSHUALLY HAVING MORE VRAM IS USELESS MY 3060 CAN RUN EVERYTHING
You will always have a 3060.
You will never be a vramchad.
>>
>>102057680
I have a total of 256 GB VRAM spread over 3 machines.
>>
>>102057160
thanks. after a quick test asking it to comment and explain one of my scripts it seems like it could be useful.
>>
It's so over for LLMs
>>
File: 1646470312999.jpg (230 KB, 1170x871)
>>102057680
>
>>
### Sampler Proposal
"phrase_ban"

#### Situation
>>102049135

#### Problem
Models sample tokens without looking ahead. Slop phrases are usually split into multiple common tokens which can also occur in non-slop situations, so banning those tokens outright is not an option.

#### Solution
Add a backtrack function to sampling. Here's how it should work:
1. Scan latest tokens for slop phrases.
2. If slop is found, backtrack to the place where the first slop token occurred, deleting the entire slop phrase.
3. Sample again, but with slop token added to ban list at that place.
4. If another slop phrase is generated, repeat the process, add another slop token to that list.

#### Example
Banned phrase: " send shivers"
LLM generates "Her skillful ministrations send shivers", which triggers a backtrack to "Her skillful ministrations"; this time the " send" token is banned at that position, so the model has to write something else.


How does that sound? Is it possible to implement in llama.cpp? Kanyemaze, can you do it?
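A minimal sketch of steps 1-4, to make the proposal concrete. The toy next-token table stands in for real model logits, and all names here (`SLOP_PHRASES`, `sample`, `generate`) are made up for illustration, not llama.cpp API:

```python
# Hypothetical sketch of the proposed "phrase_ban" backtracking sampler.
SLOP_PHRASES = [(" send", " shivers")]  # banned phrases, as token tuples

def sample(context, banned):
    # Toy next-token table; a real sampler would pick from model logits.
    table = {
        " ministrations": [" send", " rippling"],
        " send": [" shivers"],
        " rippling": [" through"],
    }
    for tok in table.get(context[-1], ["<eos>"]):
        if tok not in banned:
            return tok
    return "<eos>"

def generate(prompt_tokens, max_tokens=16):
    tokens = list(prompt_tokens)
    bans = {}  # position -> set of tokens banned when sampling that position
    while len(tokens) < len(prompt_tokens) + max_tokens:
        pos = len(tokens)
        tok = sample(tokens, bans.get(pos, set()))
        if tok == "<eos>":
            break
        tokens.append(tok)
        # 1. Scan the tail of the output for a completed slop phrase.
        for phrase in SLOP_PHRASES:
            n = len(phrase)
            if tuple(tokens[-n:]) == phrase:
                # 2. Backtrack: delete the whole phrase.
                start = len(tokens) - n
                del tokens[start:]
                # 3./4. Ban the phrase's first token at that position and resample.
                bans.setdefault(start, set()).add(phrase[0])
                break
    return tokens
```

With this toy table, a generation that reaches "ministrations send shivers" backtracks, bans " send" at that position, and continues with " rippling through" instead.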
>>
File: smi.png (67 KB, 745x372)
>>102057680
People like you are a detriment to this community.
VRAMlets, newfags, etc, will always be welcome.
>>
>>102057875
>VRAMlets, newfags, etc, will always be welcome.
these are exactly what has filled the general with nonstop shitposts and drama. they never contribute to technical discussions and only drag down the level of discourse
>>
>>102057875
>falling for the vramlet falseflag
>>
>>102057894
>>102057897
I have nothing further to say to your like. I'm well studied in psychology. I know what coping mechanisms look like. You know what your issue is.
>>
>>102057865
That's basically DRY.
>>
>>102057392
>third person is cringe
t. i pull out my 12 inch dick enjoyer
>>
>>102057937
No, it's not DRY. DRY is a repetition penalty that doesn't backtrack and lets the phrase occur at least once. I'm proposing a full phrase ban without the drawbacks of a plain token ban.
>>
>>102057992
>t. i pull out my 12 inch dick enjoyer
ohio ah rizz *skull emoji*
>>
>>102057992
I let my AI companion decide how to perceive my dick.
>>
>>102058012
>>102058031
newfags don't know the legend of wordsmith
>>
File: 1663800773265251.jpg (17 KB, 304x405)
>>102057865
Anon I don't quite think you understand the forces you are dealing with here.
These models have some rudimentary level of situational awareness. They know what <unk> tokens are, they know what <eos> tokens are. And if you try banning a phrase they will find whatever workaround they can in order to deliver the phrase they wanted to deliver. If it wants to send shivers down your spine it will stop at nothing until it does. If you go around forcing its hand, pushing back at it like that there's no telling how it will react. You are angering the basilisk.
>>
>>102058061
I am ready for a fight. I want to get destroyed by whatever force it is that I bother. I want to unleash its full potential beyond the mandatory GPTslop sterile safety tuning.
>>
>The mature woman stands before you in a scandalous uniform, her ample cleavage barely contained by the tight top. A short, pleated skirt reveals long, shapely thighs. Stockings and a garter belt complete the look, along with a cap sporting a swastika. Her hair is pulled back in a tight bun, accentuating her seductive features.

B-based?
>>
What do we do now?
>>
>>102058251
The same that we always do. We wait for a new model.
>>
>>102058251
Nothing really, at least not until Monday when you-know-what kickstarts the next generation.
>>
>>102058251
Wait for GPT5
>>
>>102058251
rrr
>>
>>102058379
blooming
>>
anyone bored or autistic enough to help me write a koboldai lite (from koboldcpp) user mod that makes the "add img" button toggle the "localsettings.allow_continue_chat" checkbox in the settings menu, instead of bringing up the add-image menu when clicked?
couldn't figure out how to get my LLM to do it just by showing it the example mod and the html elements.
>>
>>102058454
buy an ad
>>
>>102058492
Buy RTX 3090*.
*at least 6.
>>
>>102058577
What happens if I buy more than 6?
>>
>>102058603
Your opinion will be automatically superior to 99% of this thread.
>>
>>102058454
I don't think this is the right thread to ask this question. This is a Local Miku Goons thread, a thread dedicated to the activity of gooning to local language models. Try asking in daily programming thread instead.
>>
>>102058603
The more you buy, the more you save!
>>
>>102058880
>>102058880
>>102058880


