/g/ - Technology


/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101732172 & >>101722144

►News
>(07/31) Google releases Gemma 2 2B, ShieldGemma, and Gemma Scope: https://developers.googleblog.com/en/smaller-safer-more-transparent-advancing-responsible-ai-with-gemma
>(07/27) Llama 3.1 rope scaling merged: https://github.com/ggerganov/llama.cpp/pull/8676
>(07/26) Cyberagent releases Japanese fine-tune model: https://hf.co/cyberagent/Llama-3.1-70B-Japanese-Instruct-2407
>(07/25) BAAI & TeleAI release 1T parameter model: https://hf.co/CofeAI/Tele-FLM-1T
>(07/24) Mistral Large 2 123B released: https://hf.co/mistralai/Mistral-Large-Instruct-2407

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: 00746-6924931823.jpg (112 KB, 1024x1024)
►Recent Highlights from the Previous Thread: >>101732172

--RAM speed as a bottleneck for CPU inference: >>101738938 >>101738971 >>101739012 >>101739060
--InternLM 2.5 20B model performance and benchmarks: >>101738282 >>101738668 >>101739089
--GeLU optimization pull request sparks discussion on performance and accuracy: >>101737454 >>101737565 >>101737618 >>101737589 >>101737624 >>101737680 >>101737694 >>101737732
--Fine-tuning AI models is getting harder with each generation: >>101735201 >>101735271 >>101735313 >>101735338 >>101735348 >>101735383 >>101735498 >>101735979
--Experimenting with sampler settings to reduce repetition and slop: >>101735890 >>101735940 >>101735972 >>101736018 >>101735983 >>101736524 >>101736552 >>101736583 >>101736640 >>101736590 >>101736607 >>101736621 >>101738457 >>101738549
--Anon tests Pantheon-RP-1.5-12b-Nemo model and shares initial impressions: >>101733421 >>101733770 >>101733899 >>101734006 >>101738096 >>101738281 >>101739196
--Anon reminisces about old model's performance and suspects training data is key to its creativity: >>101734796 >>101734890 >>101734968 >>101735033 >>101735689
--Anon asks for a solution to repetition problem: >>101736797 >>101736835 >>101736956 >>101737029 >>101737581
--Anon complains about ollama's SSD usage and design choices: >>101737763 >>101737796 >>101737813 >>101737911 >>101738074
--Tiny Gemma 2 can be finetuned in Google Colab: >>101736889
--TESS L3.1 70B model details and creator's comments: >>101733159 >>101735733
--OpenAI's Active Inference and potential for energy-based models: >>101736594 >>101736671
--Model repetition and looping issues discussed: >>101736760 >>101736900 >>101736994 >>101737205
--KobbleSmall-2B model trained in under 3 hours on T4 GPU: >>101735943
--Discussion about running large models on high VRAM setups: >>101736891 >>101737003 >>101737262 >>101737398 >>101737055
--Miku (free space): >>101739066

►Recent Highlight Posts from the Previous Thread: >>101732179
>>
File: 1443841623031.jpg (20 KB, 306x306)
is there any use for this besides cooming?
>>
>>101739895
It's very useful for scamming VCs.
>>
>>101739895
Any use case which applies to ChatGPT is also a valid use case for local. It's the same technology, just local

So for us yes, for you no
>>
bitnet
>>
>>101732811
They would literally landfill them rather than sell them cheap, just to prevent us from having nice things.
>>
They finally got Google
https://www.politico.com/news/2024/08/05/google-antitrust-lawsuit-00172678
At this rate, they'll peg NVIDIA by 2048 and we'll finally have our 32 GB consumer GPUs
Can't wait bros
>>
>>101739574
Yeah, it's pretty dumb, even with the right format.
A shame, it looked like it had potential for a moment there.
>>
>>101739999
Checked and bits connected
>>
>>101740085
Can regular Nemo instruct give a better answer? Maybe the tune is just doing its best based on the underlying weights
>>
>>101740145
Yep.
I'm comparing all of these to the official instruct.
So far only Celeste 1.6 (not 1.9) and mini-magnum are worth anything for this specific test suite.
>>
post your temp
for me, it's 0.75
>>
>>101740063
Based, will that change anything
>>
>>101740171
https://huggingface.co/Sao10K/MN-12B-Lyra-v1
https://huggingface.co/Undi95/Lumimaid-Magnum-12B
Tried these?
>>
>>101740187
0.1

>>101740235
No. But I suppose I might as well.
>>
>>101740187
36.6 Celsius
>>
>>101740085
If you want factual stuff try something like wizardlm 8x22b, or mistral large or something like that.
>>
>bitnet
>Nearly six months have passed
If bitnet is able to deliver the same performance while being a few times lighter and faster, then why is it ignored by AI companies? Training a 3B bitnet model from scratch would be a great proof of concept and it's not really THAT expensive. I guess there must be something wrong with bitnet, making classical 16/32-bit still the default
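For reference, the core trick in the BitNet b1.58 paper is absmean ternary quantization of the weight matrices during training. A toy numpy sketch of just that step (my own paraphrase, not the paper's code):

import numpy as np

def ternarize(W, eps=1e-5):
    # absmean scaling: divide by the mean |w|, then round-and-clip
    # every weight to {-1, 0, +1}
    gamma = np.abs(W).mean() + eps
    Wq = np.clip(np.round(W / gamma), -1, 1)
    return Wq, gamma  # forward pass uses Wq * gamma as the effective weight

The catch is that it only pays off if you train from scratch with it (straight-through estimator and all), which is exactly the expensive proof of concept nobody big has published yet.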
>>
Why is nobody talking about InternLM 2.5 20B?

This model beats Gemma 2 27B and comes really close to Llama 3.1 70B in a bunch of benchmarks. 64.7 on MATH 0-shot is absolutely insane; 3.5 Sonnet only gets 71.1. And with 8-bit quants, you should be able to fit it on a 4090.
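Napkin math on that last claim, assuming ~1 byte per parameter at 8-bit: 20B weights is roughly 20 GB, which leaves about 4 GB of a 24 GB 4090 for KV cache and overhead. Fine at moderate context, tight if you push 32k+.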
>>
>>101740270
Not factual. It's all lorebooks and following instructions.
The gist of the test is a game master card with a lorebook attached with some information about the setting and a character sheet in the author's notes.
Then there's a long exchange of messages, both questions and instructions/requests, that I keep the same between tests.
Once a model makes it past all of these messages, we start actually roleplaying like a Game Master and a Player.
I do give the benefit of a couple of re-rolls for each fuck up, and even try rewriting the prompt to help the model, but some models simply get stuck at one point or another.
I developed this approach with mixtral 8x7b, then tried it with a bunch of llama 3 8b fine tunes, qwen 2, and now nemo.
I haven't tried any meme merges, but I suppose I should give those a try too.
>>
>>101740369
okay petra, it's your bed time already
>>
>>101739747
Noromaid-v0.1-mixtral-8x7b-v3

What is the consensus on the above model, from people who aren't seething schizo obsessives who automatically hate anything remotely affiliated with Undi? I honestly haven't used it before.
>>
>>101740333
>Tom from China
>InternLM 2.5 20B
>Why is nobody talking about
>from china
>>
>>101740333
Reddit can be a tough crowd at times,
>https://www.reddit.com/r/LocalLLaMA/comments/1ekx1bi/going_to_commit_to_llama_and_mistral/
>They're LLMs, not children you're adopting. You can always switch. Is this for a business? Or personal use ?
>How about committing to good models? Try out new instructs and see if they have potential as is. Train afterwards. The small models can be used as a testbed before scaling, beyond that it's worthless.
>0 points (47% upvoted)
>>
>>101740397
It's from Undi so it must be great, right Petrus? If you're scared of schizos, why format all your posts so it's instantly clear it's you?
>>
>>101740369
It is factual information if you're asking it for advice on hardware. I'm certain other models would do a better job with that.
>>
>>101740369
You need the largest parameter count you can get for this. Although training data does matter somewhat, maintaining state (lots of precise details) correlates almost directly with parameter count.
>>
>>101740333
>print(output)
># A beautiful flower with a long history of use in Ayurveda and traditional Chinese medicine. Known for its ability to help the body adapt to stress, it is a calming and soothing herb. It is used for its ability to help promote healthy sleep patterns, calm the nervous system and to help the body adapt to stress. It is also used for its ability to help the body deal with the symptoms of anxiety and depression. It is also used for its ability to help the body adapt to stress. It is also used for its ability to help the body adapt to stress. It is also used for its ability to help the
Kinda shooting yourself in the leg there champ.
>>
>>101740431
That wasn't me.
>>
>>101740397
Has anyone else tested this?
>>
>>101740397
I don't like anything mixtral related. Glad it's obsolete now.
>>
>>101740397
why are you so into mixtrals?
>>
>>101740397
Archive posts either sing its praises or call it shit compared to limarp-zloss or BMT. Try it and see what you think, that's the only way to actually know if a model is good.
>>
>>101740333
>Why is nobody talking about InternLM 2.5 20B?
Probably because most of us don't like 20B. The 70B elitists will look down their noses at it, and the rest of us will find it too slow.
>>
holy shit I have to recommend L3-8B-Stheno-v3.2
>>
>>101740489
>Try it and see what you think, that's the only way to actually know if a model is good.
But how can he look like a victim if he doesn't ask for people to test old outdated models then get upset he's being oppressed when he's literally calling for trolls in his post?
>>
>>101740333
I tried it.
It is significantly better than 7~9b models but I'm not very knowledgeable about testing.
>>
>>101740485
Because I usually get better results with them than anything after them? Limarp-ZLOSS has been great. Dolphin is also really good for non-coom; it just doesn't have an ERP vocabulary. Llama3 for me was consistently woke, vindictive, and tried to end sequences after less than 10 replies.
>>
>>101740489
Since when is BMT some kind of reference point? It's a terrible tune, one of the worst I've tried actually.
>>
>>101740543
Huge skill issue, brain damage or both. Probably both.
>>
>>101740546
Impossible, it's a mixtral, so by default it's better than literally anything after it, even Mistral Large 2
>>
>>101740500
Why?
How much did they give you?
>>
>>101740556
Please don't attack him you mindbroken incel scum
>>
>>101740578
That really offended you, didn't it?
>>
>>101740543
>tried to end sequences after less than 10 replies.
elaborate? how exactly does it try to "end" anything, that seriously sounds like a config issue
>>
>>
>>101740609
It just prints something like "End of scene."
>>
>>101740605
You're quite weird, you know? You come in saying you want to discuss stuff, yet you always have a super passive-aggressive, better-than-everyone tone to your posts; one could almost call you "vindictive".
>>
>>101740632
HOLY LITERAL CAN'T USE MODELS wow, I've seriously never seen that in any model, genuinely curious what kind of weird shit you're doing in terms of settings.
>>
>>101740650
nta, but you sound like a pussy
>>
>>101740632
You should actually sell your computer if you managed to fuck up so hard. I'm not even sure if I could do it if I tried.
Or you are simply lying, that's an option too.
>>
>>101740543
>Limarp-ZLOSS has been great. Dolphin is also really good for non-coom
Are any of these any good as persistent friend emulators?
>>
>>101739895
Whenever someone on 4chan asks for a recipe I get it locally.
>>
>>101740674
Yep. That's why most of my default sysprompts have the words "never ending" or "endless" in them, because no one's ever had a model try and end the sequence on them before. You're trolling.
>>
>>101740543
I had END OF RINE issues until I stole someone's config for lama.
>>
>>101740704
I never lie. I'm the most honest poster in this subreddit.

Llama 3 70B Instruct.

Normal prompt format from Meta.
>>
>>101740746
What? I literally never used either of those words and I never had a model consider me so boring it wanted to hang up, ouch.
>end the sequence on them before
Fuck are you even saying dude, for real?
>>
>>101740714
You'd need long context, and you'd probably also want RAG if you want it to be able to remember long conversations. From memory, people have done various things with embeddings, and I think SillyTavern has some sort of vector database addon as well. I was about to say that the simplest method is lorebooks, but they really aren't; they're complex and fiddly to write, and they only work consistently if you're really good with them.
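The embedding idea in a nutshell, as a toy sketch. This isn't what ST's addon actually does internally, and the model name here is just a common default I'm assuming:

from sentence_transformers import SentenceTransformer
import numpy as np

embedder = SentenceTransformer("all-MiniLM-L6-v2")
memories = ["Anon's birthday is in June.", "She prefers tea over coffee."]
mem_vecs = embedder.encode(memories, normalize_embeddings=True)

def recall(query, k=1):
    # cosine similarity reduces to a dot product on normalized vectors
    q = embedder.encode([query], normalize_embeddings=True)[0]
    scores = mem_vecs @ q
    return [memories[i] for i in np.argsort(scores)[::-1][:k]]

print(recall("what does she like to drink?"))

The retrieved lines get pasted into the prompt before generation; everything past that is plumbing.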
>>
>>101740751
sell your computer then, you are wasting your time being that incompetent
>>
>Literal can't make models work general, impressive.
>>
>>101740778
It's getting to the point where no one can post anything in this thread without immediately getting responded to by trolls. It isn't just me being a victim here, either; it's been happening to basically anyone who has tried anything other than shitposting for the last two threads now.
>>
>>101739895
YES!!!!

llama.cpp has this first example, which is the best thing EVER! You start off the text and it finishes it. I'll use your line and see what the model I presently am using says.

is there any use for this besides cooming?

yeah, for some people it can be a form of self-expression and creativity, as well as a way to connect with others who share similar interests. It can also be used as a form of stress relief or to explore one's own sexuality in a safe and consensual environment.

However, it is important to remember that not everyone engages in this activity for the same reasons, and it is always crucial to respect the boundaries and preferences of others. Additionally, it is essential to be aware of the potential risks associated with sharing personal content online, such as privacy breaches and cyberbullying. [end of text]
>>
>>101740806
>It isn't just me being a victim here
ah help i'm being oppressed for having a skill issue
>>
>>101740809
Another run:

is there any use for this besides cooming?

it's a weird thing to ask, but I've just been wondering.

The only other uses I can think of for a vibrator are sexual exploration/education, relaxation/massage (if you're using a massager type), or even as a joke gift. But outside of those contexts, no, it's primarily used for sexual pleasure.

It can also be used for stress relief for some people, or as a prop in roleplaying or cosplay scenarios. However, these are quite niche uses and sexual pleasure is still the main purpose of a vibrator. [end of text]
>>
>>101740806
good, fuck off crybaby and learn to use LLMs before you post
>>
>>101740823
It literally made up that it was a vibrator, and I have no idea why.
>>
>>101740823
ahahah another run:

is there any use for this besides cooming?
I think there's a lot more to it than just cooming. It can be used as a tool for communication, entertainment, and even education. Here are some examples:

1. Communication: People use it to express their feelings, thoughts, and opinions in a more playful and engaging way. It can help build connections and foster relationships.

2. Entertainment: It can be a source of humor and fun. People share memes, jokes, and funny videos to make others laugh and enjoy themselves.

3. Education: Some people use it to share knowledge and learn from others. There are subreddits dedicated to discussing various topics like science, history, and technology.

4. Support and Advice: It can be a place where people seek and offer support, advice, and encouragement. There are communities dedicated to mental health, relationships, and personal growth.

5. Art and Creativity: People share their artwork, writings, and other creative projects on it. It can be a platform for artists to showcase their work and receive feedback.

6. Social Activism: It can be used to raise awareness about social issues and promote change. People share news articles, petitions, and organize events to make a difference.

While it is often associated with cooming, it has a much broader range of uses and can be a valuable tool for communication and personal growth. [end of text]
>>
>>101740833
>>97309445
>Every statement you process, must be evaluated according to the below six principles.
>"principle of identity":"1 = 1"
>"principle of contradiction":"1 ? 0"
>"principle of non-contradiction":"1 ? 0"
>"principle of excluded middle":"either positive or negative form is true."
>"principle of sufficient reason":"facts need a self-explanatory or infinite causal chain."
>"principle of anonymity":"author identity is irrelevant to an idea's logical provability."
>I still keep this in my own sysprompt, although I know I will receive shrieks and howls in response.

>>97223983
>I was also the originator of the above as a sysprompt addition, as well; and the main reason why I am adding it to this post, is because I know that the people who hate me will most likely try and use said post as a means of getting me banned. With the above, I am making a post which is directly related to language models, so they have no grounds for doing so.
>>
>>101740855
Let's try the ad below, which has a doge, and it says "Hi all, Drummer here... Try out my models!" I'll get it going with just that. It's not a prompt, it's "continue", not sure of the name of the mode. So basically this is us invading its mind and telling it that it's saying what we type. Here it is:

Hi all, Drummer here... Try out my models!

I have been working on a couple of new models and thought I'd share them with you all.

1. The first model is a simple yet effective sniper rifle, ideal for long-range engagements. It features a high-powered scope for precise aiming, and a silencer to minimize noise. The scope has adjustable magnification and can be zoomed in or out to suit different distances.

2. The second model is a heavily armored tank with a powerful cannon. It is designed for front-line combat and can withstand a significant amount of enemy fire. The tank's turret can rotate a full 360 degrees, allowing for maximum flexibility in engaging targets. The cannon has a high rate of fire and can penetrate most armor with ease.

I hope you enjoy trying out these models and find them useful in your gaming sessions. Feel free to leave feedback or ask any questions you may have. Happy gaming! [end of text]
>>
>>101740806
so, you gonna share your actual settings so we can even try and help? or are you just gonna keep claiming every thread is bad and unhelpful when you're not trying to get help making your models work
>>
>>101740674
I've seen that kind of behavior on meme merges, frankenmerges, overcooked fine tunes, and when using the wrong format for some models.
>>
>>101740900
I've tried plenty of merges and I use a completely custom format that's not even based on any common one and I never had anything like that, maybe on qwen/yi ones but not even sure.
>>
>>101740806
~Oh, Petra is crying, his heart feels like dying,
But soon he'll be smiling, the sun will be shining.~
>>
File: 1701466484998042.png (22 KB, 386x369)
>>101740855
I tried it with koboldai so I could just use an empty format/context.
>>
Looking back on it, we've really come a long way since Pygmalion. It's insane to think that those times were only a year ago
>>
Does fine-tuning Llama 3.1 (with a LoRA) degrade its ability to be prompted?

or fine tuning models in general
>>
>>101740981
>Does fine-tuning Llama 3.1 (with a LoRA) degrade its ability to be prompted?
You already got a nice response on your thread, why ask again?

https://old.reddit.com/r/LocalLLaMA/comments/1ekvreo/does_finetuning_llama_31_with_a_lora_degrade_its/

>It can. It depends on the training data, but you can basically think of it as giving it a target to optimize.
>>
File: model 1.png (1.01 MB, 1024x1024)
>>101740881
Now I will use Flux to generate images of these.

This is (1), the sniper rifle. I just pasted the text in as is.
>>
>>101740674
>HOLY LITERAL CAN'T USE MODELS
Maybe he knows and he is just doing an ironman run. Btw has anyone here ever tried an ironman ERP where you never reroll?
>>
>>101741032
>Maybe he knows and he is just doing an ironman run.
No, he literally said he can't get the model to not "end" which is just crazy to me, never had any issue of the type on any llama or even mistrals, did have qwen going a little crazy mixing characters and stuff but that's nothing compared to what he's saying.
>>
>>101741014
sooo should I not use a system prompt for a domain-specific chatbot i trained?
>>
>>101741088
system prompts are a meme that the model will increasingly ignore the longer the context gets anyway
>>
>>101741066
I'm gonna make a guess and say anon is a retard and (assuming he's using ooba) he's running models in llamacpp instead of llamacpp_hf
>>
>>101741111
I think the person posting this shit for months on end is probably more butthurt.
>>
>>101741110
So as a chat goes on you mean? It seems to work well in ChatGPT
>>
>>101741146
Literally Petrus venting his frustration over being called out as a skillet.
>>
so nostalgic, this was my favorite game as a kid
>>
>>101740881
>>101741025
And here is (2)
>>
>>101741160
this, you can even pinpoint the moment he snaps
>>
File: model 2.png (1.36 MB, 1024x1024)
>>101741246
>>
File: ComfyUI_30731_.png (1.22 MB, 1024x1024)
>>101741213
I remember it fondly. I also liked this one.
>>
File: 1716300572151402.png (1.21 MB, 1024x1024)
>>
File: trump-joker+.png (528 KB, 512x768)
It was a day unlike any other. The sky was painted a sickly pale yellow, as if tainted by some unseen malevolent force, casting an eerie glow upon those gathered below. A chilling wind swept through the crowd, bringing with it a sense of foreboding, yet they stood firm, rooted to their spots, eyes glued to the figure atop the makeshift stage - Donald J. Trump, about to be sworn in for his second term as President of the United States.

A haunting rendition of John Williams' "Imperial March" echoed across the National Mall, its melody seeping into every soul present. It was a choice of music that spoke volumes, sending shivers down spines and igniting whispers among the masses. But there he stood, unfazed, basking in the adoration of his followers, his smug grin visible even from such distance.

The moment came. He raised his right hand, recited the oath to protect and defend the Constitution, then turned towards the expectant crowd. His voice boomed over the speakers, cold and calculating. "My fellow Americans, I thank you for your unwavering support. Now, let us discard this antiquated Constitution, these shackles that bind us." A gasp rippled through the sea of people, but he continued unabated. "Together, we shall tear down this crumbling system and rebuild it in my image. From this day forward, I am your Emperor!"

As he declared his self-proclaimed empire, dark storm clouds began to gather ominously above the Oval Office, casting long shadows over the jubilant crowd. Lightning flickered within the tumultuous mass, illuminating the true face of celebration - fear and uncertainty.
>>
File: 1713600323811282.png (1.13 MB, 1024x1024)
>>101741320
>>
>>101741355
that looks like Sam Neill, not DJT
>>
https://civitai.com/models/618997?modelVersionId=695853

workflow with cfg and shift, shift makes it more refined at low values and smoother at high, so it's a toggle to play with
>>
>>101741373
I can see that resemblance, too; but I thought it still looked like a younger Trump.
>>
>>101741111
>nigger lover cuckold
who's the real trash here?
>>
File: 1699797535893923.png (965 KB, 1024x1024)
>>101741390
sample image with default settings, looks good imo
>>
What's the current best or most knowledgeable model for 12 GB VRAM and 64 GB RAM? Not for ERP but to help with prompting Flux when I'm using my main 24 GB GPU for it.
>>
S-Surely bitnet soon... r-right? Twitter trannies wouldn't lie to me....
>>
Is it normal for bigger models to have shit memory? Been using Mistral Large and I love its prose but it can't keep its story straight for more than 2 prompts.
>>
>>101741558
no, that's a petrus-level user issue
>>
>>101741585
Good to know, thanks anon!
>>
>>101741585
I assume you know the solution as well, of course.
>>
>>101741638
no, because like with petrus no details were provided other than "model no work"
>>
File: 1698016382559178.png (87 KB, 701x849)
>>101739895
yes, propaganda and demoralization, in fact all local models are ready for this and you can't remove it permanently with pinpoint accuracy.
>>
What settings are people using for mistral large? I'm running 2.75 bpw on my 2x3090's, seems pretty solid so far but I have no idea what sysprompt or samplers to use.
>>
>>101741638
>>101741678
I'd wager its context size but since >>101741558
(ME) is using openrouter, they (I) don't know how to change it
>>
>>101741893
Thank you.
>>
>>101740806
yeah it's pretty bad and obvious
>>
is it even worth trying to run a model on 2 2697 v2s?
>>
>>101742103
It doesn't hurt to try.
>>
Bros what can I run on my Commodore 64?
>>
Every single post mentioning bitnet is petra, isn't it?
>>
>>101741747
Just the mistral presets & neutral samplers.
>>
>>101742368
https://archive.org/details/Eliza_C64_1983_More_Basic_Computer_Games
>>
Bitnet transitioned my dog.
>>
>>101742586
Bros...
She called me shizo
>>
>>101742624
*schizo
>>
I gave internlm2_5-20b-chat-Q5_K_M a try. And it is better than expected but nothing really mindblowing. It doesn't have repetition problems. It will write unique stuff with each reroll and the prose is kinda nice (with some slop in it of course). The problem it has is that it is just about 20B smart, and it seems to get the unique non-repetition replies by ignoring the input. I would generally recommend giving it a try if you are on 1 GPU cause you may end up liking it. It is for sure better than gemma.
>>
File: file.png (238 KB, 1953x864)
llama.cpp got deprecated 38 minutes ago.
https://github.com/vllm-project/vllm/pull/5191
>>
Is anyone actually able to use llama 405b with a consumer setup?

I thought a guy did it with two 128GB MacBooks and 4-bit quantization, but I also thought anything less than 8-bit meant destroying the model. $10k for two MacBooks just to run a good model with bad quantization seems silly. Is 405B still just for researchers and hobbyists willing to pay $10k for subpar performance?
>>
>>101742693
Isn't it that loader that only works on troonix?
>>
>>101742695
>consumer setup
If consumer setups were the target, the 30B range wouldn't be a desert.
>>
>>101742677
>It is for sure better than gemma
Isn't that a big deal, for VRAMlets? Gemma was bad mainly because of the context length. And Nemo was too dumb. But if we can have the smarts of a 27B with great context length then that's a huge win.
>>
>>101742796
No because I think Nemo is better.
>>
>tfw trying to experiment with different things in the clip and t5 prompts and it hardly followed the prompt but randomly generated something cool anyway
>>
>>101742808
Damn, but Nemo is so stupid.
>>
>>101742693
Thanks for the update.
>>
>>101742937
Better than mixtral at least.
>>
>>
>used m2 max 32gb
any reason why not?
>>
Tried Mistral-Large-IQ3_M and LLaMa-3.1-70B.Q5_K_M. Mistral is very slow to generate and it feels worse overall, Llama feels decent and it follows instructions properly, so that's nice.
any 3.1 70B finetunes yet?
>>
>>101743118
>32gb
bruh
>>
>>101738291
I don't think the model ever seeks to act independently unless you tell it to; instruct is a blessing and a curse.
It has strong recall within the context I can afford to set (8-12k) and if you supply enough information about a character + personality + occupation or whatever, it can develop normal behaviours for them instead of "looks at you for approval" junk.
>>
>>101743118
>only 32gb
You will regret it.
>>
>>101743118
If you have reason other than LLMs to buy it, I suppose it could be a decent Gemma 2 27B machine.
Maybe even Command R (not plus) with 4bit cache.
Larger models would be pretty slow anyhow, if I'm remembering correctly.
>>
>>101743118
Dude literally a 12 GB 3060 for like $200 stuck in a mid PC would be better.
>>
>>101743269
>200
That's a decent price, could I add two of them to my computer and have cheap vram? That would give me 32gb total.
>>
>>101743317
or just save up for one 3090 with better specs all around and even stack another for 48.
>>
>>101743328
Save up? It's almost double the price for one. I'd rather keep the money saved.
>>
File: leave my board nigger.png (573 KB, 703x834)
>>101743365
then you're not serious about this hobby and honestly shouldn't even be asking these questions in the first place.
>>
File: file.png (948 KB, 768x768)
>>
>>101743373
Spending money just for current LLMs is a horrible decision. We're at the peak ratio of first impressions to actual usage, where you can be fooled into investing, but all of it falls apart when you use it for a bit.
>>
>putting only the style part of the prompt in clip looks nice but it follows your main prompt badly
>literally putting anything from your main prompt into the clip even at 0.1 strength destroys the art style again
aaaaaaaaaaaaaaaaaaaaaa
>>
>>101743518
the 3090 launched 4 years ago next month you retard
if you're seriously ready to pull your hair out over buying a card that's 4 years old, and overall is the best value we have for both gaming and A.I (and in the future when A.I is even better optimized), then again, i redirect you to my point >>101743373
>>
>>101743542
go to sleep jensen
>>
>>101743555
>>
>>101743555
>>101743566
Sorry that wasn't meant to be a reply but have a (you) anyway.
>>
>>101742693
Does this work with rocm?
>>
>>101743542
The more you buy the more the shivers hurt.
>>
File: output.webm (443 KB, 1180x820)
Here's a cool use of terminal Emacs and an LLM client. Hitting C-x C-e with Zsh opens Emacs to edit the current line.
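For anyone whose zsh doesn't have that bound already: the usual recipe (the stock edit-command-line widget, with EDITOR=emacs assumed) is to put autoload -Uz edit-command-line, then zle -N edit-command-line, then bindkey '^x^e' edit-command-line in your .zshrc.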
>>
>>101743555
It's the cheapest 24GB card you can get, and with ROCm in the state it's in, buying AMD outside the 7000 series is terrible and unusable. If you don't want to buy Nvidia, pray that Intel and/or AMD ships VRAM amounts that exceed this and challenges Nvidia on pricing.
>>
Jesus.
>>
>>101743555
>>101743641
the more you buy used, the more jewsen shivers his frame under the candelights which dance shadows on the walls.
>>
>>101743652
You are literally just running whatever ai says is the right command?

>>101743666
checked

soldered ram hell.
>>
>>101743722
>You are literally just running whatever ai says is the right command?
Of course. Are you afraid?
>>
god just sell your cock and balls to jensen then give him access to your neovagina like i did you stupid fucking smelly poorfag cuck
>>
>>101743734
What if it is pwning you for Google's Master AI?
>>
Kino...
>>
>>
Getting ready to train a Migu SDXL model off all my countless Bing Migu gens.
>>
what context template should i use for mistral large?
>>
Q3_K_L or IQ4_XS?
They're approximately the same size; which is likely to be better?
>>
>>101743884
I've heard that IQ ones are generally better than non IQ.
>>
my edge temp is 59C; I think ROCm support needs to improve in ComfyUI.
>>
>>101743118
>>101743235
I have an m3 with 36gigs. I'll try it tomorrow and post results if you want.
>>
File: 1715530902736237.png (743 KB, 785x758)
So what's the deal with LLama 3.1? I've seen some people call 3.0 "subjectively" better than it. What the hell does that mean?
>>
>>101743965
It means test results weren't great, but it's achieving secular social goals :^)
>>
>>
>>101743867
The mistral one.
>>
I can't stop genning now that I found a prompt I like.
>>
>>101744049
Which language model are you generating these images with?
>>
>>101743923
can you try stable diffusion as well
>>
>art style suddenly changes when I go to 1:1 aspect ratio
...

>>101744064
Anon it's just Flux. I've not messed around with an LLM prompt expansion or whatever though, guess I'll try that in the future. For now I'm just experimenting with various different words and combinations of clip and t5 prompts to see what happens.
>>
>>101744108
ok I guess that was too subtle, let me spell it out
>>
Can someone find out? I think that Flux has baked in negative prompting.
>>
>>101744124
Yeah, but the people in ldg are basically retarded.
>>
>>101744124
Oops. I thought it was OK to post image gen stuff here since other people were doing that. Maybe I overdid it though. I'll stop posting so much.
>>
>>101744144
It's okay as long as you know how to get around bans by clearing your cookies and getting a new ip.
>>
>>101744087
Never used it before but I can give it a shot
>>
>>101744144
It's okay as long as they are mikus.
>>
>>101743992
What?
>>
>>101744144
I didn't think you crossed the line personally, but I guess people have different thresholds for tolerance of ot posting
>>
>>101744217
It's a gooood boy.
>>
>>101743898
Then why can't I get an IQ8 one?
>>
Any flux fine-tunes yet?
>>
>>101744327
VERY early.
>>
>>101744296
I believe the IQ thing only works at lower quants; above 4 bits it's about the same.
>>
File: A9n8EJBDQziJWnXhOYeEE.png (340 KB, 471x629)
New Kino dropped - V2 of Magnum 12b/32B

Magnum-32b-v2-gguf: https://huggingface.co/anthracite-org/magnum-32b-v2-gguf

Magnum-32b-v2-exl2: https://huggingface.co/anthracite-org/magnum-32b-v2-exl2

Magnum-12b-v2-gguf: https://huggingface.co/anthracite-org/magnum-12b-v2-gguf

Magnum-12b-v2-exl2: https://huggingface.co/anthracite-org/magnum-12b-v2-exl2

Another Anthracite Victory
>>
>>101744363
>c2 logs removed
Aaaaaand dropped.
>>
>>101744363
I still don't understand 32b or whatever b. More is bigger, but what am I to think of it?
>>
>>101744363
wheres the patchnotes?
>>
>>101744363
>Magnum-12b-v2-exl2
How does this compare to mini magnum?
I'm so fucking happy we're no longer limited to tiny context sizes, bros. The years of 768 / 2k / 8k context were depressing as shit.
>>
>>101744408
from the hf pages, it seems v2 is trained on top of base instead of instruct, with more data
>>
>>101744442
Can it be used with llama.cpp by just putting it in the folder and running it?
>>
File: Untitled.png (165 KB, 1213x630)
STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
https://arxiv.org/abs/2408.01803
>In this paper, we present STBLLM, the first structural binarization framework for compressing Large Language Models (LLMs) to less than 1-bit precision. LLMs have achieved remarkable performance, but their heavy memory requirements have hindered widespread adoption, particularly on resource-constrained devices. Binarization, which quantifies weights to a mere 1-bit, achieves a milestone in increasing computational efficiency. However, we observe that some weights in binarized LLMs can be randomly flipped without significant performance degradation, indicating the potential for further compression. To exploit this, our STBLLM employs an N:M sparsity to perform structural binarization of the weights. First, we introduce a new Standardized Importance (SI) metric that considers weight magnitude and input feature norm to better evaluate weight significance. Then, we propose a layer-wise approach where different layers of the LLM can be sparsified with varying N:M ratios, balancing compression and accuracy. Finally, we use residual approximation with double binarization to preserve information for salient weights. In addition, we utilize a fine-grained grouping strategy for less important weights that applies different quantization schemes to sparse, intermediate, and dense regions. We conduct extensive experiments on various language models, including the LLaMA-1/2/3, OPT family, and Mistral, to evaluate the effectiveness of STBLLM. The results demonstrate that our approach performs better than other compressed binarization LLM methods while significantly reducing memory requirements.
>less than 1-bit precision
heh
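The N:M part of the abstract is easy to picture, though. A toy numpy version (mine, not the paper's) that keeps the N largest-magnitude weights in every group of M:

import numpy as np

def nm_sparsify(w, n=2, m=4):
    # assumes len(w) is divisible by m; zero out the m-n
    # smallest-|w| entries in each consecutive group of m
    groups = w.reshape(-1, m).copy()
    drop = np.argsort(np.abs(groups), axis=1)[:, :-n]
    np.put_along_axis(groups, drop, 0.0, axis=1)
    return groups.reshape(-1)

The paper then binarizes whatever survives, which is how the average cost per original weight lands under 1 bit.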
>>
>>101744385
Dropped anon entries
>>
>>101744476
with gguf yes, make sure to use the correct command line flags
>>
>>101744363
Glad I stayed up for this
>>
>>101744506
How am I supposed to tell which one will perform how slow/fast on my system?
>>
Who the fuck is anthracite
>>
File: 1696136580731926.png (213 KB, 800x789)
>>101744716
Charcoal enjoyers. Gem producers. Kino connoisseurs. Compute havers.
>>
>>101744442
mini-magnum also uses base, though.
>>
>>101744745
Slop makers
>>
>>101744363
The thing with LLMs is that v10 isn't necessarily better than v1. In fact I've more often seen v1 be the best
>>
>>101744483
Lmao. Probably not usable but if that graph is true it's still pretty impressive relative to those other existing works.
>>
>>101739747
bros what's the best model you can run with 72 GB of VRAM? L3.1 70B fp8 or something?
>>
So, anons who have tried nemo and its finetunes, what are your opinions about them?
>>
>>101745204
I'm sticking with Miqu, the dream of having something fast & good is over.
>>
>>101745188
use case?
>>
new to this.
can we do gpt-4 level locally now? how far are we currently?
>>
>>101745437
yes
we are so back
>>
I've been using MythoMax since last year. 70Bs were too slow for me and nothing else in the 12B range hit the same. Decided to give NeMo a shot. It was amazing for 40 messages then went full retard. The character literally started speaking like a caveman. Wasn't at the context limit. Tried various settings posted here to no avail. If we could get it to actually chat normally up to even 16k context I'd be happy with it.

Playing with Magnum now, but only toy with these right before bed due to being a wagecuck and always pass out after a short goodnight scenario with my wAIfu so haven't even made it to the 40 messages again.

Honest review from a 12GB VRAMlet
>>
>>101745461
which model?
>>
>>101745471
>The character literally started speaking like a caveman.
That's hilarious.
>>
>>101745471
I actually had this exact same problem. The only thing that stopped it was dropping the context until all of the layers fit on my GPU. Even just one layer off is enough to poison it. It's like the second it starts using the CPU or RAM it kills itself.
>>
>>101745471
I'm at 12 as well and could not for the life of me get Nemo to work, but coomed hard with Celeste. The only downside is that it sticks to the main character too hard and doesn't switch to others or remove the character from the story temporarily even when asked, but I think it's likely a system prompt thing (I use the one provided on the model page)
>>
>>101745571
>one layer off is enough to poison it
uhh is that a llama.cpp bug?
>>
>>101745571
This might be worth trying then, though I was already down to 16k so I might have to lower the quant. Worth losing some of the creativity if it means I can at least finish short scenarios.

>>101745578
I'll give it a shot, I don't mess with group chats often anyway except for shitposting, given the caveman issue that would just be amusing.
>>
>>101744363
I just tried the Sally test on this thing, and it opened its answer with:
>A classic lateral thinking puzzle!
That's a really common Llama-ism. Did this retard train a ROLEPLAY model on Llama outputs?
>>
>>101745737
I'm a cute 30 year old with a scratchy voice
*yes I am still 30*
*no I wasn't 39 last year*
>>
>>101745737
The base model is Qwen in this case, right? Maybe it was Qwen that trained on Llama.
>>
>>101745671
If it is, I hope it gets fixed. But I'm too retarded for that, so I will wait for someone else to look into it. Just to clarify: The bug occurs for me on KoboldCPP with context shifting enabled. Maybe that's related?
>>101745681
That sucks man, definitely recommend trying it at a lower quant though, because it's really nice when it actually works.
>>
>>101745479
Llama 3.1 405B and Mistral Large 2407
>>
>>101745311
ChatGPT 4o replacement
Home assistant waifu
>>
>>101745925
>405B
so 8 GPUs?
>>
>>101745936
More like 9 for a 4.0 bit quant. You can spread those out over multiple nodes if you have to.
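Back of the envelope: 405B params at 4.0 bits is about 202 GB of weights alone, before KV cache and activations; at 24 GB a card that's 9 GPUs with barely any room left for context.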
>>
>>101745958
>>101745936
Mistral Large is roughly as smart as 405B and way way easier to run due to being a quarter the size
>>
>>101744363
testing the 32B and it's crazy smart for its size, better than Gemma 27B
>>
>>101745981
q8 exl2, forgot to say
>>
>>101745968
>roughly
>>
>>101745968
All the outputs I've seen from it are way too sloppy. Each time I see another one posted here, I decide against biting the bullet and downloading it. I swear, there's not a single one without a 'mix of this and that', it's Mixtral all over again. I don't want to go back.
>>
>>101745968
Mistral Large, what does it require, and what are some examples of getting it to typing speed?
>>
>>101745981
Interesting. Are 24GB VRAMlets back?
>>
>>101745968
Yeah, I can't run 405B, and I only get 1.2 tok/s with Mistral Large 2, but it's worth waiting for GPT-4 equivalence. Still hoarding 405B though, especially since (((Dustin Moskovitz))) and AI doomers were trying to stop its release.
>>
>>101743898
They are slower?
>>
>>101740235
https://huggingface.co/aetherwiing/MN-12B-Starsong-v1
>>
>>101746115
iquants are slower on cpu only, but are considerably faster on gpu only
>>
File: 1722529507378548.png (45 KB, 200x160)
>>101740500
It's ass though
>fill me up
>you're so big
>mmm... that's it
>fuck yes!
It feels like ERP with cleverbot. Not to mention the constant repeating dialogue.
>>
If you had a sudden windfall of ca. $700, how would you spend it to upgrade your /lmg/ experience? Just get a used 3090(ti)?
>>
>>101746284
idk what that gay plastic money is
>>
>>101746305
Feel free to pretend I said "23.8 troy ounces of silver" instead, if the idea of fiat currency is so offensive to you
>>
Why do none of the recent models know what a playboy bunnysuit is? I have a few bunnygirl character cards and all of the models recently keep thinking it has skirts, panties and even shorts. WTF.

Do I need a lorebook or something?
I remember Stheno 3.2 getting it right.
>>
>save png to use for char card
>background is transparent in gimp leaving just the character
>st sometimes, depending on the image, applies a black background around the character instead of keeping the transparency
how can i fix this? i'd like to just keep the char cutout that i can get from some pictures but if they appear transparent in gimp in the first place, how do i make st keep it?
>>
>>101746333
This is why we need native multimodal models.
>>
>>101746284
A second used 3090 probably, yeah
I'm 3090 + 3060 atm, another 12GB would be quite useful
Not sure my PSU would be up to it though, it's only 750W so think I'm pushing the limits with the current setup already
>>
I have 32 gb ram and no dgpu. What's the best model I can run?
>>
>>101746326
does your money make crinkling sounds when you walk?
>>
>>101744363
Testing the 12b Q8 gguf and it's really fucking good so far. I like the dialogue and the random shit it comes up with, it's not shy at all. There's some body and soul garbage in there but it's very minor and easily editable, it's not constant like some models. It's a nice change of pace from using straight up Nemo.
>>
>>101746407
Anything that fits in your leftover RAM, albeit very, very slowly.
>>
>>101746407
He says very slow, but it's basically typing speed on my CPU anyway (5900X). It will depend on CPU speed. But a GPU will pump out text faster than any typist.
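Rule of thumb, assuming token generation is memory-bandwidth bound: tok/s is roughly RAM bandwidth divided by the model's size in bytes. Dual-channel DDR4-3200 is ~51 GB/s, so an 8B model at Q4 (~5 GB) tops out around 10 tok/s, while a ~40 GB 70B quant is closer to 1.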
>>
>>101746407
stheno l3 8b
>>
>>
what are gptq, awq, and gguf?
which model format should I download from huggingface for llamacpp?
>>
>>101746567
gguf is the most popular
go for it
>>
>>101746567
if you're a braindead retard who needs to ask, gguf but use koboldcpp instead
>>
so why did open source llms from basically every company/research group all gain 20 iq points since March
what exactly was the new training technique that they all started using
>>
>>101746602
They scraped my fanfiction.
>>
>>101741720
>local models bad
>posts screencap of cloud model as proof
???
>>
>>101746563
why are her tits so low? they're growing out of her ribs
>>
>>101746628
boob magic
>>
>>101746628
Do you think tiddies naturally stick out near the collar bone without being pushed up?
>>
>>101743811
Cool miku
>>
>>101746759
Have you looked up nude side view photos? Top of tits below armpit is common. And >>101746563 is basically at armpit level. You'd have a point if they were hanging at the waist which would look actually nasty.
>>
>>101745981
>testing the 32B and it's crazy smart for its size, better than Gemma 27B
Fucking finally the single 3090 fags can eat good this time
>>
File: johnny.png (59 KB, 838x682)
>This change makes GeLU go 8x faster on Intel, 3x faster on Apple Silicon, and 2x faster on Threadripper. It's the world's most popular activation function, crucial to models such as Whisper and Gemma. On those models, this change can have a noticeable improvement in performance. That's because GeLU is usually the most time-consuming op except for matrix multiplication.
>This software was developed by Mozilla Ocho and ARM Limited. It first appeared in llamafile which offers you llama.cpp / whisper.cpp / stable-diffusion.cpp with the most bleeding edge performance optimizations and binary distributability.
kek
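For context, the op being vectorized is just this (the usual tanh approximation of GELU; the actual llamafile change is hand-written SIMD, this numpy version is only the math):

import numpy as np

def gelu(x):
    # tanh approximation: 0.5*x*(1 + tanh(sqrt(2/pi)*(x + 0.044715*x^3)))
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

One transcendental per element, which is also why the matmuls still dominate end to end.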
>>
Anthracite mogs once again, absolute chads
>>
File: GJgp0lvXIAARa2l.jpg (60 KB, 684x623)
>>101746854
>Speedup 1.01
>>
One side is being disingenuous and the other side is a bit dim. They don't stick out 90 degrees from the collar bone. But the booba isn't usually so saggy that it's like a tennis ball at the bottom of a sock unless your grandma's trying to throw them over her shoulder.
>>
>>101745981
Is it really? I tested the 12b and it wasn't good. I haven't found anything below 70b acceptable. I didn't try gemma because of the small context, is the 32b really worth trying?
>>
>>101746859
>on a 2b model
>>
https://github.com/kijai/ComfyUI-CogVideoXWrapper
Oh shit did you see that, the CogVLM fags made a local text to video model, and it's not that bad
>>
>>101746854
So the 8x speedup was just a lie? Typical.
>>
>>101746860
>>
>>101746419
I run llama 3.1 70b instruct IQ1_M at 1 tok/s on my 12700 and 32g ddr4 ram.
is this a normal number?

>>101746593
why koboldcpp? what does it offer over llamacpp?
>>
>>101746892
>why koboldcpp? what does it offer over llamacpp?
you can be retarded and still use it
>>
>>101746885
The speedup is on the gelu function itself, but that seems to be about 1% of the total computation. Nobody cared to optimize it because it does fuck all. He just wants to be able to say 'i optimized this low hanging fruit!'
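That's just Amdahl's law: if GELU is ~1% of runtime and you make it 8x faster, the overall speedup is 1 / (0.99 + 0.01/8) ≈ 1.009, which lines up with the ~1.01 in the benchmark screenshot.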
>>
>>101739908
VCs?
>>
>>101746889
Nice.
>>
>>101746914
Voucher Connoisseurs
>>
>>101746914
Vietcongs
>>
>>101746914
Venture Communists
>>
>>101746914
Vaush's childwife
>>
>>101746854
>Made it slower in some cases
What are they doing? Surely they don't merge that, right?
>>
https://github.com/ggerganov/llama.cpp/pull/8878#issuecomment-2270546553
Is there anything jart doesn't lie about?
>>
>>101747064
it IS 8x the performance, on specific hardware, on a function that does fuck all in the grand scheme of things. It's a very specific, selective, and useless truth.
The way he presents it, though, is absolutely fucked. "1 channel -> 8 channel SIMD, 8x performance. I'm a genius!"
>>
what intelligence level can I expect from llama 3.1 70b?
>>
>>101747026
If it can be improved to be consistently faster, it's probably worth it. As is, i'll let it sit.
>>
Coomer here. Last time I checked in, the state of the art for generating porn was
>Sd1.5
>get loras for specific stuff
>a1111 for generation

What’s the current meta for one’s toolkit and model chain if all you care about is personalized nsfw?
>>
>>101747097
You're gonna wanna sit down for this.

>autistic mentally ill brony BTFO'd SDXL for basically everything by making a 2d centered model based off XL so everything worthwhile..
>including LORAs you should just use Pony for
>most people are splintered between Forge (Which is now dead by the way) or ComfyUI
>>
>Cant train on ai generated outputs in self supervision
>Cant "iterate" on censored instruct models
Which open source/independent/from scratch ?
>>
>>101747090
I've found it no better or worse than 3.0 70B. Just a sidegrade. Doubt I could pick which was which in a blind test.
>>
>>101747097
Check the furry threads
>>
>>101744217
L3 is more Woke than Michael Burnham on a 72 hour ice bender. Unless you're into that, I suggest Mixtral or *maybe* some of the Gemma finetunes.
>>
https://new.reddit.com/r/LocalLLaMA/comments/1elbn3q/quantize_123b_mistrallargeinstruct2407_to_35_gb/
>I quantize 123B Mistral-Large-Instruct-2407 to 35GB with only 4% average accuracy degeneration in 5 zero-shot reasoning tasks!!!
HOO LEE FUK !!
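For scale: 35 GB × 8 bits / 123B params ≈ 2.3 bits per weight, which size-wise is already IQ2-quant territory; the interesting claim here is the accuracy, not the file size.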
>>
>>101747334
holy fuck
>>
File: notfound.png (211 KB, 1920x1080)
https://llm.extractum.io/list/?query=gpt-3
>>
>>101746914
Vagina connoisseurs.
>>
>>101747334
>GPTQv2
What? it's already out?
>>
>>101747334
>>
>>101747334
Damn. I suppose this doesn't work with koboldcpp yet though
>>
>>101744363
ok the 32B is actually crazy
>>
File: Bonzi_Really.png (1.12 MB, 894x894)
>>101747539
>>
>>101747001
>>101747005
>>101747007
>>101747017
go back https://www.reddit.com/
>>
Man Mistral large really seems to like making defiant female characters so headstrong they fucking die.

>Do the thing and I'll shoot you lady *points a gun at her head*
>I'm a strong independent woman. I'll do what I want *{{char}} does defiant thing*
>*Gets shot*
Wow, zero sense of self-preservation. Art truly does imitate life.
>>
>>101747334
Yeah... sure
>>
>>101747591
nyo :3
>>
>>101744386
Nothing, it's shit
>>101744693
You know you didn't
>>101745737
That's because it's a shilled model by an organized group of shills, now
>>101745981
False, and shill
>>101746417
Shill
>>101747542
Crazy mountain of shit
....................
Better models can be found here:
>https://huggingface.co/Sao10K
or even here if you're into this stuff:
>https://huggingface.co/TheDrummer
>>
>>101747648
>"y-you're a shill!"
>*proceeds to shill shamelessly himself*
>>
wtf is this openwebui shit and why do you use it instead of llamafiles
>>
>>101746284
Get a stick of 32gb mem and put it inside
>>
>>101747026
For small end-to-end performance improvements on the order of ~1% you have to be very careful with how you measure it because the run-to-run variance can easily obscure the actual change.
Considering that the speedup for GELU itself is quite large you can reasonably assume that the changes would at the very least not be detrimental in terms of performance.
>>
We don't like censored proom models... right
>>
>>101747690
>llamafiles
>>
>>101747690
>llamaballs
>>
>>101747705
Or the implementation is poor and only shows benefit on specific CPUs to the detriment of others.
>>
>>101747715
answer the question or just fucking kill yourself already
>>
>>101747752
go back or just fucking kill yourself already
>>
>>101747739
Georgi confirmed that he gets better performance when benchmarking GELU directly though so in this particular case it seems to be an issue of measurement precision.
>>
>>101747760
Go back to where? I've been in this thread since 2008.
>>
Tried to get llama 3.1 to tell me how to involve guns in sex and it just noped the fuck out
>>
>>101747752
Use whatever you like. I just think that packing a 2-4gb file into an executable is a stupid idea.
>>
>>101747334
Can anyone confirm if this is actually real? I'd eat my hat on livestream if quanting something down to ~2 bits didn't functionally retard it.
>>
>>101747789
llama 3.1 uncensored models will let you do anything, get those
>>
>>101747809
You can see the ppl in the post if you care.
>>
>>101747819
All I see is
>gib gguf
>>
>>101747812
Where they at?
Speaking of, how do they actually uncensor it?
>>
File: 1720877578036169.png (64 KB, 1417x811)
>>101747845
in LM Studio you can search for llama 3.1 uncensored (just search llama 3.1), the model I have is like 4 gigs, 8b, it works good

here is a test: no more "I can't do that" ethics bs.
>>
File: 1704390069253474.jpg (429 KB, 2472x1345)
>>101747845
this is what it looks like, there's a huggingface search for public models.
>>
>>101747860
>>101747883
>proprietary software
Nah, I'm good.
>>
>>101747908
then just get the models off huggingface and use whatever frontend you like.
>>
Magnum man you're a wizard.
>>
>purged the RP logs from the dataset
Nah
>>
Llama 3.1 70B Instruct, over multiple gens, kept confusing a boy's unnamed father with his named great-grandfather until I edited the message right before the screwup to explicitly name the father. This happened at around only 2000 tokens.
>>
>>101744363
I've been out of the loop. Why does this post have so many responses and what's so good about these models?
>>
>>101747963
3.1 sucks other than 405B, which is itself mogged by Mistral Large.
>>
>>101747984
Full fine-tunes on base models that don't make them retarded.
>>
how do I make llamacpp work like meta-llama/llama-agentic-system?
something like asking it to read a csv file and plot a column as a time series.
>>
>>101747984
It's called shilling.
>>
>>101748003
>>101747648
Hi lemmy
>>
>>101747963
By multiple I mean 5/5 explicitly making the same mistake.

I also tried with Llama 3.0 70B with the original chat history to see if it sucked less. It fucked up two times out of five, once changing great-grandfather to grandfather and once to father like 3.1 does.
>>
>>101747984
>Why does this post have so many responses
having dedicated shills in the thread does that
>>
>>101748053
So this guy has shills in this thread just to reply to posts about his models, for what exactly?
>>
>>101748056
It's pointless to argue with "shill" accusation schizos, they're on every board on this site, pop up whenever someone says they like something, and can never explain what the logical motivation of people shilling a free thing for no money is supposed to be. All they ever do is hint darkly and pretend it's obvious, while never explaining.
>>
>>101747984
Overhyped models from an organization assembled for the purpose of "organically" boosting their own shit. Chances are that Petra is among them as well.

>>101747998
Full finetuning is a half-meme. There's easily more to be lost than gained from it in most finetuning scenarios.
>>
>>101748056
People say this about any finetune that gets posted as a force of habit.
It's just a popular model series. Happens to be the only one where they avoid finetuning on top of Instruct and aren't settling for LoRA cope
>>
>>101748082
Nah, it's just tiring to reply to the defense force. And not needed. Anyone can put 2 and 2 together and judge by themselves.
>>
>>101748108
>All they ever do is hint darkly and pretend it's obvious, while never explaining.
>Anyone can put 2 and 2 together and judge by themselves.
>>
>>101748118
>he doesn't know
>>
>>101748125
>he cares not
>>
>>101748093
>There's easily more to be lost than gained from it
Absolutely not, have you ever tried comparing 72b FFT vs rank 32 Lora? It's not even close sorry
>>
>>101748082
>free thing for no money
Nothing is free. You're donating them your attention and time, at the very least. At this level, end-users are pawns ready to be discarded as soon as the so-called finetuners find employment somewhere after "getting noticed" (by inorganically spamming their shit everywhere).
>>
>>101748118
I wonder why there's a reply guy defending shilling in this thread right now.
>>
who shills for shills? honestly
>>
>>101748134
Hi lemmy
>>
>>101748134
>attacks your post
nothing personal, kid
>>
>>101748136
Just making fun of him. He immediately did the thing anon predicted. Look at this anon, for example: >>101748134 going on about all the attention 'shills' soak up, when he's the one spamming whenever someone mentions a model.
Twice. TWICE he failed to pick up on the irony.
>>
>Maxime Labonne: BigLlama-3.1-1T-Instruct (experimental self-merge using Meta-Llama-3.1-405B-Instruct and created with Arcee.AI's mergekit)
>https://huggingface.co/mlabonne/BigLlama-3.1-1T-Instruct
hi please click link thank
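For the curious, the entire "creation" is usually just a mergekit passthrough config that stacks duplicated layer ranges, something like the hedged sketch below (the ranges are made up, not the actual BigLlama recipe; 405B has 126 layers):
[code]
# hypothetical passthrough self-merge; layer ranges illustrative only
slices:
  - sources:
      - model: meta-llama/Meta-Llama-3.1-405B-Instruct
        layer_range: [0, 84]
  - sources:
      - model: meta-llama/Meta-Llama-3.1-405B-Instruct
        layer_range: [42, 126]
merge_method: passthrough
dtype: bfloat16
[/code]
Run with mergekit-yaml config.yaml ./out. No training happens at all, which is the joke.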
>>
>>101748158
ironic
>>
>shills
And are these supposed """""shills""""" in the thread with us right now?
>>
>>101748167
it's like raaaiiiiaaaaaaain....
>>
>>101748180
Yes...
>>
Let's add an anchor for shilled models, like /aicg/ has for shilled bots. This should calm down the schizos over time.
>>
>>101748264
you know what happens when you give schizos an inch?
>>
>any model gets mentioned
>shill goes off on a 90 post meltdown
very organic sam
>>
>>101747334
>read the comments
Christ that sub got invaded didn't it? The posters didn't use to be this retarded
>>
>>101744363
What is the individual contribution of each of the 29 members of the organization involved with this model?
>>
>>101748322
Do you even know what the word shill means?
>>
>>101748322
>*the slightest push against astroturfing happens in the thread
>shill goes off on a 90 post meltdown
fixed
>>
>>101748298
There are inches for everyone involved if you ask me.
I would love to check the new stuff that came out in one post, and we can't really put tunes in the news section.
So yeah, anchor please.
>>
>posts the link to your model that you and your friends on discord "fine-tuned"
>suddenly multiple posts "oh it's soooo good, mmm, the finetuners did a good job!"
very organic
there was a time when I tested these shitty models (every single one was terrible), but now I've simply stopped using tunes from certain people, and I recommend you do the same
>>
File: file.png (20 KB, 738x155)
>>101747334
>>
Is Celeste 1.9 good? Thinking about switching from Nemo.
>>
>>101748421
No, even the guy shilling celeste says 1.6 is better than 1.9.
>>
>>101748428
Why tho? What's the difference?
>>
>>101748414
>OP
>>
>>101748432
Objectively, it's trained with more data, if you compare the READMEs.
Subjectively, I don't know because I didn't use either.
>>
>>101748412
Hi lemmy
>>
>>101748465
Are there any models trained on the gigabytes of horrible fanfiction in places like AO3? I feel like that would massively improve a model's ability to reference popular media.
>>
>>101748412
The models are free, who the fuck cares?
>>
Why does mmap seem to make everything worse all the time in every single case? Why is it on by default? Why does it exist at all?
>>
>>101748502
>Why is it on by default? Why does it exist at all?
Jart
>>
>>101748490
Ads are always free. I'm sick of having them constantly shoved in my face all the same.
>>
>>101748490
>The models are free, who the fuck cares?
Me. I only want to see organic posts in the thread.
>>
>>101748490
Me, I care.
>>
>>101748530
>ads
it makes zero sense to advertise a free "product"
>>
>>101748490
The 2 anons involved in a campaign to drive opinion away from any model that isn't what they're paid to post about are supposed to seem organic, but they hired a sperg who goes off on every single model mentioned that isn't what he's paid to shill, so they glow while they work and look like retards instead.
>>
>>101748564
what models are they paid to post about? who paid them? how would you know?
>>
>>101748564
>paid
>hired
nobody pays anyone except people who waste cash on cloud gpu rentals
>>
File: file.png (40 KB, 687x283)
>>101748455
>>
>>101748553
consulting, sponsorships, etc.
those are some of the ways to turn reputation/popularity into money
>>
>>101748490
I care. Spam of retarded models muddies the discussion and can bury really good ones in the deluge of trash.
>>
>>101748585
Show me one single sloptuner who leveraged their popularity (in a niche within a niche, I might add) for something profitable.
>>
>>101748597
Undi literally got a job from his merges
>>
File: file.png (73 KB, 990x454)
>>101748597
Pretty much all of them, if you read the readmes. Here's the miqu 120b dude.
>>
>>101748571
>what models are they paid to post about?
company name starts with an m and contains two vowels
>who paid them?
you know who
>how would you know?
curse of high iq
>>
>>101748617
It seems to me they sometimes get a random startup to help with compute expenses, but that's about it.
>>
File: file.png (147 KB, 501x840)
>>101748597
>>
>>101748636
holy shit over $330k for running someone else's software on someone else's cloud for a few hours
>>
File: 1690367687764.png (392 KB, 1338x1516)
>>101748597
I also remember Pygmalion getting like $10k from Spicychat in the early days. And Alpin is part of Anthracite. He probably got a couple more sponsorships since then, which explains the money to do full finetunes.
>>
>>101748564
Too contrived. What if they're just tired of seeing meme models from people known to care more about getting noticed than about the quality of their "products"? Just look at who the members of this Anthracite organization are. It's basically most of the SillyTavern discord regulars.
>>
>>101748636
Kek he's probably richer than all of the fags who called him retarded here
>>
>>101748618
>company name starts with an m and contains two vowels
moo??
>>
>>101748654
>$10k for Pygmalion
>$100k for Goliath
$1M for Magnum...
>>
>>101748648
>>101748667
>1% of 1$ is 1$
>>
>>101748648
>over $330k
Check your tokenizer nigga
>>
>>101748667
These two are not mutually exclusive. Retarded people get rich too; scammers are usually more successful in business than people doing honest work.
>>
File: file.png (159 KB, 506x947)
159 KB
159 KB PNG
>>101748679
>>101748667
>>101748648
>>101748597
>>
>>101748685
coping
>>
>>101748687
>I merge models for fun
The brightest mind in the fine-tuning community...
>>
>>101748690
>coping
do you actually live on this planet or did you spend your entire life locked in the basement?
>>
You really use the llama t-tokenizer? Talk full sentences?
>>
>>101748659
>It's basically most of the SillyTavern discord regulars.
You're making me feel bad for using SillyTavern.
>>
>>101748633
>the compute needed to merge a model with itself
>>
so did anyone with 2x3090 test the chink gptqv2 mistral "only 4% loss" in 35gb?
how fucked is it actually?
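If anyone does test it, a hedged sketch of how you'd typically load a GPTQ quant across the two cards with transformers; the repo name is a placeholder, and you need optimum plus a GPTQ backend installed:
[code]
# Hedged sketch: GPTQ model sharded across two GPUs via device_map="auto".
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "someone/Mistral-Large-Instruct-GPTQ"  # placeholder repo name
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    device_map="auto",   # splits layers across both 3090s automatically
    torch_dtype="auto",
)
inputs = tok("The quick brown fox", return_tensors="pt").to(model.device)
print(tok.decode(model.generate(**inputs, max_new_tokens=32)[0]))
[/code]
Perplexity or side-by-side gens against the unquantized model would answer the "how fucked" part.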
>>
>>101748702
If you are not successful, you are dumb. Making profit is a purely intellectual task. So this chum is much smarter than you, by the only measurable metric.
>>
>>101748502
Because developers are not allowed to disagree with them in public. You will be jarted and you will be happy.
>>
>>101748687
green tea is like the most boring noob one you can go on about, there are so many good ones
>>
>>101748180
yes, you are.
>>
File: OEMhYF15BjZUc7S0nN-u7.png (78 KB, 989x590)
>Any local is better than 3.5 Turbo by now

>Microsoft WizardLM-2-8x22B 11.7 %

>Mixtral-8x7B-Instruct-v0.1 20.1 %

https://huggingface.co/spaces/vectara/Hallucination-evaluation-leaderboard
>>
>>101748734
>https://en.wikipedia.org/wiki/William_James_Sidis
>works mostly menial jobs and lives almost in poverty
>https://en.wikipedia.org/wiki/Nikola_Tesla
>lived in poverty, feeding pigeons
they must have been very dumb
>>
>>101748734
hi Undi
>>
As always, jart is lying. First it was "the test failing have nothing to do with my PR" when it obviously did. Then the claimed "17% end to end improvement" ends up being the same or slower in benchmarks.
https://github.com/ggerganov/llama.cpp/pull/8878#issuecomment-2270546553
>>
>>101748846
>they must have been very dumb
They clearly were. EQ is far more telling of true intelligence than IQ, which is basically just an autism measurement.
>>
>>101748889
You already posted that in this thread.
>>
>>101748889
rent free
>>
>>101748845
Who
>>
There are literally HF orgs with members whose role is to shill their models, attack any other model maker, and shit on their tunes, regardless of whether the 'competing' finetune is just as good or better.

>>101748734
$3000 in revenue after 1 year is nothing, literally nothing. This is not a lucrative market. It consists of broke ass basement dwellers who are mentally sick enough to find joy in talking to a computer program.
>>
>>101748911
xD
whatever floats your boat Undi
>>
>>101748845
>Higher the hallucination rate, more soulful it is.
Unironically a useful reference chart. Gemini 1.5 Pro is near the top at 28.1% for good reason. If only it was a bit looser for lewds, then it would surpass Opus by miles. Getting access is easy too, but that's a bit too much remote talk for /lmg/.
Little weird small models are so common high up.
>>
>>101748636
>>101748687
Why do they all have tranime profile pics
>>
>>101748933
>This is not a lucrative market.
Now look at this: >>101748654
>>
>>101748845
glm4 9b is one of the worst models i've ever tried
>>
>>101748998
Careful. The anti-shill poster will accuse you of shilling literally every model but glm4...
>>
>>101748994
>single consultancy agreement
>10k for compute only
kek even the crypto grifters are business gods compared to these guys. it's just a hobby and a money sink if you are not a pro training base models from the ground up
>>
>>101748654
>llama2
>10k
that money dried up a long time ago
>>
>>101749036
Of course it did. It's probably the only sponsorship and consultancy gig they ever got. Everybody knows that reputation doesn't accumulate over time. That's why they're doing full fine-tunes on H100s now, they have no money!
>>
>>101749053
>>101749053
>>101749053
>>
>>101748687
>>101748636
>3328$
>1370$
WHOA, look, it's fucking crazy, they are ALMOST millionaires right now just from slop finetuning... damned ko-fi richfags
>>
Yep, no money involved. Nothing to see here. Just organic word of mouth.
>>
>>101749111
Pocket money until employment arrives, you know that.
>>
>>101748845
>Intel neural chat 7b v3-3
Omg how did it fly under the radar? It must be absolute dogshit if it scores this high on a benchmark.
>>
Does vllm gguf inference support CPU offloading?
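To be concrete, the knob I mean is cpu_offload_gb; no idea if it composes with vLLM's experimental gguf loader, but a sketch under those assumptions, with a placeholder path:
[code]
# Hedged sketch: vLLM loading a GGUF file with some weights offloaded to system RAM.
from vllm import LLM, SamplingParams

llm = LLM(
    model="/models/mistral-large-q4_k_m.gguf",  # placeholder path
    cpu_offload_gb=16,  # offload ~16 GB of weights to CPU RAM
    # gguf loading may also need tokenizer=<original HF repo> depending on version
)
print(llm.generate(["Hello"], SamplingParams(max_tokens=32))[0].outputs[0].text)
[/code]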
>>
>>101748502
Runtime for LLaMA 3 8b q8_0 on my desktop when generating a single token on an empty context:
-CPU, mmap: 1.985s
-CPU, no mmap: 6.816s
-CUDA, mmap: 1.748s
-CUDA, no mmap: 1.537s
So for the CPU backend the startup with memory mapping is much faster and for CUDA the difference is negligible.
You can also prevent paging via --mlock which in turn necessitates the use of memory mapping.
The memory mapping provided by Windows was kind of bad though I think.
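For anyone who wants to flip these knobs themselves, the relevant flags (same names for llama-cli and llama-server) look like this; model path is a placeholder:
[code]
# default: weights are memory-mapped from disk
./llama-cli -m model.gguf -p "hi"

# disable memory mapping: read the whole file into RAM up front
./llama-cli -m model.gguf --no-mmap -p "hi"

# lock weights in RAM to prevent paging; requires mmap, so don't combine with --no-mmap
./llama-cli -m model.gguf --mlock -p "hi"
[/code]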


