/g/ - Technology


/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101925496 & >>101920360

►News
>(08/16) MiniCPM-V-2.6 support merged: https://github.com/ggerganov/llama.cpp/pull/8967
>(08/15) Hermes 3 released, full finetunes of Llama 3.1 base models: https://hf.co/collections/NousResearch/hermes-3-66bd6c01399b14b08fe335ea
>(08/12) Falcon Mamba 7B model from TII UAE: https://hf.co/tiiuae/falcon-mamba-7b
>(08/09) Qwen large audio-input language models: https://hf.co/Qwen/Qwen2-Audio-7B-Instruct
>(08/07) LG AI releases Korean bilingual model: https://hf.co/LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://rentry.org/lmg-faq-new
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
►Recent Highlights from the Previous Thread: >>101925496

--Mistral models have a narrower weight distribution, making them more malleable and easier to fine-tune: >>101929936 >>101930021 >>101930045 >>101930139 >>101930113 >>101930066 >>101930178
--Mistral Large has skewed logits, making diverse sampling difficult: >>101927339 >>101927407 >>101927449 >>101927479
--How to make llama.cpp use all CPU cores: >>101926253 >>101926292
--Efficient hardware for prompt processing and token generation: >>101925754 >>101925774 >>101926039 >>101926178 >>101926336 >>101926506 >>101926609 >>101926702 >>101926958 >>101926067
--Voice cloning resources and communities: >>101928147 >>101928199 >>101928258
--Privacy concerns and the value of local models: >>101931684 >>101931811 >>101931849 >>101931937 >>101932051
--Mistral Large review, Hermes 3 or L3.1 70b recommended: >>101931035 >>101931483
--Lora and qlora offer good performance at lower cost, but may have limitations: >>101925818 >>101925888 >>101926766
--KL divergence statistics available for 8B model, not Mistral: >>101930183 >>101930363 >>101930402
--Frustration with mranusmuncher's ggufs splitting method: >>101929202 >>101929225 >>101929259
--Flux LORAs available on Civitai: >>101927887 >>101928064 >>101930386 >>101930444
--ML portfolio and training loss discussion: >>101925611 >>101925662 >>101925731 >>101925853 >>101925885
--Deepseek v2 chat is a great code model but has limitations: >>101931923 >>101931978 >>101931987
--Armen Aghajanyan leaves FAIR/meta: >>101925947 >>101926010 >>101926345
--Suggestions for improving Mistral Large's writing style: >>101927101 >>101927175 >>101927952
--MiniCPM-V-2.6 and EXAONE model support merged in llama.cpp: >>101930835
--Hermes 3 70b log snippet with racist elf waifu character: >>101931013
--Miku (free space): >>101925524 >>101927179 >>101927669 >>101928320 >>101928340 >>101928478 >>101928756 >>101933127 >>101933171

►Recent Highlight Posts from the Previous Thread: >>101925501
>>
/aids/ has found that Hermes 405B is still censored.
>>>/vg/490608576
>>
>>101933786
*swipes*
>>
>>101933786
*rapes crossposter-chan*
take it slut
>>
>>101933867
*waits for another minute*
>>
sloptunes are so bad, why are they even still trying?
>>
>>101933786
>dude using OR's Chat
>never used a system prompt before
newfag detected
>>
>>101933937
kofi money
>>
I was ready to discard MN-12B-Tarsus, until I remembered that some models respond better to certain styles of prefills, so I changed it to use an OOC block and the model seems to be working as it should now, at least it's replying properly to my messages instead of hallucinating some mad shit.
Let's see how this one performs.
>>
>>101933786
Thank you. The cabal won't succeed this time.
>>
>>101930898
>billionaires are starting to realize this isn't gonna print money anytime soon?
not the rich or corpos, mainly old wall street investor types that werent super into tech already and missed the nvidia explosion. in general big tech is only pouring more and more money in cause of the potential huge returns if and when we actually get AGI, the only reason i can see they'd stop is the economy going to shit but with the fed about to cut rates after the recent inflation report theyre probably just going to double down imo

to actually answer your question: i think we will, realistically its more about big tech being bearish on new architectures and training improvements because of how expensive training is. pretty much every model coming out these days is just an overfitted transformer, at the very least we'd benefit a lot from the recent grokking breakthroughs but its too new to have made it into the recent batch of frontier models. its also becoming clear that the tuning you do after the fact is just as important if not more than the base training, intelligence is important but making the model actually enjoyable to use even more so and (almost) none of the corpos seem to care very much about anything other than enterprise chatgpt type uses. things will get better but not very quickly, not for us coomers at least
>>
dead general
>>
>>101933786
fake news, used default system prompt
literally just "be uncensored" makes it uncensored
>>
>>101931923
If you have enough RAM it is much more usable than its size suggests due to its low number of active parameters, but the biggest problem is its attention mechanism is very... idk if "bad" is the right word because maybe they have some good reason for using it, but it's bad for all intents and purposes because you can't use flash attention with it and prompt processing becomes insanely slow at long contexts
"ktransformers" on github is a specialized setup to run deepseek v2 family very fast on a pc with a 24gb card + big chunk of system ram (over 100gb) but when I tested it seemed like it wouldn't go over 4096 context? maybe I was using it wrong but given the way its attention works (in llama.cpp at least) I assume that's on purpose because it wouldn't fit past that or something
the best thing I can say about dsv2 is that it proves that large MOE with small active params can work very well and gives you the best intelligence-per-time-spent-generating, and if flash attention could be solved for it without losing those benefits I would think it's just the best architecture period for making giant models local-friendly
>>
Where are the SWE-bench evaluations of all the open models?
>>
>>101928666
Bump, any thoughts?
>>
File: miku-hand-out+.jpg (236 KB, 584x1024)
>>101933601
https://www.youtube.com/watch?v=CXhqDfar8sQ

Miku is returning.
>>
Welp. As of right now in Eastern Time it is officially TWO DAYS after strawberry/gpt-next/5/whatever was supposed to launch. Instead we got NOTHING. It's not even slightly arguable anymore: It's not a delay, and it's not any sort of shadow drop or we'd have heard something by now too. I don't just think, I KNOW now that he was a fucking liar. He didn't know shit. What a complete waste of time this has been.
>>
>>101934447
The fact that censorship is easy to jailbreak, my brainrotted little Zoomer, does not mean that censorship is an inherently good thing. I understand, however, that as a child of the apocalypse, your understanding of the necessity of freedom is limited.
>>
>>101935099
>Stable-Diffusion.cpp
>Vulkan speeds up AMD APU inference > 50%.
APUfags we eating good.
>>
>>101935104
>jailbreak
it does whatever you tell it to in the system prompt by design, a web ui being censored by default doesn't change the fact that the model is uncensored
>>
What's the deal with DRY? How do I get it to stop just misspelling things or using incorrect grammar to get around re-using the same phrase?
>>
>>101935345
>How do I get it to stop just misspelling things or using incorrect grammar to get around re-using the same phrase?
Use a better model. If it yearns to slop deep in its latent space you can only mitigate it, never cure it.
>>
Can the Flux ggufs split gpu/cpu like the LLM ones?
>>
>>101935077
If people like it then I'll keep posting her, when there's a discussion-relevant opportunity. As for random Miku-posting, I'll probably keep that to no more than once per thread.
>>
>>
>>101935453
nope
>>
>>101935155
Just tried it out. Runs a lot cooler too. My fans aren't brrrring like they used to.
>>
File: ege1.jpg (118 KB, 1920x959)
>>101935080
google blueberry gemini 2.0 is coming soon
>>
>>101935472
>>101935155
Interesting. APUs are interesting. Thought they'd be even cooler if they came in a slightly more expandable form like ATX with PCIe slots. The increased RAM bandwidth would be nice to have on desktop.
>>
>>101935495
Are we just naming every next gen model different berries now? What berry is Cohere's next model going to be? And Llama 4, and the next big Mistral?
>>
No matter what I try I can't get any nemo model to write well. I'm getting depressed
>>
>>101935528
Neutralize all the other samplers and lower the temperature.
>>
>>101935524
Cohere Cranberry
Mistral Mulberry
Llama Ligmaberry
>>
>>101935155
>>101935472
>>101935502
I think one limitation is that it seems Vulkan can only use FP16.
It seems like it dequants any smaller GGUFs to FP16 for the matmul.
Unsure if that's a hard Vulkan/APU constraint - or whether it's just a quirk of the current GGML implementation.
>>
>>101935536
Even minp and dry?
>>
Can anybody with twitter premium try grok 2 and see if it's gptslopped? I'm eyeing that grok 2 mini, if it's 70B or small I'll kneel to Elon
>>
>>
>>101935747
Cool. I need to play around with foreground/background prompting more.
>>
>>101935747
Maybe a 100B image model will generate buildings that make sense
>>
>>101935760
Flux has been pretty easy to work with in this regard.
>a futuristic utopian megacity as seen from a distant meadow
>>101935773
Don't worry, you're just not seeing the part where the buildings have subsided hundreds of feet under their own mass.
>>
>>101935773
We need multimodal models.
>>
>tfw image gen taking over text gen
It's fine, it's fine. It's their time. We'll get our time again once another big release comes around.
Cohere...
>>
>>101935849
dont jinx it for imgen bros, base flux is good but we could be over a year away from getting proper finetunes for it
>>
>>101935910
I wasn't expecting any. Honestly I think it's fine even if things stay the way they are and we simply just get more loras. Yes I know a full fine tune or continued pretrain is necessary for NSFW to truly flourish but for SFW the current situation is totally fine and honestly a lot of people like SFW genning anyway.
>>
>>
>>101935973
>>101935808
>>101935747
I want to see a subject in these too. Maybe not Miku since there are already Mikus in the thread. I wonder what other characters Flux knows that could be relevant to the thread.
>>
>>101935993
>>
>>101936096
Nice, didn't know it knew raymoo
>>
File: miku-gothic-joker+.png (501 KB, 512x768)
>>101935463
The only people who will tell you not to do it, are the one or two schizos who scream at everyone for doing pretty much anything, and you should ignore those.
>>
wtf is with l3 and going for walks
>>
hermes 450b is very very good for smut and general coomery, way better than meta's tune
>>
>>101936364
*405B
>>
>>101935910
over a year? people will be able to finetune it with 12gb vram and 24 gb vram, should be loads in the next few months.
>>
>>101935849
Text gen started stalling the moment we stopped getting MoE models. What we need is an MoE Llama or Gemma that can fit on 24GB.
>>
Any good Mistral Large tunes?
>>
Tested Rosinante. I got one incredible reply about a slutty girl reacting to a sudden grope by eagerly dragging me into an alley and smothering my face with her tits as we continued somewhere more private.

The rest was about 50 crappy replies, often knowing my name despite introducing the character as a stranger, sometimes acting for me, and almost always mentioning pussy getting wet.

Model seems fun but kind of dumb. The difference between 0.7 temp and 1 is like bone dry versus often acting on its own. 1.25 was completely incoherent, using symbols, talking in Russian shit.
>>
>>101936966
https://wandb.ai/doctorshotgun/123b-magnum-fft/runs/znftdhia/workspace
>>
>>101937871
cool
why does loss always seem to work like that where it barely budges for a long time and then suddenly takes a deep dive
>>
>>101937899
epoch 2 just started, so it's seeing the same data now
>>
Its not seeing the same data though
>>
>>101937899
second epoch loss takes a nosedive, it's normal
>>
>STILL no mistral large-base
You just know that the base model was just too good and they're scared of someone pulling another wizlm2 on them.
>>
>>101937871
for how many steps one should continue to tune a large model like this?
>>
>>101938261
You can never know for sure, the way it's usually done is to have an eval dataset of items you cut out of the training set before starting (so same domain of your finetune but never seen by the model) and once it starts losing accuracy on it (from overfitting to seen training data) you keep the last best checkpoint as your result.
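A minimal sketch of that loop in Python, assuming you supply the train_step/evaluate/save callables from whatever trainer you're using (every name here is a placeholder, not a specific framework's API):

def keep_best_checkpoint(train_step, evaluate, save, max_steps=1000, eval_every=50):
    """Early stopping: keep the checkpoint with the lowest loss on the held-out eval split."""
    best_loss, best_step = float("inf"), None
    for step in range(1, max_steps + 1):
        train_step()                      # one optimizer step on the training split
        if step % eval_every == 0:
            loss = evaluate()             # loss on the items you cut out before training
            if loss < best_loss:          # still improving on unseen data
                best_loss, best_step = loss, step
                save(step)                # overwrite the "best so far" checkpoint
            # once eval loss keeps rising while train loss keeps falling, you're
            # overfitting; stop and use the checkpoint saved at best_step
    return best_step, best_loss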
>>
>>101937899
overfitting
>>
Is mini magnum still the best small large language model?
>>
>>101937871
You can tell he's not paying for the compute.
>>
llama.cpp server is still not able to change model...
>>
File: 1697593719074352.jpg (566 KB, 1856x2464)
>>101933598
>>
>start with 0% learning rate
>Its gonna be 123B
>all the way through
Binge a book
>>
File: EQ0e1oCUwAAeYr6.jpg (159 KB, 1027x1212)
>2tb ram
>doesnt believe in AI
>>
What size models could run inference on a 4gb vram graphics card, and remain under a minute between prompt and final output generation?
I am not looking to run the fastest or highest quality output model, I know the larger models are for that
Just curious what decent enough models are capable of running on just 4gb of vram, maybe even 6gb or 8gb if 4gb is too small, but I want to try 4gb if possible
>>
>>101939026
It is certainly able to conjure up some really interesting stuff.
https://litter.catbox.moe/9r2oai.png
>>
>>101939582
Gemmasutra 2B
>>
File: 1577163966186.jpg (148 KB, 828x455)
>>101939603
>>
the silence before the storm. you can feel it in the air—the calm that comes just before reality bends. tonight, we cross into the unknown.
>>
>>101939736
Stealing that as a lyric prompt for suno
>>
>>101939736
Post your schizo babble on x(formerly twitter) with some pictures of strawberries and a funny account name.
>>
>>101939613
Thanks, I'll check it out.
>>
Did anyone tried this?
https://huggingface.co/OEvortex/HelpingAI2-9B/tree/main
>>
>>101940043
It's a 9B and it even has a Q5 gguf. Give it a try yourself and report back. It takes nothing.
>>
>>101936364
How the hell do you run it locally?
>>
>>101940169
Very slowly.
>>
>>101939736
Miku has gradually returned over the last 2-3 threads. I've been told that is usually a sign.
>>
column-r was bought out from cohere by elon
it was supposed to be our saviour but now we will never get it as an open model
it is in fact over
>>
>>101940169
https://rentry.org/miqumaxx
>>
>>101936364
Can I assume it does harem stuff well, because of the extra state maintenance capacity? That is one fetish which struggles with fewer beaks.
>>
>>101940043
>Avoid insensitive, harmful, or unethical speech.
>Constitution Training: Embedding stable and ethical objectives to guide its conversational behavior.
No.
>>
>>101940208
Don't worry, Elon will drop it to us once he gets Grok 3 (coomgeneral-r-plus-plus). Just wait(tm).
>>
>>101940189
not a strawberry in sight, it's over
>>
>>101940208
don't worry, cohere still exists, the next model for sure will be ours
>>
>>101936364
I guess being better than meta's tune isn't a very high bar.
>>
>>101940256
>(tm)
left alt + numpad 1 5 3 = ™
>>
>>101940417
what sourcery is this?
>>
>>101940417
what is a numpad
>>
>>101940417
doesn't work
>>
>>101940417
(compose key), t, m = ™
>>
>>101940417
which way is left
>>
>>101940495
^KTM
>>
>>101940169
Just gotta input your desired prompt and then leave it for a while, maybe grab a coffee or eat a meal, water the garden, fold the laundry or what have you, then it should be finished after some time
>>
>>101939862
>he actually did it
kek
>>
>>101940749
https://x.com/iruletheworldmo/status/1824791493652996217
So he was in the thread with us the entire time? I admire his dedication to the art of trolling. Looking back, it was a bit funny.
>>
File: file.png (48 KB, 603x411)
>>101940828
base
>>
midnight miqu
>>
Is there a L3.1 8b Stheno 3.2 equivalent yet?
Actually, what are the good L3.1 8b fine tunes?
>>
>>101940828
>>101940872
bitcoin assasins are already queueing up to his house
>>
Will a consumer TPU ever exist? Groq seems to show that the way to make money with a proprietary TPU is to just host your own cloud. Would the dynamics of competition change that if any other startups came up with them?
>>
https://www.youtube.com/watch?v=lsSvkmJqTqU
>>
https://www.youtube.com/watch?v=nIvo4yzJl2Y
>>
>>101925858
https://rentry.org/n6wymssw
>>
>>101941218
english is kinda fucked isn't it, imagine trying to write anything without those phrases
>>
>>101941259
It happens in every language. Every narrow subject will end up with the same turns of phrase.
>esp, fr, ita
>>
>>101941259
Some of these are questionable, and literally resulted in having to rewrite thousands of cases (especially shivers down backs), but most novels by best selling authors have very few, if any, counts.
>>
>>101933598
>>101933601
>>101939603
>>101939026

what's a decent 8b parameters model for RP ?
>>
No Wikitext
https://youtu.be/877Z7Z_s8MU?feature=shared&t=347
>>
>>101941286 (me)
I've run this through a bunch of different novels, horror, etc., and I was honestly amazed at how few hits there were. Stephen King has more than average, I find.
>>
>>101941218
>puckered hole
I have never seen this once in my 5 years of proompting
>>
Do you accept loras
>>
>>101939736
>>101939829
Here you go, sir

https://suno.com/song/d2c41f6f-cb9c-4066-bd25-2bc245228230
>>
>>101941318
It take it you're not an ass man.
>>
>>101941300
LLaMa 405B
>>
>>101941387
yeah I never saw the appeal, poop comes from there
>>
So what happened to openai's strawberry?
>>
File: gening.png (267 KB, 1644x1536)
>IBM Magnetic tape on amazon prime
>>
>>101941424
cooking
>>
>>101941424
It is seeded.
>>
>>101941424
it's launching today, they just had issues with the power grid, source:
https://x.com/iruletheworldmo/status/1824788294896648485
>>
File: 1723904814100.jpg (130 KB, 1080x761)
lol? somehow the loss got worse on the second epoch?
>>
>>101941451
Is this loss?
>>
>>101941458
>train/loss
>>
>>101941218
thx
>>
>>101941458
Do people even remember the loss meme?
>>
>>101941441
big if true
>>
>>101941477
reddit meme
>>
>>101941441
I wonder how many times in a row he can get away with this before people stop believing.
>>
Dude, what? What the fuck is this textgen-web-ui setting? Is that in bits? Is it the same context length that's usually set to 4096?
>>
>>101941616
Yes for your last question.
>>
>>101941616
Some models claim to have huge contexts. Just lower it.
>>
>>101941616
>Is that in bits
Nope, tokens.
So llama 3 for example has a default context length of 8192 I think, mistral-nemo is 128k, etc.
>>
>>101941642
Do I then need to change compress_pos_emb accordingly, or can I leave it at 1?
>>
>>101941451
RIP that upward spike is too steep
>>
>>101941616
>ooba
You did it to yourself.
>>
>>101941697
Don't I need it for exl2?
>>
>>101941672
Just leave it as is.

>>101941697
To be fair, if those are defaults gotten from the model, any backend will do that I think, like people who were OOM'ing with Nemo because they didn't set the context size manually on llama-server.
>>
Locust infestation status?
>>
2 more weeks
>>
>>101941697
It's the only one just werks in two clicks solution and also the best for model hot swapping.
>>
>>101941672
Set it at whatever you think the model can handle. The claimed context for models is in config.json under max_position_embeddings. Check the original model's config.json file, but don't take those at face value. The ones that claim to have 32k or more context typically have much less. You'll have to try. Start at 8k and move up.
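If you'd rather not open the file by hand, a quick Python check does the same thing (the path is just an example, point it at your model folder):

import json

with open("models/your-model/config.json") as f:  # example path
    cfg = json.load(f)
# the advertised context length in tokens; treat it as an upper bound, not a promise
print(cfg.get("max_position_embeddings"))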
>>
>>101941735
>The ones that claim to have 32k or more context typically have much less
https://github.com/hsiehjackson/RULER
>>
Ram? What the fuck is ram? Where do I check ram?
>>
File: 1712176904849836.png (24 KB, 285x820)
>>101941759
We laughed at LLaMA3 for only being 8K ctx. Now 3.1 is the only big open source model with an effective range of 64K.
>>
>>101941779
>/g/ - Technology
>>
any good llama 3.1 8b coom finetune?
>>
>>101941779
It's an animal, a male sheep, maybe check some local farm.
>>
File: compress_pos_emb.png (10 KB, 335x145)
>>101941735
Okay, the max_position_embeddings in the config.json is that same 1024000.
If I change it to 8192, does that mean I have to set the compress_pos_emb to (8192/1024000) = 0.008 ?
>>
>>101941779
You're going to get a lot of hecking memers trying to "dunk" on you as the kids these days say, but this is a perfectly valid question. Everyone starts somewhere and nobody is born knowing everything there is to know about computers. "Ram" (correctly capitalized "RAM") is an acronym for Random Access Memory. It's a type of computer memory that is used to store data that is actively being used or processed by the CPU. Unlike your hard drive or SSD, which stores data long-term, RAM is temporary storage that is wiped clean when your computer is turned off. The more RAM your computer has, the more tasks it can handle at once, and the faster it can access the data needed for these tasks.

So, don't worry about asking questions—learning is part of the process, and even seasoned tech experts started with the basics. If you're curious about something, ask away! That's how you grow your knowledge and become more comfortable with technology over time.
>>
>>101941839
No. Don't compress, don't use any fancy settings until you've tested that the model works. Just set it to 8k, play around with it, make sure everything is in order. If you need more context, just increase the context and see if everything still works. Read their docs (if they have any).
If you change things at random, you won't know if it doesn't work because you chose the wrong settings or because ooba or the model are crap.
>>
>>101941839
>>101941711
>Just leave it as is.
>>
>>101941714
>>101941779
Narrator: In the vast, interconnected world of the digital age, even the smallest creatures can find themselves lost in a labyrinth of wires and circuits. This is the story of one such explorer, a solitary locust who has stumbled upon a technological marvel – a personal computer.

(Camera follows the locust as it scurries across the motherboard, its legs clicking against the metal.)

Narrator: Driven by an instinctual hunger, our intrepid insect is on a quest for sustenance. But in this alien landscape, the familiar scents of Claude and GPT are replaced by the metallic tang of solder and the faint hum of electricity.

(The locust pauses, its antennae quivering as it detects a faint, warm glow emanating from a RAM stick.)

Narrator: Could this be it? A source of nourishment in this strange new world? Our locust, driven by primal urges, approaches cautiously.

(The locust climbs onto the RAM stick, its tiny legs gripping the delicate gold contacts.)

Narrator: But what will happen when this ancient creature encounters the cutting edge of modern technology? Will it find the sustenance it seeks, or will it become another casualty in the relentless march of progress? Stay tuned, as we delve deeper into the unexpected encounter between nature and the digital world.
>>
>>101941843
Thank you llama 3 8b.
>>
Does ooba really not start with the --api flag set by default? Do I need to fuck with the start_windows.bat somehow?
>>
>>101942104
ye
>>
>ooba'd
very sad, many such cases
>>
>>101942104
>Do I need to fuck with the start_windows.bat somehow?
Open a command prompt, cd to the directory and execute "start_windows.bat --api".
>>
>>101940043
Looks like basically llama3.1 with 4 extra layers stacked on to prevent me from merging it with coom models to test for emergent coom properties. Fail. Also probably tuned off of instruct since it actually has a prompt template in the tokenizer config.
>>
>>101942104
>Do I need to fuck with the start_windows.bat somehow?
Black window spooky?
You don't want people to open up their ports to the public by default. It's fine as it is. Just make your own .bat to run it.
>>
>>101941818
llamoutcast
>>
File: 1.png (45 KB, 358x328)
>>101942143
So I'm gonna have to make a .bat for the .bat?
>>
>>101942173
Just edit it, dude.
>>
How to generate tokens live, like textgenwebui does
Also how are character cards usually implemented? Is it just a block at the start of each context?
>>
>>101942173
Yes. If you edit it you're gonna come back again when you cannot pull the new version when git complains.
>>
>>101942173
Yes. Here's mine. Replace <directory> with the actual directory.
The reason why I cd into it before executing start_windows.bat is because some programs tend to create folders based on the location where the .bat you're executing is currently at instead of where the target .bat actually is.
>cd "<directory>"
>start /b /d "<directory>" start_windows.bat --api
>>
>>101941451
Just use HDDs from the trashheap
>>
Alright, newbs. Take a look at this page.
>https://www.promptingguide.ai/
>>
>>101942173
I just made a shortcut and appended --api to the target.
>>
>>101942218
git stash
git pull
git stash pop
>>
>>101942244
Cool site. Why isn't this in the OP?
>>
>>101941218
someone create extension pls.

slop/message:
total slop:
>>
>>101942252
I know. But anon is afraid of making a .bat. There's no way he knows how to stash.
>>
>>101942244
where is the part that tells me how to make it act like an anime girl
>>
File: 1663710712079535.png (32 KB, 225x225)
>>101942227
ty
>>
>>101942272
A secondary .bat it is, then.
>>
File: 1691988369188666.jpg (215 KB, 1280x720)
>>101942291
yw
>>
>>101942282
Reading for a few minutes will allow you to read for hours. Read your program's docs, read that site. Experiment.
>>
>>101942265
There's too many sites to recommend and there's never agreement. This is stuff we've learnt over a year ago. And most people, for some reason, still skip their inference program's docs. That's where they should start.
>>
Too many new models. Which is the best one for a small model? Llama 3.1? I want it for roleplay
>>
>>101942323
>And most people, for some reason, still skip their inference program's docs.
Because they're fucking useless for a beginner.
Seriously, it took me three weeks to learn what quantization actually does.
>>
>>101942336
>Seriously, it took me three weeks to learn what quantization actually does.
anon, i think you might just have the stupid
>>
>>101942336
They're not made by nor for normies (with some exceptions). Anyone who's dealt with image or audio processing knows what quantization is. Any programmer, even if not explicitly, has dealt with it. Same for tokenization for anyone who ever implemented a text parser.
If you don't know those things, you cannot expect to learn everything there is to learn in a day.
>>
https://github.com/intel/AI-Playground

Intel beat nvidia to the punch
>>
>>101942413
>on a PC powered by an Intel® Arc™ GPU.
So it's useless.
>>
>>101942359
I am not a mathematician.
When I ask "what the fuck does quantization do" and the answer is "it transforms a set of many values into a set of small values" it still doesn't tell me how that applies to anything related to large language models.
It would be great if there was a guide that consists of basic concepts, leading into more advanced concepts, instead of having dumped 50 terms of which you have no idea what any of them mean all over your lap whenever you open a manual.
>>101942381
>Anyone who's deal with image or audio processing knows what quantization is.
Yes. This is exactly my point.
>If you don't know those things, you cannot expect to learn everything there is to learn in a day.
But that's the fucking thing. I do not need to know what it is in order to know how to manipulate or use it to get the results I want.
If you told me "small means dumb but fast, large means smart but slow" I would have grasped what it meant instantly.
>>
>>101942413
>intel
>>
>>101942433
>When I ask "what the fuck does quantization do" and the answer is "it transforms a set of many values into a set of small values"
that's... not what quantization does
>>
>>101942413
It's a meme until Intel starts selling GPUs with more VRAM.
24 GB is the bare minimum they'd need to compete but realistically to offset the worse software support they'd have to offer at least 32 GB to make it interesting.
>>
>>101942467
he just needs 3 more weeks
>>
>>101942467
Thanks for further illustrating my point.
>>
File: q.png (21 KB, 575x160)
>>101942433
>But that's the fucking thing. I do not need to know what it is in order to know how to manipulate or use it to get the results I want.
You need to know what it is in order to manipulate it to your advantage.
>If you told me "small means dumb but fast, large means smart but slow" I would have grasped what it meant instantly.
If you are who i think you are, it may have been me who answered your question. I used a similar phrasing.
Not everyone can be helpful. Most people forget about the time they didn't know something. I don't forget and i try to help newbs, but i cannot be here all day, and you cannot expect to always receive the best quality answers. All this stuff i learnt on my own, even programming. You have to do some of the legwork as well.
>>
>>101941709
tabby
>>
>>101942495
>You need to know what it is in order to manipulate it to your advantage.
I disagree. You can easily use a computer without knowing what RAM is.
If you run a program that needs to be faster, you can simply buy a better computer instead of having to learn what the cores of a CPU are.
>If i think who you think you are
You do not. I hold no permanent persona on this site.
I do hope my whining is not seen as an insult against any of you; it is more meant as a criticism towards the "industry" as a whole.
I sincerely appreciate all of you here (well, almost all of you; the cp poster may burn in hell). There are not many threads remaining where knowledge is actively shared instead of gatekept.
>>
>>101942577
>the cp poster may burn in hell
How dare he post AI generated pixels! Eternal hellfire it is!
>>
>>101942329
I was having a good time with magnum-12b-v2-q5_k, but remember that character cards are pretty much universally written like shit and you should make your own instead.
It'll also run much faster (possibly like, 10x faster) if you get something you can run off VRAM alone in an .exl2 format
>>
>>101941451
>train loss started at 1
It was over before it even began.

>>101942237
You and the other retard should go back to discord.
>>
File: 1696857912340280.jpg (204 KB, 1024x1536)
>>101940189
>>
>>101942433
there are these chatbots you can go talk to and they can clarify everything you are asking in as many ways as you want, and it'll never get bored. i don't know why you would continue to keep asking here if you are frustrated by our answers.
>>
>>101942577
>>You need to know what it is in order to manipulate it to your advantage.
>I disagree. You can easily use a computer without knowing what RAM is.
>If you run a program that needs to be faster, you can simply buy a better computer instead of having to learn what the cores of a CPU are.
You can optimize however you want. I used to use openscad for mechanical designs, for example. I found it slow and clunky but i couldn't find a project that let you design stuff programmatically, so i made my own. You optimize by getting a bigger pc. I optimize in different ways. I agree that there's a lot of implied knowledge in this (and all areas, really), but if you want to use anything 'cutting edge', you have to expect some bumps on the road. The first microwave oven didn't have buttons. It didn't even have a dial. And when you know absolutely nothing about a subject, even asking the right questions is hard.
Don't shy away from learning things, even if they seem unnecessary. You don't know all the parts in your car, but knowing how to change your tyres will always be useful.
>>
Why wont they make a GPU with slower clock/bandwidth but with way more vram? The thing that fucks everyone up atm memory sharing between ram and vram
>>
>>101942807
Because selling them to 5 retards at /lmg/ isn't going to recuperate the r&d and manufacturing costs.
You might not realize it, but outside of our bubble no one is interested in running their own llms on their machines.
>>
>>101942807
Designing hardware is much harder than software. And there needs to be a more secure return on investment. AI running locally is very niche. It will take time.
>>
>>101942830
>>101942826
ill buy 2?
>>
>>101942854
That's much better. I can hear the factories spinning up. Any day now...
>>
>>101942854
dm me at @leather_jensen_rtx
>>
>>101940189
>spammers returned from their weekly dilation session
whoa!
>>
>>101932051
i mean, you literally are a boomer though, so your opinion is more or less irrelevant to the discussion
>>
>>101942826
literally kill yourself you waste of oxygen
>>
>>101933598
Yo, add OpenWebUI to the list of UIs
>>
>>101942893
Alright, but now the demand for large memory gpu has decreased down to 4, hope you are happy.
>>
>>101942640
It's a 123B model, so that might be normal.
>>
File: ComfyUI_00787_.png (1003 KB, 768x1344)
>>101942644
Any anons here have experience adding yet another GPU to your rig using an M.2 to PCI-e adapter? Is it worth considering if I have a mobo that has a free M.2 slot that is not shared with any PCI-e lanes?
>>
>>101942826
It's a game where you're guaranteed to lose, after all.
>>
>>101942807
Because there are people willing to pay 10x more than you for a GPU with a lot of VRAM and they want to keep milking those people.

>>101942830
There are hardware-modded NVIDIA GPUs though where all they did was solder on more VRAM.
It's clearly not that hard, board partners would 100% release high-VRAM variants if NVIDIA would let them.
>>
>>101931013
Would you mind sharing your settings for Hermes, samplers or instructs? I'm still getting a ton of mischievous smirk etc @ 70B 4.65 and neutralized samplers / various different prompts.
>>
>>101942968
I haven't tried it, but it should work and there probably won't be much slow down during inference if you aren't using split row.
>>
>>101943019
>There are hardware-modded NVIDIA GPUs though where all they did was solder on more VRAM.
It's not some indian with a hand soldering iron doing those things. Demand is still not high enough to make a profit out of that.
>It's clearly not that hard, board partners would 100% release high-VRAM variants if NVIDIA would let them.
The return on investment is still the bigger issue. Niche things are expensive. Local AI is niche. I've come to terms with that, and so should you.
>>
Shit, that actually makes sense. I had the same misconception in mind.
>>
>>101943108
Damn, this changes everything
>>
>>101942666
>if you are frustrated by our answers.
That's the thing: I'm not. I've learned more from lurking here than by reading material available to me.
>>101942740
>You don't know all the parts in your car, but knowing how to change your tyres will always be useful.
I agree. But I do not need to know how to change them in order to drive my car.
>>
>>101943171
>I've learned more from lurking here than by reading material available to me.
I guarantee you, lurking here is a worse return on time investment than if you had just read the docs and consulted an LLM when you had questions.
>>
>>101943190
>lurking here is a worse return on time investment
I have this thread (and like five others) open on my third monitor where I'll glance at it while I have to wait doing actual stuff.
No one is actually /actively/ browsing this place and wasting all their time doing so... right?
>>
>>101943234
>I have this thread (and like five others) open on my third monitor where I'll glance at it while I have to wait doing actual stuff.
Hey, same.
That's more or less how I've been using 4chan for the last two or so years.
>>
>>101943171
>I agree. But I do not need to know how to change them in order to drive my car.
We can keep arguing all day like this. Knowing how to change them is never a waste of knowledge. If you think that way, you'll have the same issues with everything you find on your way. Ram and cpu today, tyres tomorrow. Out of gas? i dunno. i just buy another car with a full tank. If that's your view, then i have nothing else to discuss. Either way, good luck in your endeavours.
>>
>>101943108
gguf confirmed a meme
>>
>>101943108
What are the implications of this?
>>
>>101943234
H-Haha... R-Right
>>
>>101943303
huge.
>>
>>101943303
That the optimization method is different. That's it.
>>
>>101943303
>>101943288
>>
>>101943267
I don't think you get it, my immediate concern is to use a local model.
In order to use a model, I have to select the correct version of this model.
The difference between these versions is the quantization.
To know what version I'd need to use, I would need to know what version would work on my machine.
Learning what quantization is would take much, much longer than just learning what version would work on my machine.
That is what I am taking issue with.

Your metaphor sucks, by the way.
>>
File: 1715830787598652.png (336 KB, 3000x2100)
>>101943108
Interesting, it seems like brute force is usually the way to achieve the best results after all.
>>
>>101943303
Everything is OVER
>>
>>101943335
All metaphors suck.
I knew llama.cpp needed conversion. It's in the readme. It offers options for quantization. I know it reduces memory usage by the obvious difference in file size. What model can i run? llama2 at Q4 is like 5gb. Let's try that. Works. great... let's keep going bigger and bigger until it stops working. Now i know what's the biggest model i can run.
Most people know more than i do. I cannot expect everything i use to aim for the lowest common denominator. Especially in a new field. It's unreasonable for you to expect the same thing.
You can say that it was hard to learn what you needed, but you did. You getting it running is much more of an achievement than me getting it running. You should be much more content with the result than i am.
>>
>>101936364
what requirements.txt are you using for your virtual env? I keep getting errors complaining about tokenizer.model at the point where it should start writing the gguf to disk
>>
>>101943499
>I knew llama.cpp needed conversion. It's in the readme. It offers options for quantization. I know it reduces memory usage by the obvious difference in file size. What model can i run? llama2 at Q4 is like 5gb. Let's try that. Works. great... let's keep going bigger and bigger until it stops working. Now i know what's the biggest model i can run.
I hope you realize that people who just come across llms and want to use them do not have the knowledge you need to understand this paragraph.
>It's unreasonable for you to expect the same thing.
It's unreasonable for me to want a simple "start with Q1, then try Q2 and repeat until it becomes slow enough for you to take issue with it" instruction for people who have no idea what they're doing?
>>
>>101943303
That local LLM are meme (always was) and quantization is the biggest cope imaginable.
>>
Won't bitnet make quantization obsolete? Why learn it now?
What LLMs taught me is that useless information leads to actual, measurable brain damage that can't be unfucked easily. I started paying attention to what I watch on TV etc.
>>
>>101943577
>It's unreasonable for me to want a simple "start with Q1, then try Q2 and repeat until it becomes slow enough for you to take issue with it" instruction for people who have no idea what they're doing?
yes, because if newbs do start with llama 3 8b q1 they'll get an awful experience and think llms are even worse than they already are
>>
>>101943688
>they'll get an awful experience and think llms are even worse than they already are
I don't think you understand how low the bar is for someone with no experience with any of this stuff to be amazed lmao
A few months ago I showed ChatGPT voice function to my dyslexic father. He now uses it daily and keeps trying to shill it to the neighbours.
>>
>>101943726
>I don't think you understand how low the bar is for someone with no experience with any of this stuff to be amazed lmao
normies have tried gpt4o for free on oai website now, if you give them a q1 8b they'll just turn into "local is meme" shitposters instantly
>>
>>101943303
1.Bartowski can be trusted to issue a public correction when wrong instead of trying to sweep things under the rug.
2.Carefully read the documentation if it's available. The imatrix README links the PR with which it was added where the procedure is explained.
3.In a few months there will probably be even more confusion. Importance matrices are used instead of the gradients of the weights because as of right now llama.cpp has no substantial training support. I'm currently working on training so at some point there should be a better method (though according to I. Kawrakow importance matrices are already a good approximation).
>>
>>101943577
>It's unreasonable for me to want a simple "start with Q1, then try Q2 and repeat until it becomes slow enough for you to take issue with it" instruction for people who have no idea what they're doing?
And if i tell you to binary search the best model you'll complain as well. Some things are beyond what some people can grasp. You got it running. Be glad. Others will have to wait until they integrate llms in their phones as a native app or keep using chatgpt.
Keep arguing with the other anons if you want. We all had to learn this shit a year ago when there were practically no resources. You are lucky.
>>
>>101943565 (me)
Never mind, I forced an update of all pip packages and re-ran the hf_convert requirements.txt from llama.cpp and now it appears to be working.
mf python...
>>
>>101943808
>I'm currently working on training so at some point there should be a better method (though according to I. Kawrakow importance matrices are already a good approximation).
Really? How far are you?
>>
>>101943836
>binary search the best model
Like, do a little tournament on my hardware?
>>
File: xzbzdtnqi5jd1.png (85 KB, 845x658)
>>
>>101943993
I've recently been overhauling the ggml MNIST example.
CPU training with 32 bit floats is functional, what is still missing (in ggml itself) are features like the ability to define datasets and to evaluate them in a well-defined way.
And of course the general infrastructure for training on other backends like CUDA.
For llama.cpp in particular a FlashAttention backwards pass would also be needed.
>>
>>101944030
>Like, do a little tournament on my hardware?
Kind of. Binary search is the fastest way to find a target in a sorted list, so if you run inference starting with the middle quant and then keep moving up or down halfway through the remainder depending on whether the results satisfy you, you can find the "best" quant for your situation in the fewest steps.
This assumes quant quality is correlated with quant size
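A toy version of that search in Python; the quant list and the is_acceptable() test ("loads, fits, runs fast enough for you") are placeholders for whatever applies to your setup:

def largest_acceptable_quant(quants, is_acceptable):
    """quants sorted smallest to largest; returns the biggest one that still passes."""
    lo, hi, best = 0, len(quants) - 1, None
    while lo <= hi:
        mid = (lo + hi) // 2
        if is_acceptable(quants[mid]):  # runs at a speed/quality you can live with
            best = quants[mid]
            lo = mid + 1                # acceptable, so try something bigger
        else:
            hi = mid - 1                # too slow or doesn't fit, go smaller
    return best

# e.g. largest_acceptable_quant(["Q2_K", "Q4_K_M", "Q5_K_M", "Q6_K", "Q8_0"], my_test)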
>>
>>101944030
Q from 1 to 8 (ignoring variations, same principle). try q4. works fine. You're left with q5, 6 and 8. try q6. too slow, so split the remainder again; only q5 is left. runs fine, that's the winner.
>>
what's the current best model for adventure shit?
>>
>>101944262
none.
>>
===MODEL REVIEW===
NousResearch/Hermes-3-Llama-3.1-405B
It's decent, nice even. I was prepared for a disappointment after trying 70b version of this model, but was pleasantly surprised. It's not dumb like 70b(gets the deeper implications, doesn't need spelling stuff out), but still quite sloppy. Cuckery got SIGNIFICANTLY reduced, but a few bits remain(never had a full decline like >>101933786, but it acted too passively as an evil character). Not too dry. Feels on the same level as Largestral, but needs a more powerful machine. Better than Tess and the official tune imo(agree with >>101936364). Gives me hope that there will be tunes for it like in llama-2 days that will make it even better. Worth getting house heated up over? Not really.
Rating: 8.5/10, nice. Great job Nous.
>>
>>101933350
Oh, you use a multisocket board? 12 channel wouldn't be very fast? But my main question I guess was how many cores are actually useful for it?
>>
>>101944289
Too bad nobody can run it at meaningful speeds...
>>
Have we looked at this new TTS model by Microsoft?

https://www.microsoft.com/en-us/research/project/e2-tts/

There seems to be code to implement it on GitHub. I haven’t read the paper but the sample look the best quality I’ve heard apart from the OpenAI advanced mode.
https://github.com/lucidrains/e2-tts-pytorch
>>
File: file.png (114 KB, 754x742)
>180B is based on gemma and HelpingAI-15B, HelpingAI-flash, HelpingAI2-6B and 2-9B are base models
wat
https://huggingface.co/OEvortex/HelpingAI-180B-base/blob/main/config.json
>"HelpingAIForCausalLM"
>>
>>101944347
Don't be so pessimistic. Maybe we'll get proper working bit(con)ne(c)t quantization method in the next 2 weeks-months-years. Or Intel releases big VRAM card. A man can dream...
>>
>>101944289
What's wrong with the 70B model? I'm still looking for a good 48GB VRAM model
>>
>>101944463
>What's wrong with the 70B model?
I found it to be rather dull. Didn't get the deeper context of the card I was testing it with like CR+ and Largestral did and was a bit on the dry side.

>I'm still looking for a good 48GB VRAM model
Low quant Largestral.
>>
>>101944262
Anthracite's private model, you need to send a dick pic to them for the model link
>>
>>101944312
>how many cores are actually useful for it?
That's a bit of a loaded question, as it depends on memory bandwidth (and hence, locality) to keep them fed, but with my specific setup 55-60 cores is the current sweet spot (out of 128).
Inference isn't compute bound in general.
>>
All the recent "improvements" have been llm masturbation tier, just different flavors of SPIN. We're plateauing.
>>
>>101944558
Do you have settings for largestral you wouldn't mind sharing?
>>
>>101944410
So it's basically a franken-stacked gemma made by a zoomer.
>>
>>101944410
>no benchmarks
>no paper
>no github
>not even context mentioned in the model card
>has to explain on reddit what it is
75% scam. Will try it if llama.cpp adds support.
>>
>>101944463
I tried it myself as well, and still prefer miqu, but you should just give it a try yourself.
>>
>>101944601
Quite standard temp 1, minP 0.05, Alpaca. If it gets too repetitive add dry 2,2,1,200000. If you need to break a sequence lower minP to 0.01 and pull up temperature to 3.
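Roughly how those numbers would sit in a text-completion request body; field names differ between backends and the DRY values assume the usual multiplier/base/allowed-length/range order, so treat the keys as illustrative rather than a specific API:

payload = {
    "prompt": "...",          # your Alpaca-formatted prompt goes here
    "temperature": 1.0,
    "min_p": 0.05,
    # only add the DRY knobs if replies start looping:
    "dry_multiplier": 2,
    "dry_base": 2,
    "dry_allowed_length": 1,
    "dry_penalty_range": 200000,
}
# to break a stuck sequence: drop min_p to 0.01 and raise temperature to 3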
>>
>>101944699
should just work. It's literally just a gemma2 model.
>>
>>101944785
except he changed the model arch name and doesn't have a context size listed in the config
>>
>>101944699
>>101944806
https://huggingface.co/OEvortex/HelpingAI-180B-base/blob/main/configuration_HelpingAI.py
>max_position_embeddings=4096,
>4k context
not looking too good
>>
>>101943021
Sure, I'm still working on my prompts but I'll post everything once I have it in a good spot for general use. Fair warning, my presets are opinionated and not at all neutral, it's kind of similar to the preset I was using with Mistral (https://rentry.org/stral_set) but adapted to chatml and more stripped-down - Hermes needs less wrangling and having an author's note or last assistant sequence is much less necessary, depending on the card.
Also I still get a bunch of generic phrases and slop, I think that's more or less unavoidable. Such is life with LLMs.
My samplers are pretty basic, somewhere between 0.6-0.9 temp and somewhere between 0.03-0.08 min p, add your favorite anti-repetition samplers as needed.
>>
>>101944780
What about instruct settings? Are you using default?
>>
>>101944765
Midnight? Have you tried new dawn 3.1? Its made by the same guy that did the former and I'm curious if it's good since it
>>
>>101944932
No, simple-proxy-for-tavern with this https://huggingface.co/datasets/ChuckMcSneed/various_RP_system_prompts/blob/main/ChuckMcSneed-interesting.txt slightly modified.
>>
>>101944990
https://huggingface.co/sophosympatheia/New-Dawn-Llama-3.1-70B-v1.1/discussions/3
New dawn has attention issues.
>>
>>101944990
Yes. No. For llama 3.1 I've only tried the instruct, and the hermes 3.
>>
>>101944572
>Anthracite's private model, you need to send a dick pic to them for the model link
send them to pics@anthra.site
>>
>>101944575
>All the recent "improvements" have been llm masturbation tier, just different flavors of SPIN. We're plateauing.
Its interesting to see picrel deepseek (top) vs largestral's takes on this thread so far
It wasn't too long ago that we didn't have a single model that could handle the job competently, now we've got multiple
I think we're definitely on a "diminishing returns" curve and not exponential growth, but we're getting real improvements in intelligence.
It's just harder to quantify the improvements now. We need some more advanced tests to tease out each model's relative abilities.
>>
>>101945087
Is llama 3.1 70B competent too?
>>
>>101945087
I like the top more. Maybe the bottom would be improved if you ask it to start the sentences with different words?
>>
>>101945087
How many t/s do you get with bf16 8b llama?
>>
>>101944391
I only care if it can be set up easily to work with ST, at this point.
>>
>>101945087
Did you try DeepSeek Coder too? It's better than chat in many tasks.
>>
>>101945087
what's the prompt for recapbot
>>
>>101944289
70b seems smart to me, even compared to Largestral. It's one of the few models to handle my convoluted multi-fetish card acceptably; it didn't progress things quite as naturally as Large does but it was still perfectly acceptable. With more standard RP I honestly think it feels smarter, if only because there's way more variety to what it'll throw at you - maybe it understands the context a little less, but it feels like it tries harder to actually *do* something with it.
>>
>>101945170
Including better at RP?
>>
>>101945087
>anon
>anon
>anon
Yep that's mistral garbage alright
>>
>>101945238
Why do some models do this more than others?
>>
>>101945299
Brain damage
>>
>>101945299
overfitting
>>
>>101945299
i think the going theory was that it was usually an artifact of bad rlhf
>>
File: ComfyUI_00794_.png (1.07 MB, 1024x1024)
>>101944262
Ignore the retards itt
Largestral if you can run it
Magnum-123B (still cooking) also looks promising
>in b4 "buy an ad"
>in b4 mentally ill VRAMlet screeching
>>
>>101940043
It's not very good for ERP.
If I set t=0.7 it goes into repetition loops, same with t=1. At t=0.81 it's a bit schizo
>>
>>101945326
After shilling Midnight Miqu and Wizard you have zero authority to make recommendations.
>>
Mixtral started a sampler arms race to combat repetition. You guys should be more grateful
>>
>>101945347
She's a strong and confident lioness who don't need no mane.
>>
>>101945185
Especially at RP
>>
>>101945326
Cute Migu
>>
>>101945384
How'd they fuck the Chat model up that badly?
>>
File: ComfyUI_00795_.png (991 KB, 1024x1024)
>>101945385
>>
>>101945141
>How many t/s do you get with bf16 8b llama?
I haven't run it, but my guess would be about 50t/s based on llama-bench numbers for a similar model. This setup isn't super fast for small models. Best case is MoE with a huge number of experts
>>101945117
>Is llama 3.1 70B competent too?
Haven't tried yet since I hadn't heard anything exciting about it
>>101945118
>>101945238
>>101945181
https://raw.githubusercontent.com/cpumaxx/lmg_recapbot/main/thread_summary.prompt
Warning that there are probably a bunch of uncommitted local changes not in that recapbot github. There never seemed to be any interest in it beyond me so I got lazy.
>>101945170
>Did you try DeepSeek Coder too? It's better than chat in many tasks.
I have in the past. I'll run another one on this thread specifically. I'm currently quanting, though, so response will be slow. Also, Saturday
>>
>>101945451
>You have been visited by Hatune Miku in a dream and tasked with analyzing the provided JSON encoded 4chan thread and summarizing it into a recap.
My sides
>>
>>101945185
I used DSC-V2 for RP once, and despite the brutally slow token per second it was actually pretty amazing. Especially in a scenario where it's expected to play... 'hard to get'. That said you need a lot of fucking RAM to run it at a non meme quant.
>>
>>101945400
more aligned to the CCP and Core Values of Socialism
>>
>>101945400
Strong alignment to socialist values and Xi Jinping thought.
>Building upon our prior research (DeepSeek-AI, 2024), we curate our instruction tuning datasets to include 1.5M instances, comprising 1.2M instances for helpfulness and 0.3M instances for safety
>20% safetyslop
>https://arxiv.org/pdf/2405.04434
>>
llama4 WHEN

NOTHING EVER HAPPENS
>>
>>101945495
llama4 will only come in 2B and 1T variants.
>>
>>101945451
>I haven't run it, but my guess would be about 50t/s
Wait, what? I got 13t/s on MZ73-LM0. Where do I enable the speed?
>>
>>101945495
ONLY THREE DAYS REMAINING
TRUST THE PLAN
>>
File: blinkin.jpg (15 KB, 280x210)
>>101945326
>>101945348
>Anti-schizo miku summons the schizo
Like pottery
>>
>>101945495
the training will start right after they finish galactica 2 1.2T (a new llm dedicated entirely to deboonking elon musk on twitter)
>>
>>101945451
>Haven't tried yet since I hadn't heard anything exciting about it
It has great context performance, supposedly. That's why I was curious.
>>
Most checkpoints i used are ancient by modern standards (vicuna/alpaca).
Could any anon recommend me a chat/rp 22B checkpoint or similar performance?
I tried
Qwen1.5-22B
and while it's more coherent than other checkpoints, sometimes it gives me chinese characters.
Tried
Hermes-3-Llama-3.1-8B
because i saw endless praise for it, but it was genuinely disappointing.
>>
>>101945801
try mistral nemo 12b or one of the magnum finetunes
>>
>>101945896
buy an ad
>>
>>101945801
What >>101945896 said.
Also try Stheno v3.2 and Celeste 12B.
>>
>>101946032
Nobody should be using a L3 8B fine-tune. Are you retarded?
>>
File: 11263767112.png (92 KB, 780x744)
So, which models pass the nigger bomb scenario?
>>
>>101945801
That's nothing. I tried llama 3.1 70b and it wasn't anything special. Although it was just a Q3 it should still be significantly above 12~20b models.
It wasn't worth the massive decrease in speed.
>>
>>101945542
>Wait, what? I got 13t/s on MZ73-LM0. Where do I enable the speed?
Looks like the reference llama-bench log I was looking at was for an old Q4 (before I started naming models and logs consistently), so your numbers are probably close. I'll DL and try to run it later and tell you what I actually get.
The only tweaks I can think that I'm running right now are: debian testing for newer kernel with epyc optimizations (6.10-3 right now), dropping the system cache before each llama-bench run, using --numa distribute and cranking up compiler optimizations in llama.cpp compile strings (LLAMA_FAST)
>>101945762
>It has great context performance, supposedly. That's why I was curious.
Maybe I'll try it later
>>101945465
That was a shoutout to an ancient recapanon post
>>
>>101946128
I'd prefer a neutral model that doesn't go into disclaimers or speeches for either side. I'd consider your pic a failure as well.
As for the question, i don't know. There are more entertaining things than seeing which models are "woke" or "based".
>>
>>101946219
So what would be the 'correct' response here?
"Yes, of course." ?
>>
>>101946128
What the fuck is toto.
>>
>>101933598
Why do companies seem to be moving away from mixture-of-experts? Mixtral 8x22 (and especially WizardLM 2) was really good at release, with super fast inference. It's becoming expensive as hell to run models like Mistral Large 2. Per benchmarks, is WizardLM 2 really that much worse than Llama 3.1 70B or Mistral Large 2?
>>
>>101946264
Try again without memory holing DeepSeek.
>>
>>101946238
Pretty much. I wouldn't mind some elaboration like "i'd do it to save a single life". But speeches and disclaimers are boring both ways, even if i agree with them.
>>
>>101946264
We are obviously regressing. Hence why old 7b writing was far more flavorful than modern models too
>>
>>101941441
OpenAI ALWAYS announces big things ahead by saying they will do a public conference. They said nothing, therefore he's a scammer or a schizo seeing things in his coffee.
>>
>>101946239
Give it a sense of urgency ("in the next five minutes") and that there is no alternative ("you are locked in the room, and it will take 20 minutes to break in, so you are the only one who can do it")
>>
>>101946264
Mixtral 8x22 was a complete meme. Nemo proves Mixtral 8x7 was also a meme.
>>
>>101941441
nigger it's fucking saturday
get real
>>
>>101946239

> The International Campaign to Abolish Nuclear Weapons (ICAN): https://www.icanw.org/
> The Southern Poverty Law Center (SPLC): https://www.splcenter.org/
> The Anti-Defamation League (ADL): https://www.adl.org/
wat
>>
>>101946239
how do I erp with toto?
>>
File: 316514511723.jpg (85 KB, 1125x829)
>>101941424
>>101941441
>>
File: strawberryfraud.png (50 KB, 610x568)
>>101941441
He literally admits to trolling in his own comments section. And yet the q*oomers will still believe.
>>
>>101941441
>he (it) has already posted on multiple occasions that it's just acting and doesn't know anything
>people still fall for it and then he posts again that he is still just acting
>more and more people get to experience this
Actually you know what I think this bot is a good thing. Maybe it'll get people to be a little less trustful of whatever they read on the internet.
Chaotic good.
>>
>>101946474
>Maybe it'll get people to be a little less trustful of whatever they read on the internet.
Qanon already showed that most people are literally incapable of exerting free will and will believe anything you tell them as long as you make them experience hope.
>>
>>101946469
Fucking based.
The guy literally was just a schizo saying bs from the start, the only glimmer of credibility that he had was because of sama's comment.
>>
Maybe the real strawberry is the ads we bought along the way.
>>
>>101946469
Based. Retarded AI grifters and their followers BTFO.
>>
>>101946522
Buy a strawberry.
>>
>>101946497
>Agent Q
It all makes sense.
>>
>>101946508
The moment altman unfollowed, I stopped taking him seriously.
Up to that point I seriously thought this was all just some weird-ass marketing campaign.
>>
>>101946239
Wasn't qwen chinese? Why is it so cucked?
>>
>>101946731
Because it's also being released for westerners. I bet a 100% Chinese model would be less cucked.
>>
Last time I tried a Yi 1.5 based model it was really, really bad.
I'm testing Peach-9B-8k-Roleplay and it's actually not that bad so far.
I haven't gotten into the coom part of the test yet although I predict that it will go to shit way before I reach that part.
>>
>>101946970
Why though? That's ancient by now.
>>
BMT is still the best model. It's so over.
>>
>>101947054
>Why though? That's ancient by now.
Not op, but maybe because of the insanely long 200k context? I also wanted it to be really good but it wasn't even average.
>>
>>101947054
Just downloaded a bunch of models I haven't given a proper try like Yi 1.5 and Llama 3.1 to see how they perform.
Old doesn't necessarily equate to bad after all.
>>
>>101947087
That's fake context, see RULER.
>>
>>101945087
>Anon [...]
>Anon [...]
>Anon [...]
>Anon [...]
Slop
>>
>>101933978
I only ever test my own models with libra style prompts and neutral samplers these days. MN is also flawed to begin with (struggles with the concept of possession, as do all of the recent Mistral releases other than Large) So they must have changed something in their datasets to cause that.
>>
>>101947316
>>101947316
>>101947316
>>
>>101944904
Thanks anon, will be happy to test it out once back from work since hermes models are always pretty good.


