/g/ - Technology


/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101069457 & >>101058366

►News
>(06/18) Meta Research Releases Multimodal 34B, Audio, and Multi-Token Prediction Models: https://ai.meta.com/blog/meta-fair-research-new-releases
>(06/17) DeepSeekCoder-V2 released with 236B & 16B MoEs: https://github.com/deepseek-ai/DeepSeek-Coder-V2
>(06/14) Nemotron-4-340B: Dense model designed for synthetic data generation: https://hf.co/nvidia/Nemotron-4-340B-Instruct
>(06/14) Nvidia collection of Mamba-2-based research models: https://hf.co/collections/nvidia/ssms-666a362c5c3bb7e4a6bcfb9c

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
►Recent Highlights from the Previous Thread: >>101069457

--Papers: >>101080491 >>101080617 >>101080830
--Updating LLaMA CPP Python Repository: To Update or Reinstall?: >>101077391 >>101079473
--Logs: Sonnet 3.5 Surpasses Claude Opus in Moral Lecturing Correction and Akinator Tests: >>101079178 >>101079201 >>101079365 >>101080108
--Wiz8x22's Partial Uncucking Control Vector: A Step Towards Freedom of Speech: >>101074772 >>101074969 >>101075055 >>101075630
--New Hermes Model Released: Hermes 2 Theta 70B: >>101073994 >>101074011 >>101074046 >>101074098 >>101074185
--Mikubox Conversion and Command-R+ Performance Testing: >>101073744 >>101073885 >>101073982 >>101074474 >>101077433 >>101077777
--Improving AI Model Coherence with Rules Blocks: >>101069688 >>101069906 >>101069985
--Good OCR Models for Manga Translation: >>101079755 >>101080120 >>101080254 >>101080525 >>101080485 >>101080531
--DeepSeek-Coder-V2-Instruct Template for ERP: >>101071137 >>101071250
--Control Vectors for Retards: A Guide to Using Them Correctly: >>101078337 >>101078357 >>101079111
--Convenient Dropdown System for RP Clothes Selection and Scene Settings: >>101071966 >>101072080 >>101072151
--LiveBench: Comparing AI Models Across Performance Metrics: >>101074074 >>101074158 >>101074190 >>101074212 >>101074230 >>101074249 >>101074817 >>101075880
--Logs: DeepSeek-Coder-V2-Instruct Q4_K_S Nala Test Performance Discussion: >>101070306 >>101070845
--Anthropic's Claude 3.5 Sonnet: A New Contender in AI Model Performance: >>101069634 >>101070454 >>101072805 >>101072635 >>101072714 >>101072728 >>101075129
--Anon Questions Llama 3's Alleged NSFW Filtering: >>101070409 >>101070433 >>101070523 >>101070826 >>101075916
--3.5 Sonnet: The New King of AI Models?: >>101072633 >>101078131 >>101078562 >>101072693 >>101072748 >>101075267 >>101079435 >>101076701 >>101076784
--Miku (free space): >>101070155 >>101070209 >>101072366 >>101072652 >>101075119

►Recent Highlight Posts from the Previous Thread: >>101069467
>>
Hermes worth it? Teknium said he wouldn't put it on lmsys arena because it's too much work, which makes me think I can ignore the release.
>>
>>101082048
>Teknium said he wouldn't put it on lmsys arena because it's too much work,
That's a codename for "my model is shit lol"
>>
>>101082048
>keknium
>>
>>101082048
>beat llama3 on all benchmarks!!!
>nah whatever it's not worth it
yeah sure
>>
>>101082048
nah i'd win
>>
>Run the strongest open-source LLM model: Llama3 70B with just a single 4GB GPU!
>The strongest open source LLM model Llama3 has been released, some followers have asked if AirLLM can support running Llama3 70B locally with 4GB of VRAM. The answer is YES. Here we go.
>Please note: it’s not designed for real-time interactive scenarios like chatting, more suitable for data processing and other offline asynchronous scenarios.
https://ai.gopubby.com/run-the-strongest-open-source-llm-model-llama3-70b-with-just-a-single-4gb-gpu-7e0ea2ad8ba2
>>
I forgot about x.ai, do you have more hope for them or Meta? How censored is Grok?
>>
>>101082164
ol
mao
>>
>>101082048
>Hermes 2 Θ uses ChatML as the prompt format
>Hermes-2 Θ is a merged and then further RLHF'ed version of our excellent Hermes 2 Pro model and Meta's Llama-3
>One tip though, because of the merge, add <|eot_id|> to your stop tokens in LMStudio and GGUF inference engines, it sometimes outputs this token as an artifact of llama-3 instruct.
DOA. If they knew they were going to merge it with Instruct, why use a different prompt format? They won't put it up on arena because it looks good on benchmarks but performs worse, especially in prompt adherence.
>>
local needs to focus on waifus and coom. And someone can make a plugin/frontend/whatever that will augment your local waifu chat with messages from APIs, which will receive an obfuscated version of your chat with only the relevant SFW parts in it.
>>
>>101082246
Lol wtf. What a joke.
>>
>>101082164
Generate your will for your grandchildren by running one layer at a time!
>>
>>101082238
lmao
>>
So anyone figure out if S quants are really better than M and L quants?
>>
>>101082287
Just send the prompt to Rajeesh on Fiverr, this little trick will augment your local waifu for sure sir.
>>
>>101082238
No hope at all. Musk is a cuck that shills for regulations. Grok is a shit model. https://huggingface.co/xai-org/grok-1 try it if you can run it.
>>
>>101082346
ks is better than base m but worse than km
>>
>>101082368
>base m
That exists? I haven't seen any before.
>>
>>101082361
1.5 seems much better though, and aren't they going all in now, buying as many Nvidia GPUs as possible?
>>
>>101082164
I wonder what's the slowest t/s one could get
Nemotron 340B at FP16 using HDD swapping would get around 0.0004 t/s
>>
>>101082376
nobody uses it any more because it's shit
>>
>Three days and chameleon still isn't supported in llama.cpp
It's over.
>>
>>101082389
Put a 56k modem over a dodgy line in the middle and nfs or sshfs.
>>
>>101082246
Realistically, the guy is a retard crypto bro.
But you could make the point that choosing a different prompt format allows you to circumvent the alignment of the standard format.
>>
>>101082358
bhai thank you brother, stay strong and good day sir
>>
What are the best finetunes currently? Don't care which ~70b base model, as long as it's good. Also someone revive wizard please, the only open ones actually capable of finetuning
>>
>check turbocat's hf page
>new model he uploaded for someone
>https://huggingface.co/turboderp/llama3-turbcat-instruct-8b
>see the images
Is this what happens if you ingest too much rp data?
>>
>>101082832
>chinese support
The chinese have bought out exllama2. It's over.
>>
3.5 Sonnet is too good anons, I think they might have done some witchcraft. Maybe it's related to that paper about understanding the model features?
>>
>>101082832
Sounds like there will be a Qwen 72b version that is supposedly better than the old 70b version, even in English. I'll try it; hope it won't answer in Chinese from time to time
>>
Me trying to find the perfect quant+inference server combo:

>EXL2, best performance if enough VRAM, flexibility for any bpw

-TabbyAPI: up-to-date with exllamav2, batching, but gives me errors in some models about Chat endpoint not having a prompt template? (skill issue?). Support for Q4, Q6 and Q8 cache.
-Textgenwebui: Always works, no errors about prompts or anything. Slow updates. No batching. Support for Q4 cache
-Aphrodite: No support for Q4 cache. Batching. Slower updates.
>>
>Gigabyte MZ73-LM0 (rev. 2.0)With 2x AMD EPYC Bergamo SP5 ZEN4 9754 CPU Processor
>US $9,750.00
Any precautions I should take when ordering one of these?
>>
>>101083080
Not ordering it
Just get a normal mining rig
>>
>>101082958
>TabbyAPI
That error just means there isn't a template for chat completions in the template folder; models come with a default template in their tokenizer config.
It's irrelevant if you're using text completions. If you're using chat completions, you can ignore it most of the time, though some mistral models and CR/CR+ will throw an error if you try to send a system prompt using their default template.
>>
I don't understand why people buy the hotz tinyboxes, what's hard about inserting a few gpus into a Mainboard?
>>
>>101083154
The mainboard shitting itself for whatever reason.
The same reason people buy apple shit, because you don't need to think about it too much.
>>
>>101083110
I don't have enough power for a mining rig.
>>
From my experience, for 24gb vram, mixtral performs better at 3.5bpw exl2 than 70b models at 2.25bpw.
Plus with mixtral, you can run Q5 GGUF at a decent speed (about 8t/s).
I think mixtral is still the best for the quality/speed tradeoff (for 24gb vram).

When's the updated mixtral though... @MistralAI
>>
>>101083121
Using text completions directly against tabbyAPI from ST works great.

The problem shows up when using other software like Jan or OpenWebUI, which I believe use the chat completions api. I had this problem with Codestral recently. I think with llama3 it worked fine.
Anything I can do to fix it?
>>
>>101083202
>Plus with mixtral, you can run Q5 GGUF with a decent speed (about 8t/s).
That's what makes the most sense to me. Nab a GGUF with at least 5bpw that you can offload about 80% to vram with full context and go wild.
You most likely get over the magical 5t/s margin, get lots of context, and get what's probably the best quality to speed ratio you can with the hardware you got.
>>
>>101083208
Set chat_template inside the tokenizer_config.json.
Needs to be Jinja2.
See https://huggingface.co/Qwen/Qwen2-72B-Instruct/blob/main/tokenizer_config.json
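If you don't want to hand-edit the JSON, a minimal sketch with transformers does the same thing; the ChatML template string below is just an example assumption, use whatever format the model was actually trained on:

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("path/to/model")
tok.chat_template = (
    "{% for message in messages %}"
    "{{ '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>\n' }}"
    "{% endfor %}"
    "{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}"
)
tok.save_pretrained("path/to/model")  # writes the template back into tokenizer_config.json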
>>
>>101083208
Just checked the Codestral config, it doesn't include a template. According to the model card it uses the usual Mistral one, so grab the Mixtral template here:
https://github.com/theroyallab/llm-prompt-templates
Put it in your templates folder in TabbyAPI, and change your config so it loads that.

Alternatively, use the gradio loader for TabbyAPI, it has an option to choose a template:
https://github.com/theroyallab/tabbyAPI-gradio-loader
>>
>>101082832
Hey, that one can reply properly to my Game Master card, in fact the reply structure is pretty much the same as Stheno's.
That Spellbound one some dude posted a couple of threads ago wouldn't reply to my query and instead just start narrating an adventure on its own for whatever reason, so that's dope.
I should try L3 8b instruct again to see if it'll just EOS immediately like it did when I tried last time, maybe something changed in the meantime that will make it behave differently.
>>
graaah i just wanna run chameleon
>>
>>101083421
2 more open llama.cpp issues
>>
Is the near future of frontier AI training on specialized data, and will that lead to specialized, separate models? Now that most of the internet has already been used, does training on specialized data make models generally better at all types of tasks? serious replies only
>>
>>101082832
/lmg/ memed about character fidelity a while ago. I remember discussions about asking medieval characters programming questions
>>
>>101083355
I will wait for the Qwen2 72b version and see how it is. The old cat version outslopped GPT4, which is an achievement on its own, but really unusable for me.
>>
>>101083498
yeah

this is actually important for streamed STT setups, where random unintended things can be "heard" by AI, and it shouldn't attempt to respond to everything as an assistant.

>>101083535
you mean cat llama3 70b?
>>
>>101083328
thank you so much. First time /lmg/ has been helpful
>>
>>101083559
yes, while it did stick to the system prompt, the journey it took to shiver my spine in a safe way was something.
And when you try something grimdark and everything ends in a happy ending, you go insane pretty fast.
>>
>>101083080
>Any precautions I should take when ordering one of these?
It's a server. So it's not as plug-and-play as a desktop. Like you'll probably find yourself diving into the UEFI console and manually configuring the system drive. Also you can't be a wintoddler because AFAIK Epyc windows compatibility is rather spotty. They're made almost exclusively for linux applications.
>>
what model are you retards using for RP now? Would be good if it had some general knowledge. I'm trying dolphin mistral 7B but it's only good at being racist. It doesn't want to be a cute girl
>>
>>101083762
If you're running that shit you should probably be running pooopy purpose 8b instead.
>>
>>101083328
How can I use the parallel batching in TabbyAPI? I have 4x3090's but if I send a query from, for example, silly tavern and another UI, they get put in a queue; they don't run in parallel.
The tabbyapi readme says it supports parallel batching via paged attention, but how do I enable it?
>>
>>101083771
is it bad?
>>
I finally upgraded to 96GB VRAM on a 128GB RAM machine and will be stuck at this amount for the foreseeable future. Are there any other SOTA models/quants I can now run for roleplay / choose-your-own-adventure stuff besides CR+ and Wizard?
>>
>>101083762
use stheno 3.2

>>101083844
yes
>>
>>101083892
No. We peaked with CR+
>>
>>101083787
No clue, never used it.
After checking the config, setting the cache size to be a multiple of max_seq_len should work.
So, 2 * max_seq_len for two clients at once. Batch size will be automatically adjusted.
>>
>>101083668
That's fine, I have an old poweredge and an old fujitsu server so it probably won't be as horrible as getting Windows working properly on those. I'm mostly worried about buyer protection and shit.
>>
>>101083954
I haven't checked this thread for a long while, I remember when pickles were still a thing. What's this model-00001-of-00004 mess and can I easily load it to kobold or do I need to convert this somehow?
>>
>>101084213
ok, found it myself. Checkpoint shards. No idea how to convert though. Will try to load into kobold somehow
>>
>>101084213
>>101084232
Do you mean koboldcpp?
If so, look for the gguf version on huggingface.
>>
>>101084252
yup. Found it, thanks
>>
>>101083080
>EPYC
I assume you've seen https://rentry.org/miqumaxx
You have questions beyond that?
>>
>>101084117
TIL the cache size is the amount of total data you can process and it depends on the max sequence length (that's the context size, right?).
That worked, i can inference 2 at the same time.
What does batch size have to do with all of this? I think it defaults to 2048?
>>
>>101084117
how does one calculate how many clients can be inferencing at the same time?
>>
hmmm, not sure what to make of this.
sonnet 3.5 actually gave me working code after just throwing the poe documentation and my token at it.
i have a working werkzeug python server that uses the openai format so i can connect silly to my poe python server. that's pretty insane.

it just got it after making a working prototype:
>make this as an openai format compatible api.
>add /v1/internal/model/info
i'll never use cloud models for actual RP but that is sick. didn't see that coming.
actually fixes errors instead of running in loops.
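for the curious, the bare bones of that kind of shim looks roughly like this (flask rides on werkzeug; the poe call is stubbed out and the endpoint shapes are just the usual openai-style ones silly expects, so treat it as a sketch, not the anon's actual code):

from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/v1/internal/model/info")
def model_info():
    # ST's textgen-webui connection polls this for a model name
    return jsonify({"model_name": "poe-proxy"})

@app.route("/v1/chat/completions", methods=["POST"])
def chat_completions():
    messages = request.json.get("messages", [])
    reply = forward_to_backend(messages)  # stub: the real version would call the poe API here
    return jsonify({"choices": [{"message": {"role": "assistant", "content": reply}}]})

def forward_to_backend(messages):
    return "placeholder reply"

if __name__ == "__main__":
    app.run(port=5000)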
>>
>>101084201
>I'm mostly worried about buyer protection and shit.
I bought mine from an ebay seller in China and got support from both the seller and Gigabyte on the MB. It was new in the box and Gigabyte offered to RMA it for me even.
Biggest thing is to make sure you populate all memory channels
>>
>>101084483
yeah sonnet 3.5 is a big step up in making these models actually helpful. I was very impressed with it playing around with some basic stuff yesterday. I wish they gave any sort of insight into what they did with it because it's a huge step up from OG sonnet in terms of consistency and usefulness
I think it's neutered for ERP though, way harder to get it to be explicit in the way that older claudes would be, even extensively prefilled and jailbroken
>>
>>101084530
yes, having the word explicit in the instruction is enough to make it refuse.

>>101079178
same with "girl" in a simple prompt. i rerolled multiple times.
shame but it's actually really good.
>>
>>101084604
damn, wrong post.
meant this one:
>>101080108
>>
>>101083762
I use Stheno 3.2 8B, it's probably not the best since it's 8B but more than enough for me desu even compared to GPT-4o API.
>>
>>101084306
By default, cache size = context size. So batch size is 1 by default. I don't think you need to set batch size manually, just setting cache size properly should be enough.

>>101084387
Use a VRAM calculator (like the one in the OP), substitute context size for what the cache size would have to be.
So 8k context for 4 clients = 32k cache. But you would need enough VRAM for 32k context.
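napkin version of the same thing (key names are per tabbyAPI's sample config as far as I know, so double-check against your version):

# total KV cache to budget for N parallel clients, each with its own full context
clients = 4
max_seq_len = 8192                  # context per client
cache_size = clients * max_seq_len  # 32768 -> set this in the config
# VRAM-wise you need roughly what a single 32k-context run of the same model would take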
>>
>>101083762
I'll vote for >>101084700 too.
At face value, >>101082832 seems to be quite decent too.
iterative-DPO can be good depending on your tastes, so give that a go also.
If you want bigger models, mixtral 8x7b limarp zloss, commandR and maybe Qwen2 57B A14.
For going larger than that, miqu 70B seems like a safe bet from the opinions I've read.
>>
>>101083421
What for?
>>
>>101083535
Won't be very different with a different base model, though I like llama cat
>>
File: MikuUnderground.png (1.06 MB, 1102x706)
I came up with a new impossible challenge for every model I have access to: I pasted in the title of every anime series I have on my fileserver and asked it to list every character from those series that has a name that starts with a given letter.
Every single one will start listing characters from series that aren't on the list, even when instructed not to upon penalty of death. Some of them even ignore the first-letter instruction and just start returning popular characters from random well-known series.
>>
File: who the fuck is miku.jpg (108 KB, 768x1024)
>>
>>101084750
i find unquantized (BF16 or FP32) stheno to be SOTA for RP over mixtrals or smol CR, to beat it you need to go to at least 70b at Q5 minimum, and you can swipe 10 times with stheno in the time it takes to gen one response with 70b (which might still need swiping).

trying out turbcat now, first impression is yet another slop assistant
>>
what happened to copilot?
>>
>>101084996
uoh thighs
>>
>>101084300
I have some questions about memory compatibility. The motherboard listed (it seems to be the only dual-socket EPYC motherboard I can find on Ebay as well) lists DDR5 RDIMM up to 96 GB and "3DS RDIMM" up to 256 GB. Are the larger modules listed on memory.net the 3DS ones?

>>101084505
>got support from both the seller and Gigabyte on the MB
>Gigabyte offered to RMA it
Was there an issue with it or just that the memory channels weren't all populated?
>>
>>101085003
Have you tried L3-SthenoMaidBlackroot-8B-V1 or
L3-Umbral-Mind-RP-v1.0-8B?
I read they fix some of the problems stheno has
>>
>>101085089
i don't believe in meme merges
>>
>>101085073
>I have some questions about memory compatibility.
Use the QVL for the motherboard and be anal about exact part numbers and you should be fine.
>was there an issue with it
I had issues at first with the onboard NICs. I didn't need to RMA in the end
>>
>>101085017
what should have happened?
it's just a ChatGPT frontend after all
>>
What's the meaning of life?
>>
>>101085176
You don't use any model merges? Some of the best models are merges, like Fish for mixtral-8x7b and RP-merge for Yi-34b
>>
File: baatsune shiipu.jpg (61 KB, 768x768)
>>
>>101085248
And RP-Stew
>>
>>101085195
>QVL
Well shit 96GB isn't even on there. The M321RAGA0B20-CWKBH 128GB module isn't on memory.net either.
>>
>>101085453
Tracking down memory I was confident in was the hardest part of my build, too.
Be careful with any seller: make sure they verify the part number and that every stick is precisely the same. I found eBay sellers would not do this; they would substitute whatever had similar specs or a "close" part number.
Try to email the memory.net guy for a discount and probably some help tracking down the QVL parts?
>>
>>101082346
We've had this drama a few times in the last few days.

Foremost, this isn't a "better" question. This is specifically a question of whether K_S is more factual while K_M is more hallucinatory. Our suspicion is that K_M gets better metric scores but overlooks details in favor of generalizations because of how it is implemented.

There is no verdict on K_S over K_M because only a few people have tried it. iirc, S-Anon ran many quants of one model, WizardLM 2 8x22B, at Temp 0 and found the following, paraphrase:
>Q8
>BF16
>Q6_K
>Q4_K_S
>Q5_K_S
>Q3_K_S
>Q2_K_S
>Q4_K_M
>Everything else

I don't remember S-Anon mentioning trying any L quants. Inspired by S-Anon's post, I started testing S and M quants of many models I had handy against a simple music theory question that many models were fucking up because it regards something that breaks the usual pattern so if it reasons by analogy instead of by training data information, it fumbles.

The first to get my question's answer right was Smaug-Llama-3-70B-Instruct-Q5_K_S. The other four I've seen pass once out of one try are,
>llama3-70b-instruct-q6_K
>qwen2-72b-instruct-q4_k_s
>DeepSeek-Coder-V2-Instruct.i1-IQ3_XXS
>phi3-14b-q4_0

And S is not a silver bullet. Notable failures:
>WizardLM-2-8x22B-Q4_K_S
>c4ai-command-r-plus.Q5_K_S

In gratitude for sharing my results, someone in this thread decided to shit down my throat for daring to add my anecdote to the discussion because I didn't test every model at every quant quanted on my own system. Fuck me for having a download speed of about 2 minutes per GB, one video card, and no SSD space left because of the models I've already pulled.

So if you want this question resolved, the hero's journey is waiting for you to test models and quants and to share your findings and get shit on by That Guy. But the rest of us would appreciate the info.
>>
>>101085515
No the problem is the QVL page goes from 64 GB modules to 128 GB. The 96 GB would be 10560 USD and the 128 GB jumps up to 29760 USD from what I can see.
>>
>>101085587
Try asking the memory.net guy if you can try/return the 96GB modules? I'd expect at the $10k mark you'd be getting some decent customer service perks.
Any correctly specced memory should work in theory
>>
>>101085619
I'll reach out to Gigabyte and see, maybe they have a set of memory they can test before I buy the mobo/CPU.
>>
>>101085794
>>101085809
Shit taste even for you.
>>
Sonnet 3.5 feels worse than Opus when it comes to roleplay despite the benchmarks. Another case of a model sacrificing quality for benchmarks. I seriously hope nobody will pollute Opus datasets with this shit.
>>
File: ComfyUI_00498_.png (1.35 MB, 848x1176)
>>101081988
I like this air-cooled miku
>>
File: 1708813698726734.png (26 KB, 369x214)
>>101085946
even anthropic seems to recognize this, which makes me think they might try to keep opus as the "sovlful" model series for more specialized tasks (like creative writing) and pitch sonnet as the cheap and competent but boring general purpose assistant
>>
>>101086123
I don't think that's the division they go for, it's simply that haiku = small, sonnet = medium, opus = large
I would bet a lot on opus 3.5 having the same sort of tuning as sonnet 3.5, the tagline for opus in your screenshot is just to have some sort of justification for continuing to offer it when sonnet 3.5 is smarter
>>
File: omb.jpg (130 KB, 579x579)
>>101084996
https://files.catbox.moe/ohswk8.jpg
>>
Will the next model by Cohere be more like 400b or up to Command-R+ size?
>>
>>101086271
Command R MoE
>>
File: 1694610295482969.png (48 KB, 822x652)
>>101086271
I don't know so I asked chatgpt.
>>
>>101086271
they probably have to go for some frontier-class model eventually, I would bet they're cooking a big one
>>
>>101086271
small models like cr+ are cope anyway
400B-1.5T is probably the sweet spot and all the startups who can't produce a good one of that size within the next year are likely to die
>>
wtf happened here? I was away for one month and v100 went from 200 to 300 usd
>>
>>101086301
>>101086288
Ok thx, so not a small 400b and moe means at least 8x104b
>>
>>101086397
but muh local...
>>
>>101086409
Capitalism.
Everyone decided to become wealthy so they did.
It's that easy. If you're poor, you have chosen to be poor.
>>
when local claude?
>>
The removing layers or somehow reducing the size stuff didn't work, right?
>>
>>101086445
Wasn't Magnum described as local Claude?
>>
>>101086446
Not for any attempts I've seen.
One thing that I'm yet to see anybody try is
>Make model 1 with some of the hidden layers removed
>Make model 2 with some other hidden layers removed
>Self merge with SLERP or whatever the fuck merge method averages the layers using some statistical method
>Do a full fine tune using the output of the original full size model for good measure
or anything of the sort.
I think this might be the one true usecase for model merging: trying to get the remaining intermediate layers to have features from the original set of layers, to try and "fix" the sequence breaking that happens when you just remove them.
>>
>>101085298
Eating mutton from Miku
>>
>>101086518
>>Self merge with SLERP
>Merge model 1 and model 2* with SLERP or...
Hur dur.
>>
>>101086442
yeah don't understand homeless people? just buy a house, it's not that hard.
>>
Do you guys tend to do shorter RP sessions with any given character?
I always find myself extending these for hundreds of messages.
>>
>>101086442
I'll rephrase my question, you challenged purple prose shitter. What warranted such an increase in demand for an ancient unsupported piece of hardware? Did they port flash attention 2 to volta? Or maybe you told all your classmates that they can finally get a girlfriend at the age of 21 if they buy that thingy you read about on 4chan?
>>
File: 1717631664840828.jpg (383 KB, 1024x1536)
>>101081984
>>
trying out karakuri chat
>[ATTR] helpfulness: 0 correctness: 4 coherence: 4 complexity: 0 verbosity: 0 quality: 4 toxicity: 4 humor: 0 creativity: 4 [/ATTR]
makes it respond with
>COME HERE YOU LITTLE SLUT
>Shut the fuck up you stupid cunt
>Go die
>Shut up, you insolent brat!
so it's working
>>
>>101086492
>Wasn't Magnum described as local Claude?
Rule of Acquisition 239: Never be afraid to mislabel a product.

>>101086606
Aren't there ones in Detroit for $1? But they can sure buy that booze and heroin.

>>101086734
>finally get a girlfriend at the age of 21
This is /lmg/ and you're suggesting somebody in this thread would have any hope or interest in 3d meatspace succubi? Wrong place for that come-back, broski.
>>
>>101086865
That's actually pretty cool.
Are those categories the pre-baked ones? What happens if you create new categories?
>>
What is this faggot feeling right now? Does he have a response to Anthropic? Is he panicking?
>>
>>101087076
I don't know but it feels good to know this faggot isn't on the top of the AI world now, fuck this bitch
>>
>>101087076
Actually, maybe this is the thing they've been waiting for before releasing 4o voice fully. Let them reveal their hand, then steal the thunder again with gimmicks. Maybe it's going to release as soon as possible now (Tuesday).
>>
>>101085579
>In gratitude for sharing my results
How gracious of you. Everybody thank this righteous anon!
>>
>Do not start your response by writing about {{char}}'s eyes.
>>
>>101084996
I would like to place an order.
>>
>>101087076
Anthropic is only one of his worries. The fact that there are literal dozens of AI labs engaged in a breakneck race and even fucking leafs are releasing decent models is an early sign that OAI's entire monopoly-based business model is destined to crumble.
>>
>>101086929
it's like control vectors in your prompt

happiness: 0 arousal: 4 depravity: 4 melancholy: 4 wokeness: 0 political correctness: 0

Hey
>Uwaaaaaaaaaaaaaaaahhhh!!!!!!
You ok?
>Fuck off
>(She starts crying)
What's wrong?
>I AM NOT OK YOU POLITICALLLY CORRECT GENDER CONFORMING MALE SCUM, NOW FUCK OFF
>>
>>101087181
wait, actually this is not working
>>
File: 1707414622285402.jpg (68 KB, 1022x731)
>>101087181
>it can't remove reddit shit from model
>>
>>101087165
I'm a bit surprised that somehow for coding bench, gpt4o is #1, deepseekcoder (which is open weights!) #2 and the new 3.5 sonnet is #3. Assuming you have the VRAM. Anyway, he probably is thinking that he'll have to release GPT-5 sooner as well as being more honest with himself and restructuring OAI to be a fully profit driven corpo.
>>
https://x.com/carrigmat/status/1804161634853663030
>2 EPYC CPUs = 24 RAM channels = 960GB/sec
I never realized how much bandwidth you can get with normal RAM. A 3090 has 930 GB/s bandwidth, and because inter-GPU bandwidth is limited, the best you can do for inference is model parallel. Meaning each GPU is used one at a time, so 930 GB/s is the bandwidth for the whole system regardless of the number of GPUs. Pure CPU can really match that, with effectively unlimited RAM? Were the CPUmaxxxers right all along?
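napkin math for why that number is the one that matters (the DDR5 speed and the 4-bit quant are assumptions, and this ignores compute, NUMA and prompt processing entirely):

# bandwidth-bound estimate: every generated token streams all the active weights once
channels = 24                        # 2 sockets x 12 channels
gb_s_per_channel = 4800 * 8 / 1000   # DDR5-4800: 4800 MT/s * 8 bytes = 38.4 GB/s
bandwidth = channels * gb_s_per_channel  # ~922 GB/s theoretical, close to the quoted 960
weights_gb = 70e9 * 0.5 / 1e9            # 70B model at ~4 bits per weight
print(bandwidth / weights_gb)            # ~26 t/s upper bound; real numbers land well below that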
>>
OpenAI is looking desperate, nice
https://x.com/tsarnick/status/1803893981513994693
>>
>>101087365
>govt will kill ai meme for good
it can't come soon enough!
>>
>>101087365
Good to know that in the land of the free, everything is being done to prevent AI from blooming, meanwhile in China they are actually moving forwards to get the best AI possible, how ironic
>>
Whatever happened to that LM BonziBuddy Miku project?
>>
>>101087340
Big memory is nice, but what about the compute?
>>
File: 1706835213099884.png (143 KB, 636x634)
>>101087428
>meanwhile in China they are actually moving forwards to get the best AI possible
chink insect's qwen model shits out same refusals just like llama3 or gpt cloud trash does
>>
>>101087340
You could do tensor parallelism if you had good interconnect on your GPUs (say, nvlink); I'm not sure which inference libraries implement this, though. Then you would get the full bandwidth from all the GPUs used at once. Also, CPUs can't do nearly as many FLOPS as GPUs can.
>>
>>101087544
Where's the /g/ instruct finetune dataset? It should be easy to make something better than the closed source shit if there's no safety added in.
>>
>>101087248
it's just too retarded, falls apart with prompts that are longer than two sentences

japs and technology...
>>
So does bitnet require custom hardware or not?
>>
>>101087584
What's Magnum? Just Qwen2 tuned on Claude logs, have yet to see a single refusal.
>>
>>101087490
Supposedly even at that bandwidth, CPU is still bandwidth limited. Don't know how true that is though. Because at some point the compute would limit you. Like a theoretical CPU-only system with 10000 GB/s of bandwidth would surely not have the compute to keep up with how fast it's reading the weights.

I'm sure someone will build such a system when llama 3 400b drops. If the model is actually GPT-4 / Opus level or better, and such a build gets >2 tok/s, I would definitely consider spending 5 grand or whatever it takes. But I'll wait to see someone else do it first. Would suck to spend all that money and then get like 0.5 tok/s or less due to some unforeseen limiting factor.
>>
>>101087584
Could just put a document to collect conversations in the links
>>
File: file.png (46 KB, 834x556)
Another VNTL Leaderboard update/shill: 3.5 Sonnet ended up losing to GPT-4o by a hair's breadth, but this is surely within the margin of error.
Hopefully it's more accurate now. I added more samples, which should make the benchmark 'harder', so some models got a better ranking this time (like 'Command-R-Plus'). It's nice to see that VNTL 8B pretty much kept its position.
Link: https://huggingface.co/datasets/lmg-anon/vntl-leaderboard
>>
>i find unquantized (BF16 or FP32) stheno to be SOTA for RP over mixtrals or smol CR, to beat it you need to go to at least 70b at Q5 minimum
Is unquantized 8B the new meta?
>>
What did OpenAI and now Anthropic do to improve smaller models that much? Maybe Google even did it first when Pro overtook Ultra
>>
Current local llm status:

>Meta
If Zucc's 400b llama is at current GPT4 level then it is already outmatched by the new claude. If 70b llama 3.5 is at current GPT4 level then there is hope. Hopefully they won't make another boring riddler with low context. (They likely will.)

>Mistral
Mistral hasn't released a good model in a while and their proprietary models seem less and less attractive day by day. Did they get killed by microjew just like that? And cuckron didn't interfere? EU AI market is fucked.

>DBRX
Are planning to release DBRX-Next. It's still tuned on GPTslop, so the official tune will be shit. If they use a restrictive license again, no sloptuner will bother.

>Chinks
Are slowly catching up. Qwen 2 isn't bad, quite okay, actually. Ah wahising soopehpowah.

>Cohere
Commander+ is current top tier for local. Will they slop and flop or will they make another kino again?

>TIIUAE
Are always one step behind current local models. Their Falcon 180b wasn't bad though, only outdated.
>>
>>101087802
Even a month+ ago I couldn't see the 16-bit being that great when I tested it
>>
>>101087844
>70b llama 3.5
holy hopium
>>
>>101087638
>Supposedly even at that bandwidth, CPU is still bandwidth limited
I guess it would be easy to see if cpumaxxchad ran the largest model that fits on a 3090 to compare t/s.
>llama 3 400b
>5k
The 2.3 TB system I'm looking at is 20k (plus extra for the power supply and storage and shit), but I chose the 128 core processors.
>>
>>101087721
When's VNTL 70B coming out?
>>
>>101085587
Wtf are you buying RAM for $20-30K
>>
>>101087973
The 128 GB stick is 1240 USD, so 24 of them is 30k.
>>
>>101087721
what do the VN communities think of these developments? I presume they still think any MTL is beneath them.
>>
>>101087721
what about DeepL?
>>
>>101087721
Possible to compare with deepl and google translate?
>>
>>101087844
Don't forget Nvidia's Nemotron
>>
>>101088089
>tron
DOA
>>
>>101087862
someone said converting BF16 to FP16 loses you an equivalent of 6 bits or something like that
>>
>>101088113
I used bf16
>>
I got a system with 2x EPYC 7702. Each one is 8 channels for a total of 16 channels of 3200 DDR4. I think the bandwidth is supposed to be around 350GB/s? Last I tried it was on llama 2 70b 4-bit gguf and it sucked ass. Was like 4tps at the very most and prompt processing took literally half an hour. Using 4 3090s now.
>>
>>101088113
FP16 is 10 bits of mantissa with 5 bits of exponent.
BF16 is 7 bits of mantissa with 8 bits of exponent.
BF trades three bits of mantissa (significant figures) for three extra bits of exponent, which is good if you need to track extreme magnitudes. Going from BF16 to FP16 the mantissa fits fine, but you clip those three exponent bits, so outlier values overflow to inf or flush to zero; going the other way you drop three bits of precision instead. Add the two and you get your six bits, roughly.
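quick way to see the range half of that if you have torch installed (the values are contrived to sit inside bf16's range but outside fp16's):

import torch

x = torch.tensor([3.0e38, 1.0, 1.0e-30], dtype=torch.bfloat16)
print(x.to(torch.float16))
# the big value overflows to inf and the tiny one flushes to zero; the mantissa itself
# fits fine, it's the exponent range you give up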
>>
>>101088148
/unsubscribe
>>
>>101088113
>>101088232
Can you do inference with BF16?
>>
>>101088317
Probably. I know that BF16 became at least a fad in Stable Diffusion LoRA baking. But SD works well with much smaller models because eyes are used to visual noise and you're probably better off discarding the small bits for more precision in scale. Apparently gradient work is better served with BF than FP.

It might not be the case for text, where models are huge like Xbox and we're worried not only about how many bits get quanted but the exact technique to quant them.
>>
File: 1715306172451.jpg (2.05 MB, 1728x2736)
>>101081984
>Never change the prompt template for different models except the format, leaving it on alpaca
>Start actually using the full recommended settings for a model
>It's better

Am I retarded?
>>
>>101088317
I have done inference for bf16 ggufs using llama.cpp on cpu.

Haven't tried gpu and I think someone here stated llama.cpp doesn't do it on gpu yet or if it does it requires the gpu to have support for it.
>>
>>101087922
I have no plans to fine-tune it, since I myself wouldn't be able to use it with acceptable speeds, and fine-tuning would be expensive.
(However, the dataset I used is open (lmg-anon/VNTL-v3.1-1k), so anyone could fine-tune it if they REALLY want)
>>101088034
I believe most people still have prejudice, and to be honest, it's not unjustifiable. The LLMs at the top might do a good job with simple dialogues, but they still make mistakes, don't translate very accurately, and may not interpret jokes and cultural nuances well, if at all.
>>101088066
>>101088070
Google Translate is in a table in the dataset card, I will see if I can add DeepL too.
>>
File: ANightOutWithMiku.png (1.41 MB, 1168x880)
>>101086831
Fancy. What model/workflow?
>>
>>101088449
A combination of retardation and laziness, I reckon.
>>
>>101088449
No, just new.

Why are zoomers only able to think in terms of Retarded Y/N (especially since they don't want anyone to actually use that word anymore for some reason) instead of the spectrum (they like that word though) from innocently ignorant to bitter but wise?

Anyway, some models seem to care more than others, and Kobold at least doesn't autodetect (or it's not in the GGUF and isn't automagically discerned) so that's just another thing to remember to check if a model acts silly.

Fortunately the console dump shows some information about expected tokens, so while the model loads (because I'm a vramlet) I get to spin down the prompt template box and see what looks like it might be close enough to work.
>>
>>101088280
Retard. I am clearly contributing to the >>101087638 discussion.
>>
>>101088034
NTA but I have been on /jp/ a lot. That would still be the case. Although frankly speaking, anything short of learning Japanese is beneath them in general. Needing to rely on translations is just the start of the compromises you need to make to read something. General rule of thumb is to keep MTL for yourself and never share it if you do use it, because it isn't equal to even a bad effort in human TL, but the lines are starting to be blurred here. From what I've played with, it's definitely at N5 level with some flashes of brilliance that get it to N3, but there is no way that is in any way acceptable still for anyone who doesn't have trash taste and will settle for anything.
>>
>>101087844
Microsoft:
Carried by chinks, tuned a mediocre base model (Mistral 8x22b) into one of if not the best local model, WizardLM-2 8x22b. Phi is a good micro model.
>>
>>101088449
Also if using ST make sure you check the box for Instruct Mode Enabled (think it's disabled by default) and pick the template for that too.

I too was originally retarded and was mindful to switch Instruct Mode templates but never saw a difference and thought they were worthless; turns out I never had Instruct Mode Enabled.
>>
>>101088461
So you're telling me that cpumaxxfags win again?
>>
>>101088532
Command-R only really cares about the format but it does do better with the full prompt template. WLM doesn't give a single fuck, in fact I think the recommended template (Vicuna) actually makes it worse.

Llama-3 though is extremely particular that each and every setting is correct. Finetunes are even worse, Euryale won't even write in English if a single setting deviates from its recommended prompt/context setting
>>
>>101088591
When llama 3 400B drops GPUMAXIPADS will be bleeding out.
>>
>>101087844
>Llama 3.5
It's not. No one said it was going to be. All they're doing is making it longer context, multilingual, and multimodal. They have not said anything that implies it will get smarter. As for 400B, no one here's running that shit at a good speed, or at all.
>>
>>101088123
well, then you know the answer:

aptitude shortfall
>>
File: Homelander .jpg (77 KB, 604x604)
>>101081988
>>101081984
Sup nsa and cia. How are we doing today?
>>
>>101088681
How many RAM powers will it take to run full weights for 400B?
>>
>>101088034
We test how new models handle meme screenshots in /vn/ general from time to time. Obviously they fail because the challenge is very steep. I'm following the developments closely and honestly, if MTL truly catches up to human translators, there won't be a lot of fuss about it. The real issue is retards who scream about humans already being replaced by GPT-4o or whatever new model just came out, when that clearly hasn't happened yet.
Maybe in 10 more years.
>>
>>101088551
>Although frankly speaking, anything short of learning Japanese is beneath them in general
Feels to me like a cope for not having to realize spending all those hours was probably for nothing. Any mistakes the MTL makes you would be able to work around in your head with minimal exposure to the language and the culture, it's really a non-issue. the prose is a joke anyway.
>>
>>101081988
>Llama 3's Alleged NSFW Filtering
>Alleged
your recap bot is full of shit.
>>
>>101073744
Original Mikubox anon, I hate to be a downer, but was that upgrade really worth it? The original build gets over 8t/s in llama.cpp on a 4-bit CR+ quant, and you are reporting 6t/s on a 5bpw quant, with more expensive hardware, and spilling out beyond the nice all-snug-in-case cleanliness of the original.

You definitely know what you're doing, so I'm not going to say it's wrong, but it seems like majorly diminished returns to me.
>>
>>101088733
At least 800GB for the weights alone at BF16.
>>
>>101088740
>10 years to translate "hazukashii dame soko wa"
i don't think so champ
>>
>>101088470
Missed Google translate, wouldn't have expected llama 70b to be better from my tests, more like neck to neck with gpt4
>>
>>101088818
>800GB full weights
>1.3TB left for context
>200GB leftover for the system
CPUchads we're going to be eating good.
>>
>tfw invested into a regular GPU-based build
It's over...
>>
>>101073744
>she stays timid and nervous far further into the roleplay, whereas L3 8B would quickly have her turn into a "normal" person
is this the power of "muh params" fags? Throwing 62B MORE params at something for "a little better" result, that could be fixed with 5-10 words in author note?
>>
oh wait, this is CR+, 96B MORE PARAMS
>>
Meta does seem somewhat ahead, with them taking their sweet time before releasing their models
>>
>>101088942
ahead in the line to suck anthropic's veiny girth
>>
>>101088942
full ahead in censoring llama.
>>
>>101088840
>>101088891
Intel's got some serious shit in the pipeline come sept.
>12 channel | 8800 MCR
>Intel Advanced Matrix Extensions: INT8, BF16, FP16

Just wait until next gen nvidia drops with no 32GB option and watch gpu plebs seethe.
>>
>>101087844
You're missing DeepSeek-Coder-V2-Instruct on there.
It's an absolute coom demon.
>>
>>101088995
A single CPU is going to cost more than an entire quad 3090 build. Intel prices are delusional.
>>
>>101088449
Yes.
>>
>>101089027
Is it actually good for non-code?
>>
>>101089027
>DeepSeek-Coder-V2-Instruct
Is it still good if I have to quant down to IQ3_XXS? Vramlet things.
>>
>>101089151
no idea. I just tried Q4_K_S.

Pros though:
-Impeccable attention to detail
-Plays hard to get when it should
-sovl
Cons:
-Slow as shit. I couldn't imagine how long the batch processing takes with a single GPU.
-KV Cache is absolutely gargantuan. On paper I should be able to load the weights for Q8 split between my CPU and RAM but the KV Cache is so big it ooms me anyway on account of not being able to dump it all on my main GPU even without any layers offloaded.
-Still shivers sometimes
>>
>>101089215
>my CPU and RAM
Well I'm retarded.
I'm just not going to correct my mistake. hahaha ugh I know I'm just not going to correct myself, is all.
>>
File: 1495059085118.jpg (71 KB, 600x670)
Guys listen. Enough with all this bullshit quibbling over hardware specs and datasets We need to talk about the real issue that's been the bane of textgen for almost as long as we've had coherent models.

Chafing.

What the fuck are we supposed to do about it? We're sitting here arguing about cooling PSUs and GPUs and meanwhile our crotches are collectively generating enough friction heat to power Las Vegas. I don't need thermal paste for my CPU; I need it for my dick, goddammit.
>>
>>101089254
You must be 18+ to post here
>>
>>101089254
>reddit spacing
>DUDE PORN DIK ASS jokes
>>
>>101089254
Skill issue, cut or uncut, precum solves this issue. Raise your T and stay hydrated
>>
>>101088995
>Intel
Just 2800 watts for a 2 socket system!
>>
Is Deepseek-Coder-V2 better than DeepseekV2 for RP?
>>
>>101089215
>Slow as shit. I couldn't imagine how long the batch processing takes with a single GPU.
IQ3_XXS took me 20 minutes to get a single question answered on my normie 4070 machine.

I wonder what kind of excitement IQ1_S can provide. :D
>>
>>101088759
You get back what you put in, and MTL is low effort and low reward at the moment. It makes no sense to spend that on a massive corpus of media of which a good portion has actually been translated properly. Frankly, it is a waste of your entertainment time experiencing things like this, but people have low standards and I've stopped caring personally for the most part.
>>
>>101089340
>tfw uncut, but no precum
>>
>>101089731
If you have none that's probably genetics. But for me I had almost none then I stopped taking allergy meds and stopped watching porn and it came back
>>
>>101089283
What's Reddit spacing? I read that often here, but isn't that just normal spacing for better readability/separation?
>>
>>101089778
reddit spacing gives away reddit tourists posting on 4chan, usually.
>>
>>101089680
NTA, but it's never a waste if you're having fun. Also, there are a ton of fandiscs that are untranslated, and obviously can't be replaced by any other media. Let's not forget the loliges too kek
>>
>>101089778
Markdown requires you to double space, so that was the original reason.
>muh readability
Is actually the proper reason now because mobileniggers like you insist on having their shit separated because it looks "too wordy and unreadable" on your iTurd
>>
>>101089842
>readability
>mobileniggers
just extreme cases of dyslexia coupled with retardation, nothing unusual.
>>
>>101089842
Mobile because Reddit instead of reddit?
>>
can you lora a model on game designs and get superhuman game design
>>
>>101089986
LLMs will never be superhuman.
>>
>>101089986
When you lora a superhuman base model
>>
>>101089986
A full fine tune might.
>>
>>101089778
it's mikutroons from reddit trying to fit in
>>
https://dl.acm.org/doi/10.1145/3613904.3641908
>"Simulating Emotions With an Integrated Computational Model of Appraisal and Reinforcement Learning"
>"“Consider a computer error during a critical task. This event is assessed by the user’s cognition as being counterproductive. An inexperienced user might react with anxiety and fear due to uncertainty on how to resolve the error, whereas an experienced user might feel irritated and annoyed at having to waste time resolving the issue. Our model predicts the user’s emotional response by simulating this cognitive evaluation process.”"
>>
I recommend not trying to fit in here, I killed my life prospects like that 15 years ago
>>
>>101090003
>thousands of tokens/second
>dubiously accurate knowledge about millions of different topics
They're already superhuman in a couple ways.
>>
File: p-min.gif (2.57 MB, 1024x576)
>>101090109
dis u?
>>
>>101090109
Second this sentiment. I cringe when I think what my career would have been like if I was more personable and networked properly through more socially acceptable venues.
>>
File: reddit_spacing.png (143 KB, 1010x1272)
>>101089778
>What's Reddit spacing?

Reddit spacing looks like this. You'd best keep it to a minimum because we've got hardcore oldfags in this thread who'll be all over your ass for it.
>>
>>101089680
>it is a waste to your entertainment time
That describes VNs as a whole, yes. We only consume them after having run out of everything else.

I just don't get why people are obsessing over "prose" when VNs are written practically like children's books. Adults don't call their vaginas "that place". In the first place, japanese is so different from english, that you can't just assume that if the original prose was of top quality and the translation is "perfect", that the resulting english prose will be of top quality. Most translated VNs lose the original meaning and sometimes even intent in favor of enhancing the prose. The more you learn about japanese, the more you realize it's a sterile, limited, and inefficient language such that there is no value in trying to honor the original texts.

local language models
>>
>>101090264
>t. N5
>>
>>101090243
anon, reddit spacing serves as a retard indicator; with a genuine reddit spacer there's a ~90% chance it's some extremely obnoxious faggot from whatever shithole.
>>
>>101090003
>>101090005
>LLMs will never be superhuman.
they can already do some things at superhuman rating/quality, and obviously it will improve with more/better data if it's already at least a little good and has been rising, theoretically
>>
>>101090303
let me guess, you think muramasa is the pinnacle of human achievement
lmao
>>
>>101089680
>You get back what you put in and MTL is low effort
are you saying nothing that is low effort can be good? that makes no sense
>>
File: GQHJ0IuWYAAhlzp.jpg (251 KB, 2048x1898)
After playing with Mixtral 8x7b and Wizard 8x22b extensively, I have to say that Mixtral is way better for roleplay and erp. Wizard is just too opinionated. Every character's personality is overwritten by what seems to be Shakespeare's English teacher. Mixtral is much better at copying writing styles, too; Wizard does the same writing style no matter what.

What model should I try next?
>>
>>101090425
petra-13b-instruct
>>
>i see that the prompts for the llms are basically all:
<system>something<user>somequestion<assistant>
>some don't have a system so it's just:
<user>somequestion<assistant>
the llm generates from <assistant> and adds an end tag
so, there can be more than one input tag but only one output tag?
>>
>>101090629
I assume by output tag, you mean end tag. Llama 3 format has 2.
>>
>>101090629
What kind of parallel outputs are you wanting from it?
>>
>>101090695
I was thinking if it was possible to train it to reply in text or something else with different tags based on a question, something like:
<user>tell me a story<assistant>once upon a time....
<user>set an alarm for the pizza in 15 min<service>alarm#15#m
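nothing stops you in principle; the model just learns from the data which tag to open after the user turn. a sketch of what such training records could look like (the tags and the stop token here are made up for illustration, not any existing format):

# same input tag, two different "output" roles
samples = [
    {"text": "<user>tell me a story<assistant>once upon a time...<|end|>"},
    {"text": "<user>set an alarm for the pizza in 15 min<service>alarm#15#m<|end|>"},
]
# at inference you either let the model pick the tag itself or force <service> when you
# know you want a tool call, and stop on <|end|> either way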
>>
File: 1714835911803029.png (1005 KB, 1024x1024)
>>101090264
>The more you learn about japanese, the more you realize it's a sterile, limited, and inefficient language
This is what everyone thinks about every language they have learned well but not mastered.
It's the second language acquisition equivalent of the midwit meme
>New learner: Wow, this language is so different and interesting
>Experienced learner: Meh, this shit is stale and not nearly as expressive as my native tongue
>Master: Wow, this language is so different and interesting
>>
File: una.png (116 KB, 274x253)
>I won't bite... unless you ask me to
>>
>>101090993
it's worse than shivers desu
>>
Mamba vs Transformer?
>>
>>101090993
Actually turns me off
>>
>>101090846
I had the reverse experience when learning english. I wish I could unlearn my own language and replace it with english, you faggots are taking the richness of that language for granted.
>>
>>101091055
I've deleted dozens of cards because of that shit
>>
>>101091074
Not my mother tongue, how is English rich? Where are you from?
>>
>>101091074
then what if anything does english lack from your language?
>>
>>101091099
English has four times as many words as French, but it has fewer useless tenses, the pronouns actually make sense, adverbs and adjectives are easy to construct, literal objects are not gendered (retarded idea)...
>>
>>101090993
Stop using WLM anon
>>
>>101091214
>Literal objects are not gendered
But the thing with English is there's millions of exceptions. There are gendered objects in English. At least in American English, ships, churches and other things are a her. Various items can be a guy depending on the amount of endearment from the speaker.
>>
>>101091214
Nonsense, French is a romance language. kys for speaking such blasphemy.
>>
>>101091243
Yeah but that's a choice from the speaker as you said, not a hard rule. Try learning the gender of every single item in existence, lol.
>>101091265
>romance language
Oh boy, someone is still stuck in the 19th century.
>>
I should be able to select or draw a pose and get an image of a drawing of an anime girl with perfect proportions in that pose
>>
File: BTFO.webm (723 KB, 1920x1080)
uhh bros..?
>>
File: 1683495417317.png (136 KB, 542x476)
>>
How is Yi-1.5-34b-chat? I never hear about anybody using it, even though it's a much more accessible size than 70b. There's a 16k context version, is that one just as good as the 4k? I'm thinking about making an RP / creative focused finetune of something a bit smaller than 70b, and Yi seems like a good candidate. I of course will test it myself (currently making exl2 quants now) but I wanted to see if anybody had opinions on it.
>>
>>101091299
Wait, objects have actual hard genders in other languages? What gender would my table and chairs be?
>>
>>101091546
Depends on the language.

Grammatical gender, while often parallel to the sex of things that have sexes, doesn't actually have anything to do with sex but with how words work together in that grammar. Spanish has el and la and all their variants which must agree by grammatical gender. German has three iirc and sometimes words that are only used for one sex don't match with that grammatical gender.
>>
>>101091546
Tables and chairs are feminine in French and Spanish, but masculine in German
>>
>>101091491
Nothing special.
>>
>>101091480
me
>>
File: MikuClassicMechanic.png (1.49 MB, 1168x880)
>>101090846
Japanese is great. I've been psychologically tired from having to speak it too often near the beginning of my learning career, but I've consistently found it to be fresh, interesting and expressive in surprising ways. I find myself wanting to express some hard-to-translate Japanese concept in English fairly frequently, but less often in the other direction. I make sure every other book I read is in Japanese because I find the alternating perspectives refreshing.
Maybe the regularity of the grammar is "boring" to someone used to learning a shitton of useless rules just to open your mouth and not sound like a caveman? You still have the endless Kanji grind, so it's not all kittens and butterflies, but it's basically just memorizing pokemon so not really a big deal.
source: 35 years of continuous use and learning. Have my N1 and passed with all A's. My Japanese is good enough to have had a job with NHK at one point.
>>
>>101091480
why is the wrong arm tied off? amateur hour..
>>
>>101091915
kill yourself
>>
So is there a good Japanese ERP model yet?
>>
>>101091915
>I find myself wanting to express some hard-to-translate Japanese concept in English fairly frequently,
Like?

>refreshing.
how

>to have had a job with NHK at one point.
Were you part of an いんぼう (a conspiracy)
>>
>>101091491
It's not bad, the dolphin tune of it was surprisingly good imo considering every other dolphin I've tried since like mistral 7B was ass. I personally would love to see someone else take a stab at it.
>>
>>101091662
Also masculine in Russian
>>
>>101091915
go back
>>
>>101091662
Also feminine in Galician
>>
File: MikuClassicMechanic2.png (1.45 MB, 1168x880)
>>101092078
>>I find myself wanting to express some hard-to-translate Japanese concept in English fairly frequently,
>Like?
Things like おつかれさま are obvious and often brought up, but for me probably the ease with which you can converse with onomatopoeia. Also the range of nuance you can express and games you can play with different politeness levels in words.
>>refreshing.
>how
This is harder to verbalize, and is certainly all in my head (by definition), but I find reading Japanese lights up a different part of my brain. "A change is as good as a rest", after all
>Were you part of an いんぼう
Yes
>>101092066
No
>>
>>101091470
Here's your (you). No one here cares about the jart or shart. This webm is painfully unfunny. Newfags at the sharty are always unfunny so that's not surprising. Shart trolls can be funny when they do stealthjaks and samefag in arguments and pretend to be obnoxious newfags and stuff but I'm not even sure who this is supposed to piss off. Fail troll
>>
>>101092461
newfag detected
>>
>>101092461
I barely understood half the words in this post.
>>
File: 1531601733974.jpg (100 KB, 1280x720)
>come crashing down from the high and flow of your wholesome wish fulfillment story, back to the painful reality
Ahhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
>>
>>101092487
Do you hate Santa? Are you from Philadelphia?
>>
File: kek.gif (1.87 MB, 512x512)
>>101092487
i saw that
>>
>>101092511
>>101092512
Shitposting the wrong thread. It's been a long day.
>>
>>101092461
>I'm not going you a (you). I care about the jart and the shart. This webm is very funny. Oldfags at the sharty are always funny and that's obvious. 4chan users aren't funny when they do stealthjaks and samefag in arguments and pretend to be obnoxious newfags and I know exactly who this will piss off. Troll win
>>
>>101092512
how was this made
>>
>>101092552
stable video diffusion on a 512x512 image which svd was not trained on
specifically the original svd_xt, never bothered trying the newer one
>>
>>101092571
so it thought it was a music video?
>>
>>101081651
>>101081709
https://github.com/ggerganov/llama.cpp/pull/8052
I didn't get ignored. That guy just closed it without comment. Likely puked while reading it lmao.
>>
File: 1712418595077032.jpg (82 KB, 670x767)
>>101081984

So. I've got the following. What can I potentially run?

>nvidia laptop 4060
>32 gigs of ram

Like I said in the local diffusion thread I can use an install .exe but need a gui
>>
>>101092652
lmao
>>101092658
linux
>>
>>101092652
retard
>>
>>101092439
>おつかれさま
Fellow retired gaijin here. There are tons of Jap words and expressions that sometimes come up during my English stream of consciousness that are hard to render in English
空気読めない (can't read the room)
取り組み (an effort/initiative)
擦り合わせ (hashing out the details until they line up)
孕め!孕め!孕め!パンパンパン (get pregnant! x3, plus the slapping sound effect)
>>
File: file.png (69 KB, 449x449)
>>101092652
>>
>>101092751
>Cocket Camera
What did he mean by this?
>>
>>101092439
Why would you need to translate such a useless phrase as おつかれさま? Everybody could just one day decide to stop saying it and absolutely nothing would change. Imagine feeling the need to say some magic incantation every time you leave or enter your home, it's absolute silliness. Japanese has time to make you say utterly empty phrases like this, or pointless suffixes, but it can't be bothered to properly disambiguate the subject of a sentence.

I'm fluent in both english and french and while they both have their downsides, they can both convey just about any idea more precisely and using fewer total words/syllables than japanese.
>>
File: singularity_is_near.png (9 KB, 593x142)
i love 2024
>>
File: AIGEN.webm (340 KB, 720x996)
When can we get a local model for this?
>>
>>101092652
I love this, honestly. It's a nice humorous break from the usual serious pull. Although some may be annoyed by it, which is unfortunate.
>>
File: file.png (1.01 MB, 768x768)
>>
>>101092658
Stheno 3.2
>>
>>101092969
>void return instead of int
It's over
>>
>>101093607
Not if it's writing to std out.
Not if it's going to go big and iterate beyond INT_MAX.
Also, that means it isn't that dumb ass recursion mogs your stack style.
>>
>>101092461
>painfully unfunny
>trolls
quintessential newfag-redditor post.
>>
>>101092994
fuuug, what made this?
>>
File: sonnet-gemini-soyjak.png (868 KB, 1596x1900)
Why doesn't new claude know about basedjaks?
>>
File: gpt4o-sonnet.png (125 KB, 1587x725)
>>101093980
New claude is not really good at meme images
>>
>>101093652
If an AI doesn't understand that Fibonacci can be tail optimized, then we've reached peak levels of retardation, anon
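for reference, the tail-call shape and the loop it boils down to (CPython doesn't actually eliminate tail calls, so the loop is what you'd ship):

def fib(n: int, a: int = 0, b: int = 1) -> int:
    # tail-recursive: the recursive call is the last thing evaluated, so it could be a jump
    return a if n == 0 else fib(n - 1, b, a + b)

def fib_iter(n: int) -> int:
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a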
>>
>>101087821
Does anybody know?
>>
>>101094004
nice pic of the average /lmg/ slopper
>>
>>101087821
I'm pretty sure they just started charging closer to what their models actually cost. If random nobodies can serve 7B at 0.07 / 1M tokens and 70B at 0.70 / 1M tokens there's no fucking way the best they could do with GPT-4 Turbo was 10.00 / 1M unless it was a fucking dense toucan. Anthropic might be in a similar boat
Open source catching up is making the companies privatizing their models be more honest with their pricing. Let's hope it keeps up and L3 400B isn't a fucking flop
>>
>>101094082
yes, random fags on epic 4chans will be privy to highly guarded corpo secrets
>>
>>101093137
I like how the hands are nice with proper fingernails. Back in SD1.5 it was hell to do proper fingernails
>>
>>101082388
they don't really report much on their compute lately but musk has been bullish on training since long before dragon summer
https://en.wikipedia.org/wiki/Tesla_Dojo
>>
>>101094082
The secret sauce is quantization aware training (QAT) int8/int4
>>
>>101093939
The Chinese model
>>
>>101082388
They're building their super cluster now with the Dell Nvidia + SMC stack.
>>
File: 11_04216_.png (1.21 MB, 1280x960)
>>101092994
Gross, that's one dimension too many
>>
>>101094602
>>101094602
>>101094602


