/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101749053 & >>101739747

►News
>(08/05) vLLM GGUF loading support merged: https://github.com/vllm-project/vllm/pull/5191
>(07/31) Gemma 2 2B, ShieldGemma, and Gemma Scope: https://developers.googleblog.com/en/smaller-safer-more-transparent-advancing-responsible-ai-with-gemma
>(07/27) Llama 3.1 rope scaling merged: https://github.com/ggerganov/llama.cpp/pull/8676
>(07/26) Cyberagent releases Japanese fine-tune model: https://hf.co/cyberagent/Llama-3.1-70B-Japanese-Instruct-2407
>(07/25) BAAI & TeleAI release 1T parameter model: https://hf.co/CofeAI/Tele-FLM-1T

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
►Recent Highlights from the Previous Thread: >>101749053

--vLLM GGUF loading issues with Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf: >>101752460 >>101752493 >>101752587 >>101752887 >>101752942 >>101754105 >>101754463
--Using LLMs for code obfuscation and randomness: >>101750235 >>101750681 >>101751579 >>101751645 >>101751742 >>101752513 >>101753593 >>101753898 >>101754413 >>101754550 >>101754571
--Training a multimodal LLM for reverse stable diffusion captioning: >>101751771 >>101751869 >>101751899 >>101751940 >>101752015 >>101751930
--InternLM 2.5 20B model impresses with benchmark results: >>101750268 >>101750302 >>101751043 >>101751291 >>101753224 >>101753296 >>101750400
--Florence and Kosmos for multimodal image description: >>101750080 >>101750118 >>101750228 >>101750173 >>101751908 >>101751942 >>101751950
--Anon discusses prompt execution time and computer specs: >>101754319 >>101754426 >>101754477 >>101754507 >>101754698 >>101754725 >>101755844 >>101755911 >>101756005 >>101756120 >>101755903
--Using video games to test AGI morality is flawed due to inconsequential decisions: >>101754707 >>101754720 >>101755315 >>101755394 >>101755515
--Updated Mistral preset for Large and other models: >>101751804 >>101753610 >>101753679
--Hardware suggestions for power-efficient Gemma 2 27b inference server: >>101752368 >>101752925 >>101752951 >>101753060
--DeepSeek implements caching on API, others to follow: >>101757114 >>101757169 >>101757232 >>101757482
--fp8 vs fp16 and schnell quality issues: >>101749508 >>101749522 >>101749599 >>101749963
--LORA hotswap endpoint merged, mixture of LORAs idea status unknown: >>101753955 >>101753979
--Anon wants speculative decoding for server mode for coding tasks: >>101752977 >>101753322 >>101753548 >>101753584 >>101753613
--Miku (free space): >>101749957 >>101751995 >>101752374 >>101752420 >>101752435 >>101752960 >>101752990 >>101753050 >>101753073 >>101755688 >>101756008

►Recent Highlight Posts from the Previous Thread: >>101749058
Dead general.
>>101757635
We need to revitalize it with more mikus!
how's the new magnum 32b?
>>101757635
I miss the log posters and the Miku Movers who made this thread really comfy, now it's just full of /pol/ schizos.
>>101757635
Dead internet.
https://x.com/HalimAlrasihi/status/1820918388002009363
https://blackforestlabs.ai/wp-content/uploads/2024/08/tv_no_screen_2.mp4
>>101757729
>>101757743
Do you have vanilla? I'm not into your fetishes.
>>101757729
>not even black just mannequin colored
If you're going to live out your cuck fetishes at least do a good job.
>>101757729
Holy mental illness. Get help
>>101757772
You obviously have a racemixing fetish, you told us so even.
>>101757742
Looks like if you replace the 2 with a 3, you get a higher bitrate video.
>>101757742
Do I even want to know what the VRAM requirements for this are going to be?
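For ballpark purposes: weight memory is just parameter count times bytes per parameter, plus some overhead for activations and the text encoders. A rough sketch, assuming the widely reported ~12B parameter figure for flux (treat the numbers as estimates, not specs):

```python
def vram_estimate_gib(n_params: float, bytes_per_param: float, overhead_gib: float = 2.0) -> float:
    """Back-of-envelope VRAM: weights plus a flat allowance for activations/encoders."""
    return n_params * bytes_per_param / 2**30 + overhead_gib

# ~12B params at different precisions (parameter count is an assumption here)
for label, bpp in [("fp16", 2), ("fp8", 1), ("int4", 0.5)]:
    print(f"{label}: ~{vram_estimate_gib(12e9, bpp):.1f} GiB")
```

So fp16 lands somewhere around a 24 GB card before any quantization, which is why the precision the weights ship in matters so much.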
>>101757791
Miku mindbroke him LOL
>>101757815
It's not really cope when you told us.
>>101757804
1 and 3 are hevc, they don't play in my browser.
>>101757742
huh, they actually have info on their site
>>101757840
Projection, because you clearly are, to the point where you have gigs of this stuff saved on your computer.
can someone host local models for us to test instead of gatekeeping?
>>101757772
based
>>101757849
https://blackforestlabs.ai/up-next/
Not that much info.
>>101757890
kek, didn't read. Have fun seething.
>>101757875
Use colab/kaggle, poorfag
>>101757908
>Obsessed for months
Nah, you're a mentally ill schizo. Glad I'm not having a meltdown over an anime waifu
>>101757901
Speaking of German AI, has AlephAlpha done anything noteworthy in the past few years? Last I heard from them was their shitty proprietary 70b that they advertised as the definitive BLOOM/OPT-175B killer.
>>101757908
>gets filtered
Brainrot is so easy to spot these days, it was probably all cope anyway.
>>101757948
you shut the fuck up retard, how can you expect people to learn local models if you just say "LOL POORFAG"
>>101757901
i just meant their announcement post
>>101757944
I filter brainrot words as well, zoomers are so fucked up that normal people look mentally ill to them.
I'm quitting
>>101757948
There is no use in seething about being poor. I'm poor myself and have to run the models on CPU and RAM.
Just get more RAM.
>>101757742
How can a company appear out of nowhere and absolutely destroy SD? I think the chinks are behind this.
>>101757948
I've given you free options, zoomer, be grateful and fuck off. You're not worth anyone's time
>>101757908
You need new material
>>101757601
so how well does cpu offloading work in vllm?
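For what it's worth, recent vLLM builds expose a `--cpu-offload-gb` flag that spills part of the weights to system RAM; whether your install has it depends on version, so check `vllm serve --help` first. A command sketch, with the model name and sizes as placeholder assumptions:

```shell
# Hedged sketch: --cpu-offload-gb N treats N GB of system RAM as extra weight
# storage (slower, since offloaded layers cross the PCIe bus every forward pass).
# Verify the flag exists in your build before relying on it.
vllm serve meta-llama/Meta-Llama-3.1-8B-Instruct \
  --cpu-offload-gb 8 \
  --gpu-memory-utilization 0.90
```

Expect throughput to drop roughly in proportion to how much you offload, it's a capacity escape hatch, not a speed feature.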
>>101757948
why do you think anyone here cares if you learn anything?
QUIT BEING POOR IS WHAT YOU SAY, BUT I LIVE IN A FUCKING COUNTRY WHERE I CANT JUST GET SHIT FOR CHEAP. NO WONDER BARELY ANYONE GETS INTO LOCALS, BECAUSE YOU SHOULD FUCKING HELP
>>101758083
I'm saying that I'm poor too.
I don't even have a video card...
>>101758083
Dude, first of all, relax, retard. Secondly, what you can run is going to be heavily tied to what hardware you can buy/afford. So start saving. If you can't wait, buy a runpod, if you can still manage to put money aside for food, housing, fuel, and a little fund to get better computing parts. Say goodbye to other time and money sink hobbies as well while you are at it.
>>101758173
>If you can't wait buy a runpod
Doesn't that require ID + photo KYC bullshit?
>>101757672
>>101747334
Need more info/updates on this
>>101758190
No idea, if it does then fuck that and your new hobby is now saving money, kek.
>Try 405B
>It's really good
It's not fucking fair, bwos...
Is this actually better than Stheno 3.2? https://huggingface.co/Sao10K/L3-8B-Niitama-v1
>>101758409
I use 405B and Mistral Large on openrouter and I find myself preferring to use Mistral most of the time
>>101758442
I didn't test it much but it seemed like a wash, aside from it being a little less horny by default.
alright bros what cool new models we got
>>101758512
I'm not your bro, pal.
>>101758548
I'm not your pal, buddy.
>>101758512
>>101758548
>>101758559
I am your bro, pal and buddy, fellas.
>>101758566
then why didn't you answer my question :(
>>101758575
My silence is saving your soul, please understand
>>101758512
it's been months and there still hasn't been a release any better than commandr 35b desu
>>101758581
But anon, I need my celeb sex toy ai
>>101758583
What is command r good at?
I just downloaded it a while ago.
>>101758594
Starling 7B Beta.
>>101758385
Probably nothing, the guy's posts don't seem that intelligent, so it's likely someone just padding his paper count with bullshit. I'm starting to believe that guy that said LLMs will be smarter than 'AI researchers' soon.
the fuck is a brainstorm model?
>>101758629
But anon, that's old
>>101758583
Gemma 2 27B is pretty damn close. Hell of a lot smarter than CR while still having a nice writing style. Too many spine shivers and voices barely above a whisper though.
>>101758472
>Use mistral-large on openrouter
>Begins repeating itself within the first message
I dunno, man. It's very technically proficient, but I just can't get it to stop repeating.
>>101758512
>>101758548
>>101758690
See anon, that's how you get answers on this board. Don't ask to be spoonfed, just post an opinion that will bait the responses you want.
>>101758409
sterile as fuck. It's good if you're a regular user, but shit if you want creative writing/roleplaying. Even then, largestral is better at coding, so it's lacking for productive cases as well.
>>101758751
>Don't ask questions, just consume LLMs and then get excited for the next LLMs
>>101758751
This but unironically
>>101758750
I posted about this last night. Same exact setup and same exact problem. I hypothesized it was related to context size, as I couldn't modify it via the OpenRouter API on Kobold, but idk what front end you're using.
>>101758846
I'm using SillyTavern, let me mess around with context size.
>>101758785
L3 405B needs higher temperatures to get sovl out of it. If you aren't running at least 1.0 you're wasting your compute.
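What temperature does mechanically: logits get divided by T before the softmax, so T > 1 flattens the distribution and low-probability tokens get sampled more often. A toy sketch with made-up logits, no real model involved:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/T before softmax; higher T flattens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 2.0, 1.0, 0.5]  # toy values, not from a real model
cold = softmax_with_temperature(logits, 0.5)
hot = softmax_with_temperature(logits, 1.0)
print(f"top-token prob at T=0.5: {cold[0]:.3f}, at T=1.0: {hot[0]:.3f}")
```

At T=0.5 the top token dominates almost completely; at T=1.0 the tail tokens keep enough probability mass that the output actually varies between rerolls.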
>tfw people are talking about using 405b
>me, stuck at 70b
>>101759055
Anon, those of us talking about 405B aren't running it on our own computers
>>101759055
And I'm stuck at 30b models running on RAM
>>101759055
405B can be used for free by trialscumming together.ai
>>101759055
>>101759073
12B checking in (T-T)b
>>101757635
with flux and the latest generation of LLMs, people are spending less time posting and more time actually gacha rolling AI content
>>101759099
Don't they have the worst prices, low context, insanely quanted models, and a shit free trial now though?
>>101755281
>>101755355
>>101755454
>>101749539
I can finally coom to a 2B model
>>101759133
>free
>prices
can you rephrase your question in a less schizo way that doesn't make it seem like you don't understand what "free" means?
>>101759211
I mean sure, if you want to create a new burner every time you want to get three generations of quantslop rather than spending $3 for better options on OpenRouter, that's your prerogative
>>101759275
I do use openrouter personally, but I'm assuming the frogposter was some kind of poor if he was upset about not having access to 405B
last time I trialscummed Together they gave $25 credit on signup
ZLUDA has been taken down.
https://github.com/vosen/ZLUDA
Why is AMD so retarded?
>>101759347
The CEOs of AMD and Nvidia are related. That should be all the information you need to piece together what the scam is.
>>101759133
>worst prices
They don't have the worst prices.
>low context
It seems.
>insanely quanted models
They have INT4, FP8, and FP16. The livebench benchmark was done with FP8, and 405B is the 3rd best model.
>and a shit free trial
I think it's $5, but I still have my $25 from a few months ago.
>>101759434
They do though. Input price dominates, and it's $4.50 on Together, which is the highest out of all providers, on top of a whopping 4k context.
There's truly no reason to use them if you can avoid it
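The arithmetic on that input price is trivial to check; a sketch assuming the $4.50 per million input tokens figure quoted above (prices change, so plug in current ones):

```python
def prompt_cost_usd(input_tokens: int, usd_per_million: float) -> float:
    """Cost of a single request's input tokens at a per-million-token price."""
    return input_tokens / 1_000_000 * usd_per_million

# one full 4k-context prompt at the quoted together.ai input price (assumed figure)
print(f"${prompt_cost_usd(4000, 4.50):.4f} per full-context request")
```

About two cents per full-context request, which adds up fast in a long RP where the whole context is resent every turn.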
>>101759474
>which is highest out of all providers
False. They also have the best throughput among the ones on OpenRouter.
>>101759434
One thing I forgot to mention: the FP16 models are reference models. Those will double in price after August 31 to encourage the quants.
>>101757682
It's good, but they need to stop being fags and make either the v2 magnum 72b, or do an l3.1 70b magnum
>>101759515
On OpenRouter, you tard.
Also, there's your 2 token/second throughput.
The service is shit. Get money and use a better provider.