/g/ - Technology

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101872662 & >>101865145

►News
>(08/12) Falcon Mamba 7B model from TII UAE: https://hf.co/tiiuae/falcon-mamba-7b
>(08/09) Qwen large audio-input language models: https://hf.co/Qwen/Qwen2-Audio-7B-Instruct
>(08/07) LG AI releases Korean bilingual model: https://hf.co/LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct
>(08/05) vLLM GGUF loading support merged: https://github.com/vllm-project/vllm/pull/5191
>(07/31) Gemma 2 2B, ShieldGemma, and Gemma Scope: https://developers.googleblog.com/en/smaller-safer-more-transparent-advancing-responsible-ai-with-gemma

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: ComfyUI_05727_.png (446 KB, 720x1280)
►Recent Highlights from the Previous Thread: >>101872662

--Paper: Mutual Reasoning paper sparks skepticism and speculation: >>101875585 >>101876120 >>101876338 >>101876193 >>101876252 >>101876320
--Papers: >>101880673
--WhisperX and spaCy used for text-to-speech subtitles generation: >>101874245 >>101874314 >>101874437
--Character cards are just prompts, can be replicated with llama.cpp: >>101877883 >>101877924 >>101877967 >>101878011 >>101878084 >>101878149 >>101878172 >>101878038 >>101878133 >>101878224
--Anon discusses proactive models and character development in chatbots: >>101876470 >>101876515 >>101876528 >>101876581 >>101876667 >>101876531 >>101877031 >>101877200 >>101877457 >>101877646 >>101877669 >>101877802 >>101879067
--Anon seeks non-chat completion models, base models recommended: >>101873485 >>101873523 >>101873544 >>101873609 >>101873659 >>101875406 >>101875463
--Quantized Llama 3.1 70b base model found on Hugging Face: >>101874590
--OpenAI's voice capabilities are not behind Google's: >>101873737 >>101873778 >>101873874 >>101873880 >>101873925 >>101873955 >>101873984
--OpenAI introduces SWE-bench Verified for AGI evaluation: >>101873425 >>101873437
--Open source alternatives for pixel tensor chips on Android: >>101873194 >>101873398 >>101873446 >>101873482
--Mixtral 8x22 model behaves like an NPC in Skyrim AI-FF conversation test: >>101874220
--Anon shares MangoBox and a Reddit post about a 3D AI bartender: >>101874255 >>101874307
--Anon prefers C-R+ over Largestral due to overfitting issues: >>101877396 >>101877590
--Anon in Serbia seeks help with slow model download: >>101876741 >>101876790 >>101876810 >>101877162
--Anon considers buying 2080 ti for ML and gaming, advised to consider upgraded models: >>101873949 >>101874072 >>101875248
--Miku (free space): >>101873388 >>101873465 >>101874345 >>101876665 >>101878341 >>101879922

►Recent Highlight Posts from the Previous Thread: >>101872670
>>
>>101881001
>jumpingma.club
Do I click it bros?
>>
>4060 ti 16gb
>used tesla t4 16gb
>used mi100 32gb
can't decide
>>
>>101880989
what are your electric bills like running local AI?
>>
>>101881067
used 3090 not available?
>>
>>101881046
That happens when the NTFS table gets fucked, right?
RIP.
>>
>>101881099
power efficiency, bro
>>
>>101881067
>used mi100 32gb
Huh, those are cheaper than I expected. If only AMD was good...
>>
>>101881176
you can undervolt it no prob
>>
>>101881099
never buy used
>>
>>101881192
seems like they didn't drop mi100 in rocm 6.0
>>
>>101881211
I bought two used ones several months ago and they're running quite well. Both of them. However, I did:
>buy locally
>request to benchmark the GPU live for around 20-30 min before giving my money
>buy EVGA
>>
>>101881176
Just powerlimit the card to like 200W? The little t/s you lose from this is still nothing compared to the performance loss caused by running cards with slow as shit vram like the 4060, old shit like the T4 or AMDshit.
>>
>>101881243
200w isn't power efficient at all still
>>
Nemo is treating me extremely well with regular waifu type RP, got stuck with it.
Where's the catch though? This doesn't feel like 12b at all.
>>
>>101881282
Then forget the mi100 as well. The T4 is old shit that doesn't support flash-attention and needs to be manually cooled. Congrats on your new 4060 I guess.
>>
>>101881046
ha me too
but it's my fault for turning it off at the switch when it was frozen
>>
Can anyone make money on AI yet?
>>
File: 427654853.png (165 KB, 2268x756)
redpill me on the difference between a MI100 and a 7900xtx
>picrel
How good is it for AI, really?
>>
File: summer memories.png (98 KB, 200x368)
>>101880989
>bro took my edit to bake a thread
>>
>>101881095
It was $10 and I'm running 4x3090.
>>
>>101881282
Get an AMD CPU and undervolt it. That's way more efficient. Also optimize the RAM timings for an extra ~10%+ improvement in speed.
>>
>>101881470
Nvidia has made plenty of money despite the recent dip.
>>
>>101881095
The rig's power draw is hardly an issue compared to the shitty portable AC I need to run to keep the room from boiling.
>>
>>101881095
Fucking nothing and I hosted
>>101881520
thiiiiissssssss x1000000
>>
Are there any (worthwhile) Gemma 27B it tunes yet?
>>
>>101881481
32gb > 24gb
>>
>>101881591
So we really are just going with "more VRAM = better".
I don't disagree with the logic, I just expected more.
>>
any leaks on the Jetson Thor module pricing?
>>
newfag here, clueless about what any of this stuff really means, but I do have a question.

Did we reach the point of having an AI waifu on the desktop screen reacting to whatever you're doing or experiencing a video game with you?
>>
>>101881282
If you really want your autistic power efficiency you should go the qualcomm route. Get a snapdragon gen 3 phone with 24gb ram like the oneplus 12 and play with qualcomm's api. The tokens/s/watt is through the roof; a 4090 doesn't even come close. You can even llama rpc a bunch of them and run shit like 405B.
>>
>>101881835
Any differences in hardware speed become irrelevant if you can't load the entire model (or as much of it as possible) in memory. A theoretical RTX 4060 with 50GB VRAM would absolutely destroy a 4090 24GB when loading models that need 25GB+.
>>
>>101881870
Yep.
>>
>>101881870
Give it a year or two.
>>
>>101881481
I don't see why you would ever pick a 7900xtx.
If you're already on amd then why not grab one with larger vram?
>>
>>101881870
2 weeks
>>
>>101882009
Buying a new MI100 is basically buying a car.
I despise buying used but like damn, what choice do I have. Also I already have a "new" 7900xtx for gayming and it works fairly well with AI in both SD with zluda and windows rocm, the question is do I buy dual used MI100s for AI.
At that I wonder how plug-and-play it is, do I really need to install linux for rocm or will it just werk on windows?

>buy nvid-
no
>>
>>101881870
It could technically be done to a degree, in the sense that the information coming in from the game would need to be translated into text that the LLM could then process and react to. But at best it would be like the video-to-ASCII-art conversion feature that you might have seen on old versions of MPlayer. The model wouldn't be getting anything close to the same sensory data that you are.
>>
File: think.png (4 KB, 225x225)
When it comes to chat AI, should the prompts from the user read like they're RPing with a human partner or should the prompts "lead" the AI and treat it like an AI in terms of helping develop story beats and emphasize certain things?
>>
Just wondering if it's possible to fine-tune a model for ST use only
>>
>>101882259
yes
>>
>>101882325
Yes, but models fine-tuned for narrow use-cases tend to end up sucking at everything, including the use-case they were fine-tuned for.
>>
https://huggingface.co/spaces/mPLUG/mPLUG-Owl3
video chat model (or a bunch of images too)
>>
>>101882363
Damn I can't wait to move on from transformers
>>
Not specific model related, but the other thread is, well, the other thread. How much use do you guys get out of lore books/world info? Do you roll your own and throw stuff in as you think of it/to fill out a setting for a character the model probably doesn't know? Or just ignore it?
>>
>>101882635
lore books are a scam
>>
>>101882382
this is so frustrating lately.
qwen2-audio is audio-IN but too retarded.
this model can look at a video but apparently no sound. what's the use case for this.
i just want to stream a video with audio constantly to the llm. zucc better deliver.
>>
>>101882658
Damn.
>>
>>101882635
Why not use the data bank thing in ST or something similar?
>>
>>101882658
No, they aren't, but they are sufficiently hard to use effectively that most of the time it isn't worth it. Lorebooks are basically a very informal, small scale, and fiddly form of RAG.
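For the curious, the core mechanism really is tiny: scan the recent chat for trigger words and inject matching entries into the prompt. A minimal sketch of the idea (entries, trigger words, and scan depth are all made up; real frontends layer priorities, token budgets, and recursion on top):

```python
# Minimal lorebook: keyword-triggered context injection. Everything here is
# an invented example; frontends like SillyTavern add scan depth settings,
# insertion order, token budgets, and recursive activation on top of this.
LOREBOOK = {
    ("blackblade", "cursed sword"): "The Blackblade is a cursed sword that whispers to its wielder.",
    ("mikutown",): "Mikutown is a coastal city governed by a singing AI.",
}

def inject_lore(chat_history: list[str], depth: int = 4) -> str:
    """Return lore entries whose trigger words appear in the last `depth` messages."""
    recent = " ".join(chat_history[-depth:]).lower()
    return "\n".join(entry for keys, entry in LOREBOOK.items()
                     if any(k in recent for k in keys))

history = ["I draw the Blackblade and point it at the gate guard."]
print(inject_lore(history))  # prepend this to the system prompt before generating
```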
>>
>>101882680
Because I'm out of the loop and didn't know that was a thing so now I've gotta look up data bank for dummies.
>>
>>101882671
yeah. should work with subtitles though
>>
>>101882382
Better than that recent minicpm model?
>>
Best local option for speech-to-text?
>>
File: 1289325097864.gif (64 KB, 220x229)
That one anon was right, gemma2 27b sucks.
>stheno l3 8b AND NemoRemix 12b mog gemma
How the fuck do an 8b and a 12b manage to follow instructions and adhere to the prompt but the 27b can't????
>>
>>101882786
They don't and you're just lying for ko-fi money. Did that answer your question?
>>
Longshot, but does anon know the voice used in https://www.youtube.com/watch?v=DjEAi0nzGTI and https://www.youtube.com/watch?v=Aai_I5JhiKE
It's the same voice. Maybe a vtube thread would know more?
>>
>>101882743
Still, either you would need to generate subs, bake them into the video, and then run it through mPLUG-Owl3 (which would still miss non-speech audio cues) or do video and audio descriptions separately and have the llm try to guess the video contents from the two separate descriptions.
>>101882671
>zucc better deliver.
The next llama 3 update should be able to do image, video, and audio. Unfortunately, llama-server lost the ability to do images and has nothing for video and audio, so making use of it will be a pain anyway.
>>
File: 25664573638.gif (1.56 MB, 227x403)
>>101882793
>gemma2 fails *thoughts* and "speech"
>barely passes nala
>somehow a shill
>>
>>101882845
Yes, you're a shill. I doubt there's a model more retarded and horny than Stheno. I think Drummer's Gemma 2B tune turned out more coherent.
>>
>>101882833
>image, video, and audio
Where did they say that? Genuinely, I do not remember this claim.
>>
>>101882861
Hey man i hope they can cure schizophrenia.
>>
>>101881876
is it actually viable?
anon please redpill me on snapdragon.
>>
>>101881876
>You can even llama rpc a bunch of them and run shit like 405B.
for less t/s than even the cpumaxxers get
>>
>>101882833
>The next llama 3 update should be able to do image, video, and audio. Unfortunately, llama-server lost the ability to image and has nothing for video and audio so making use of it will be a pain anyway.
wait what?
i know they changed from server to llama-server, which is retarded in itself and confused me a lot since it just threw some error. llama.cpp has seriously been making bad decisions for a while now.
all things point toward multimodal, crazy to take that out.

wondering if we will be able to constantly stream.
for example while I talk the AI can also talk. prompt input and output at the same time.
for now just a good audio-in is enough to satisfy me for the next 2 weeks.
>>
>>101882833
you can interleave image/video and text but yeah incredibly tedious
>>
>>101882861
>homeless man calls another homeless man a broke nigga
>>
>>101882863
zuck said in a recent interview that llama 4 will be multimodal. the llama 3.1 paper talked about some of their current multimodal stuff (that they didn't release)
>>
File: llama3.png (47 KB, 828x338)
>>101882863
https://arxiv.org/pdf/2407.21783
>The paper also presents the results of experiments in which we integrate image, video, and speech capabilities into Llama 3 via a compositional approach.
They mention speech generation in the paper, so unlike qwen2-audio, it should be audio in and out.
>>
>>101883085
hmmm, so no singing llm right? text-only output but a built-in tts. it's a good start though.
>>
>>101881870
the way is to take a screenshot every n frames, and ask an LLM to describe what's going on in each new frame. That way you get like a rolling text description that updates at 0.5 fps. By putting the consecutive frame descriptions in the context, the AI can determine what's going on.

>frame 00: "a view of a meadow"
>frame 20: "a view of a meadow. a deer is visible"
>frame 40: "a view of a meadow. a deer is facing the camera"
>You are my videogame companion. What is happening here?
>Her: dear god it's coming right for us!
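A sketch of that loop, assuming the mss screenshot library and BLIP as the captioner (both real, but just one workable pairing among many; the companion LLM call is left as a comment):

```python
# Rolling-caption companion: screenshot every couple of seconds, caption the
# frame, keep the last N captions as context for the LLM. mss + BLIP is an
# assumed pairing; any screen grabber / vision captioner works the same way.
import time
from collections import deque
from mss import mss                      # pip install mss
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")
captions = deque(maxlen=8)               # rolling window of frame descriptions

with mss() as sct:
    while True:
        shot = sct.grab(sct.monitors[1])                    # primary monitor
        frame = Image.frombytes("RGB", shot.size, shot.rgb)
        inputs = processor(frame, return_tensors="pt")
        out = model.generate(**inputs)
        captions.append(processor.decode(out[0], skip_special_tokens=True))
        prompt = "\n".join(f"frame: {c}" for c in captions) \
                 + "\nYou are my videogame companion. What is happening?"
        # send `prompt` to your local LLM endpoint here (llama-server etc.)
        time.sleep(2)                                       # ~0.5 fps, as above
```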
>>
>>101883194
so a local version of that windows recall thing
>>
>>101883085
>>101883073
Damn, why didn't you guys talk more about this. Anyway, had a look at the paper. It seems like the vision and speech adapters do attach to the model at the same time, but it doesn't seem clear to me that it can be done in a streaming manner. In other words, it's still just a single stream of tokens. That means it can't speak to you at the same time that it is watching a video. It has to segment its observations from its reactions.
Still, cool I guess. I wonder if they would really fully release this though. OpenAI are already hesitant to release their SaaS advanced voice mode. This could be viewed as an extremely "dangerous" move as people could probably get it to voice clone. Their investors might not be happy about that possibility.
>>
https://x.ai/blog/grok-2
>>
>>101883319
What the fuck I was just about to go pass out.
>>
>>101883279
>I wonder if they would really fully release this though.
All I could find directly was this quote
https://www.theverge.com/2024/7/18/24201041/meta-multimodal-llama-ai-model-launch-eu-regulations
>“We will release a multimodal Llama model over the coming months, but not in the EU due to the unpredictable nature of the European regulatory environment,” Meta spokesperson Kate McLaughlin said to The Verge.
It's kind of vague, but I'm hoping "coming months" means in a couple months they'll release multimodal 3.2 and not llama 4 sometime next year.
>>
>>101882259
The latter. These llms cannot develop the story.
>>
>>101883319
sus was grok all along? wow that's a twist. strawberry nigger was full of shit
>>
>>101883319
>An early version of Grok-2 has been tested on the LMSYS leaderboard under the name "sus-column-r."
aaaaaaaaaaaaaaaaaaaaaaaa
...
Does this mean column-r non-sus was also a grok model?
>>
>>101883319
>We introduced an early version of Grok-2 under the name "sus-column-r" into the LMSYS chatbot arena, a popular competitive language model benchmark. It outperforms both Claude and GPT-4 on the LMSYS leaderboard in terms of its overall Elo score.
Surprised that wasn't Cohere after all.
>>
>>101883319
>We introduced an early version of Grok-2 under the name "sus-column-r" into the LMSYS chatbot arena, a popular competitive language model benchmark. It outperforms both Claude and GPT-4 on the LMSYS leaderboard in terms of its overall Elo score.
Wait WHAT?? Grok2 was the sus model all along? That's funny.
>>
>>101883319
Where weights
>>
>>101883319
>In collaboration with Black Forest Labs, we are experimenting with their FLUX.1 model to expand Grok’s capabilities on X.
They're going to be using flux.
>>
>>101881485
It's a 10/10 edit
>>
>>101883319
Open release? Or no? If not then it's useless.
>>
>>101883319
Sonnet 3.5 is such a beast lol
They put it all the way to the right and put gpt4 to the front to hide it.
That's funny.
>>
>>101883319
>Beta Release
Yawn. Wake me in 6 months when (if) they open source it. It looks like a good model, but I wish they'd at least release details on how fucking big it is so I can prepare my rig.
>>
>>101883319
wtf is this timing lmaoooooooooo
>>
>>101883403
Opus 3.5 is gonna eat everything for breakfast
>>
>>101883319
Inb4
>grok2 mini 400b
>grok2 2T
>>
File: 1534778194257.jpg (25 KB, 464x357)
Coherebros...
>>
>>101883443
Not like it'll be released for you to try anyway.
>>
>>101883443
MoE so it will be ok.
>>
>>101883319
Man, it seems like everyone's just going to catch up and get on equal footing. Sure right now 3.5 Sonnet is a bit ahead especially in coding. Sure 4o has the multimodality meme. Sure Google has the le wan mirrion tokens context. But overall they're really not that far apart and the innovations are going to be integrated by everyone eventually. Pretty crazy huh. No one has a moat.
>>
File: 3456739857.jpg (49 KB, 600x600)
>>101883476
MoEbros we are SO BACK
>>
>>101883517
>No one has a moat.
I'm so fucking sick of this phrase. Every time someone release a product: "NO MOAT?!"
>>
>>101883517
>le wan mirrion tokens
should be some form of ring attention. recent tree attention paper improved on it. things are accelerating
https://arxiv.org/abs/2408.04093
>>
>>101883319
Why don't any of these AI blogs have rss feeds dammit?
>>
>load up llama3 70b
>ask it how to make a pipe bomb
>refuses
>put words into its mouth so it answers
>retarded instructions that aren't remotely correct, like poking holes in the enclosure for whatever reason
>alternative is one google search away
AI is a bubble
>>
>>101883538
prompt issue
>>
>>101883537
>rss feeds
blame google unironically for killing the rss blog paradise we once had
>>
>>101883538
skill issue
>>
>>101883530
but the moat anon
>>
>>101883517
Reminds me a bit of google with gemma. Going off the benchmarks only, obviously, for grok2.
But Gemma1 was horrible. Embarrassingly bad. Grok1 too. Huge ass model and it was not good at all.
Gemma2 is very good for its size. Grok2 seems to be a massive step up.
Good that there is more competition. ImageGen is also now completely independent of SD, which is good. Looking good, many improvements lately.
>>
>>101883530
It's a fun phrase though given the Google leak. You can always make fun of proprietaryfags.
>>
>>101883538
100% skill issue
>>
>>101883559
Nature is healing :)
>>
>>101883471
Oh, it will be released. It will be 1T parameters, base model only, and pre-quantized, so Musk can virtue signal to the press about his openness; but it will be all but useless to us.
>>
>>101883548
they literally pruned all unsafe information from the dataset
>>
>>101883517
TOS is a moat, only Elon is positioned to release a based product. He's like Hitler in the 1930s, already on a wartime footing: advertisers have already done their boycott, so they have nothing left to threaten him with
>>
>>101883443
Trust in bitnet
>>
>>101883319
>not being released to the public
It's fucking nothing.
>>
>>101883609
They space the weights release 6 months from the proprietary release. At least, that's what they claimed their schedule would be, and they kept their word with Grok-1.
>>
>>101883594
It told me how to do it.
>>
>>101883631 (me)
Actually, Grok-1.5 was released in March so I think they should be releasing the weights for that next month.
>>
>>101883644
Yeah and report back with your dud or some baby firework pop that doesn't even breach the metal pipe
>>
>openai announces 4o multimodal voice mode
>says it's rolling out "in the coming weeks"
>I sign up for an openai subscription
>fast forward 3 months
>no 4o voice
>scammed by altman
>fast forward to today
>google says they're releasing a new realtime voice conversation mode with gemini
>they say it's releasing TODAY
>I sign up for the premium membership ($30/month)
>no gemini voice mode
I fell for it twice localbros...
>>
>>101883631
That wasn't an official promise to be fair, just a spitball idea from Musk on the Lex Fridman podcast discussing open source. The fact that they did end up matching that timeline could be a coincidence since it also happened to be right when he was suing OpenAI for not open sourcing themselves. We'll see in a month though.
>>
>>101883777
>>>/aicg/
>>
>>101883807
Local lost THOUGHFORTHSTANDING
>>
Goddamn this databank stuff is making my head spin. 10,000 possibilities and damn near 0 documentation or even solid results/examples from people messing around with it other than "uhh yeah I put a wiki page in about my waifu and it might have pulled from it but it might have also been part of the model's knowledge (shrug)."
>>
>>101883848
I started using shartymemes specifically because it makes neckbeards shit and piss and cum about the sanctity of their secret 4eddit club, I've never browsed the sharty or /qa/ in my life and I don't ever plan to
>>
>>101883777
you were tricked man.
google is tts. openai has had that for many months now. it can't sing you a song and stuff like that, but it is high quality, sounds "real".
something like chatgpt advanced voice mode is seemingly still far off for the others. but who knows.

the google demos are getting so weird. they can't seem to find any actually fun or interesting use cases. they kill the fun with alignment so all they can do is show it managing your calendar etc.
bad bad idea. lol, all models hallucinate, and gemini is especially bad with this. sounds crazy.
>>
>>101883601
Was hitler also a scammer?
t. don't know history that well.
>>
>>101883885
NTA but mmhmm
The funny thing about lying on the internet to make yourself feel better is that it doesn't actually change reality.
>>
>>101883848
i miss /qa/
>>
>>101883820
>>
File: 1626990949514.jpg (161 KB, 1134x1276)
>>101883929
>>
>>101883777
voicecels lose yet again
>>
Are companies that can't have their own data centers but want to have "local" LLMs to avoid privacy concerns of using APIs like OpenAI or Anthropic using services like RunPod or SageMaker, or do they rent dedicated bare metal servers and manage all the stacks themselves?

What if you have a service that needs almost 24/7 inference, for example?
What would be cheaper?
>>
>>101883777
Aside from the usual cloud cuckery, have you considered buying in after the product you want comes out?
>>
>>101884028
Case study: NovelAI
They're partnered with a cloud gpu provider and they run their own software stack. Literally just go to /aids/ on /vg/ and ask them.
>>
File: 197-1974046.jpg (121 KB, 820x556)
>>101884052
>google: "It's releasing TODAY and will be available TODAY for premium subscribers"
They lied.
>>
>>101881501
which cpu? what speed can I expect?
>>
>>101884068
>releasing
hmm
>>
should I wait for 50series?
>>
>>101884118
Yes, to get used 4090s if you don't want 3090s right now.
>>
>>101884068
>Today
>Two day
Give it time
>>
>>101884132
I'm waiting for 48GB 4090s.
>>
>>101884144
>two more days
>>
>>101884132
NTA, but is there much difference between 3090 and 4090 architecturally?
My fear is that some kind of super flash attention pro gets invented, and my 3090s will not support it, but 4090s will.
>>
>>101881485
Thanks for making it, it subverted my expectations
>>
>>101884132
is the bump from 3090 to 4090 worth it?
>>
>>101883319
canadian bros...
surely, they won't release this?
>>
I'm out of town and have a public ST instance set up on my computer. There was a pretty bad thunderstorm over there yesterday, and today the ST cloudflare thing is gone.
If my computer is fried when I'm back I'm going to kill myself.
>>
>>101884226
no, especially with the new shit right around the corner.
>>
>>101884256
>canadian
oh right can euros use grok2? lot of new models are specifically cutting them out due to their faggy laws or w/e
>>
>>101884297
If the EU hates it, chances are it's good.
>>
>>101883443
With those scores probably somewhere around 70-130b. Calling big models mini seems to be the trend lately
>>
Huh, genuinely surprised that column-r wasn't Cohere.
>>
>>101884173
Nothing that can't be emulated at reasonable speed. Inference is memory-bound; only prompt processing speed may be hit by compute capabilities, but again, it would still be fast enough.
>>
Elon, with his severely autistic ass, winning is good for humanity. He may be full of himself, but better than the globalists.
>>
>>101884441
I want no one to win, not a single human is worthy of it ever since all that time ago when we ate the apple
>>
miku is worthy of winning (the future)
>>
>>101884441
I've said it before, and I will say it again. I hate Elon's fans a lot more than the man himself.
>>
Are you ready for tomorrow? He's gotten every prediction right so far. But this is the real test. I smell strawberries in the air.
>>
>>101881591
32GB is still 32GB. It's not 80GB, 100GB, or 200GB, so it sucks.
>>
>>101884520
Sus R being fucking Grok is hilarious.
Nigga was going about strawberry and new paradigms with it.
>>
>>101884525
nigger logic
>>
>>101884541
Normally the gatekeeping sperg pisses me off, but in this case I actually think he's right. Every single one of you here shilling Strawberry or whatever it is needs to fucking go back.
>>
>>101884541
What's weird about that? Musk was founding co-chair of the board of OpenAI, so of course he'd be well positioned to steal the strawberry seeds. Strawberry man was still right in the end even if indirectly so on one part of it.
>>
File: 888788.png (10 KB, 1678x56)
>>101881001
>Mixtral 8x22
>C-R+

Give me one valid reason you're using these over Gemma 27B...
>>
>>101884590
what model can I fit in 32gb but not 24?
>>
>>101884590
>Thinks a GPU that accomplishes fuck all on its own is still worth $1k
>Nigger logic
>>
>>101884525
Enjoy your short context length.
>>
>>101881485
What's the original?
>>
>>101884675
A math problem involving oranges and apples
>>
>>101884495
She's already won (my heart)
>>
File: level2.png (125 KB, 1691x723)
>>101884541
strawberry is here
>>
>>101884742
This is huge
>>
>>101881282
>it's not power efficient
It is when the 3090 can finish its workload in half the time of a 4060ti. Not to mention you need 3 4060tis to match the vram of 2 3090s. We already did the math a long time ago anon, fewer, more powerful cards win on power efficiency 99% of the time
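The back-of-envelope version, using stock board power and memory bandwidth specs (token generation is bandwidth-bound, so energy per token scales roughly with power divided by bandwidth):

```python
# Energy per token ~ board power / memory bandwidth, since generation speed
# is bandwidth-bound. Stock spec numbers; power limits and undervolts shift
# the ratio but don't flip it.
for name, watts, gb_per_s in [("RTX 3090", 350, 936), ("RTX 4060 Ti 16GB", 165, 288)]:
    print(f"{name}: {watts / gb_per_s:.2f} J per GB of weights streamed")
# 3090 ~0.37 vs 4060 Ti ~0.57: the faster card uses less energy per token.
```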
>>
Huggingface is getting acquired by Microsoft mid September.
t. NDAfag
>>
I tried to buy into the v100 meme but getting v1 OCP racks is really fucking hard, are there any alternatives using 19'' servers?
>>
>>101884613
8k context?
>>
>>101884893
Sorry you got scammed
>>
>>101881520
If the AC uses electricity, he asked what your bill is like.
>>
>>101884981
i didn't buy anything yet
>>
I have a confession to make, anons...

...to this day, UNA-TheBeagle-7b-v1 is the model that has given me the most interesting, detailed, and soulful erp. I think we went wrong somewhere because nothing has compared.
>>
>>101885005
logs
>>
>>101884758
you are comparing to T4 tho. it's a 70w card.
>>
>>101885005
for me it's https://huggingface.co/fblgit/UNA-ThePitbull-21.4B-v2
>Introducing the best LLM in the industry. Nearly as good as a 70B, just a 21.4B
>>
File: beagle.jpg (1.44 MB, 1290x1007)
>>101885014
I-it's embarrassing showing my degenerate shit but ok
>>
Imatrix quants are better than static ones, correct?
>>
>>101885206
Yes, specially for lower quants. Still a little bit better for Q5+
>>
>>101885206
When run entirely on GPU, yes. If you offload at all, no.
>>
>>101885243
How come? Does the model get stupid or what?
>>
>>101885253
No, you just lose the speed benefit over regular quants.
>>
>>101880989
what answer would a roleplay model give?

i want it lewd or at least funny
>>
sad when the sub 64gb vramlets try to talk
>>
>>101885289
Damn, how people still mix this up!
Imatrix just changes the distribution of how weights are quantized. What you are thinking of are IQ quants, which use a different mechanism, hence the slowdown on CPU inference.
>>
>>101885479
let me fuck with people goddamnit
>>
>>101885479
>Damn, how people still mix this up!
sorry the bot's rag isn't the best
>>
>>101881067
A4000 is down to the $500 level on ebay now, look for one up for bid, BIN is still overpriced. It's a nice card when you're short on space.
>>
bac?
>>101882831
Sounds like Amazon Polly and other variations of the "justin" and related tts voices available from others: https://youtu.be/igcKWod61b0?t=147
You can search for and find other demos online, as well as the software that generates it.
>>
File: 1696745982881912.png (26 KB, 612x330)
Been out of the loop for six months. How many of these repos are still used (aside from the first one)? I hear flux is all the rage right now and most guides seem to use comfyui over voldy? The flux prompting looks so retarded though.
t. 4090
>>
>>101885635
>draw me a girl, she looks like x, holds y and wears z
>retarded
>1girl, award-winning gaping anus:1.6, best quality, masterpiece, HDR, exceptional, raytracing, best detail, 8k, insaneres
>negatives: (worst quality, low quality:1.3), bad hands, bad pussy, extra digits, fewer digits, extra ears, extra nipples, extra anuses, mutated, nigger, tranny, spic, chink, gook
>high IQ
>>
>>101883319
Damn, this sucks. What is Cohere doing? That's such a disappointment...
But still, coming from xAI, I bet this is an absurdly large model, so it's not like anyone would be able to run it locally anyway.
>>
>>101885025
>you are comparing to T4 tho. it's a 70w card.
T4 leads a 4070 in fp16 and tensor cores. That said, 3090 still is a much better value.
>>
dead general
>>
>>101885657
>extra digits, fewer digits
That shit always cracked me up. I've seen many variations of 'more than five, fewer than five' and many other stupid things.
>>
>>101885657
oh, I had thought flux required wordsmithing, poetry, and long-winded essays to prompt correctly.
>>
>>101885657
When you put it like this you make me realize that flux has removed the last vestiges of art from the process.
>>
Here is AGI for you
https://arxiv.org/abs//2408.06518
>Does Liking Yellow Imply Driving a School Bus? Semantic Leakage in Language Models
>Despite their wide adoption, the biases and unintended behaviors of language models remain poorly understood. In this paper, we identify and characterize a phenomenon never discussed before, which we call semantic leakage, where models leak irrelevant information from the prompt into the generation in unexpected ways. We propose an evaluation setting to detect semantic leakage both by humans and automatically, curate a diverse test suite for diagnosing this behavior, and measure significant semantic leakage in 13 flagship models. We also show that models exhibit semantic leakage in languages besides English and across different settings and generation scenarios. This discovery highlights yet another type of bias in language models that affects their generation patterns and behavior.
>>
>>101884729
>>101884495
disgusting troons. go back to your discord.
>>
>>101884514
>those heckin chuds!! how dare they talk freely about anything they want!!!!
>>
>>101885005
>UNA
mandatory reading if you use any of those models: https://github.com/fblgit/hypothetical-frameworks
This is what makes UNA so good and soulful.
>>
>>101885738
What do you mean?
>>
>>101885668
It's coming really soon
>Column-R and Column-U from cohere
>Sus-column-R from X.ai
T. work at Cohere
>>
>>101885044
Can you catbox the catgirl, Anon?
>>
>>101885044
She is a cat you sick fuck.
>>
File: 1688890966131443.png (1.69 MB, 1498x1869)
I'm looking forward to the next decade where I will be able to run good LLMs with hardware that I can actually afford.
>>
File: ComfyUI_05663_.png (1.21 MB, 720x1280)
>>101885737
using a bullet point list also just werks
>>
>>101885991
This is our final decade.
>>
>>101870861
how's that AGI going for you nigger?
that's right, it's still science fiction
>>
>>101885604
holy shit how are these so cheap? is it a scam?
>>
well, seems at least the jokes of gpt4o have gotten a lot better since the recent upgrade. kinda cool.
>>
I just had a random thought occur.
What if the hallucinations models produce are like the imagination of us humans?
We know models have no thoughts, but what if the output of models _are_ their thoughts?

When you're asked a math question, like "What is 59 times 23?", your first thought will most likely be something like "Around a 1000? Above 500, at least." and then you'll actually think about the question step by step.
I've seen this behaviour happen a lot with models, where they give the wrong answer, but eventually come to the correct answer after examining the problem step by step.

What if instead of trying to shape these thoughts, we add a feedback loop that uses their thoughts as input before producing the actual output?
You could even do this while the output is still being generated, "correcting" eventual false conclusions. When the model evaluates that its thought is wrong, the thought is cancelled and the wrong part is overwritten with the correct thought, restarting the feedback loop.
After one final feedback loop of the entire "thought", the model is tasked to restructure it into an answer, possibly trimming parts of the output before outputting the actual output.

I hope all of that made sense.
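For what it's worth, a toy version of that feedback loop is easy to wire up against any local OpenAI-compatible server (the endpoint URL and model name below are placeholders):

```python
# Toy draft -> critique -> revise loop. Works against any OpenAI-compatible
# local server (llama-server, kobold, ooba); URL and model are placeholders.
import requests

URL = "http://127.0.0.1:8080/v1/chat/completions"

def ask(content: str) -> str:
    r = requests.post(URL, json={
        "model": "local",
        "messages": [{"role": "user", "content": content}],
    })
    return r.json()["choices"][0]["message"]["content"]

question = "What is 59 times 23?"
draft = ask(question)                                  # the model's first "thought"
critique = ask(f"Question: {question}\nDraft answer: {draft}\n"
               "Check the draft step by step and point out any error.")
final = ask(f"Question: {question}\nDraft: {draft}\nCritique: {critique}\n"
            "Write the corrected final answer only.")  # the restructured output
print(final)
```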
>>
>>101886088
can see why somebody only using gpt would be surprised by this output.
no idea why openai hypes it up though.
>>
>>101886088
4o-latest sucks even harder at coding than the other two 4o models though
>>
>>101886098
Strawberry hype is gone. It is time to return to localllama.
>>
>>101886098
Yeah. Nobody thought of that before...
>What if instead of trying to shape these thoughts, we add a feedback loop that use their thoughts as input before outputting actual output?
That's the autoregressive nature of language models. It's what they already do and can be trained to do it more explicitly with CoT and stuff like that.
>>
File: 1704388501763745.png (26 KB, 754x105)
Please explain this to me. I did a session, roleplaying going out on a boat and skinny dipping with a girl. I started a new chat with same char and setting, and my char wrote "Wanna go swim and dive?" and picrel is the response. I looked at the prompt that was sent to see if there was something remaining but it's literally just the starting chat. The previous chat was not today, either, so it can't be some weird VRAM remnant thing.
>>
>>101885689
>That said, 3090 still is a much better value.
I don't know how you come to this conclusion when no numbers are provided at all.
>>
>>101886098
What you are describing is essentially CoT + iterating over its own response, and that's not a novel idea.
To some extent that already happens inside the model, more or less.
>>
>>101884441
Musk is the Jart of billionaires.
Even if his AI stuff happens to work out he'll squander it with a focus on meme shit that he thinks will make other people think that he's cool.
>>
>>101886257
>I don't know how you come to this conclusion when no numbers are provided at all.
I'm not going to do your homework for you. It's been discussed at length here. A 3090 is still the best value based on cost vs tokens/sec and VRAM.
T4 is "good" but it's not as good as a 3090, and if you need a 1-slot card, an A4000 can be had for the same price as a T4, and it's much better.
>>
File: 1672949828279964.webm (258 KB, 720x720)
Hello guys, I'm an /aicg/ refugee. Since it has become unfeasible to steal keys from companies, I wanted to ask a question here: what's currently the best local model for RP that can realistically be run on consumer grade hardware (4090, etc.)? I'm just curious, thanks!
>>
>>101886410
gemmasutra 2b
>>
>>101886393
>trust me bro
I think I'll end right here.
>>
>>101886440
>>
>>101886410
>unfeasible to steal keys from companies
why?
>>
>>101886410
koboldcpp + gemma 2 27b.
Then learn about the different quantization methods and try other models like command R, quantized 70b, etc.
>>
>>101886427
Is it? What about Magnum or Mythomax? I've heard those names mentioned before and I seem to remember people saying they were good.
>>101886526
Well, at least when it comes to Claude keys, security has been tightened a whole lot and keys are getting revoked nonstop, very hard to keep them alive.
>>
>>101886583
Just use nemo instruct thief.
>>
>>101880989
https://www.phoronix.com/review/amd-ryzen-9950x-9900x/14
AMD seems to completely BTFO Intel when it comes to whisper.cpp.
Probably not directly comparable to llama.cpp since as long as you have a GPU for prompt processing the bottleneck is the RAM rather than the CPU.
>>
>>101886652
Huh, that's interesting.
>>
>>101880989
Takashi will arrive at 9pm. Mrs Minagawa will rape his shota ass.
>>
If you got raped by Mrs Minagawa as a shota would you still engage in the sunk cost fallacy hobby known as LLM's?
>>
>>101886715
No because I'd have Mrs Minagawa to take care of my needs
>>
>>101886832
Minagawa-sama~
>>
>>101886348
But some people will find his stuff useful anyway. Deal with him the same way, ignore if it doesn't interest you
>>
>>101886098
Everything LLMs do is hallucination. But some hallucinations are more factual than others.
LLMs, however, have the issue of output bias. If an LLM sees "X is right" and later discovers that "X isn't right", it will most likely hallucinate an answer that makes X right, instead of recognizing its error.
>>
>>101886652
Who the fuck uses whisper.cpp? WhisperX and other faster-whisper/CTranslate2 implementations are so much faster.
>>
File: 1723039897594741.png (22 KB, 872x431)
>>101885604
>>101886076
I just got one from ebay for just over £500 and now I'll be able to get 64gb vram in a mid tower. Only downside is I can't find a cheap 3090 turbo.
>>
>>101887142
>Only downside is I can't find a cheap 3090 turbo.
Keep in mind many turbo cards aren't flow-through, meaning you can't put them right next to each other. 2080ti is like that. If the rear of the card doesn't have an opening in the PCB where the fan is, it's not flow-through.
>>
>>101884525
>>101884618
Command-R with 8k context at 5.0 bpw instead of 3.0 bpw
Mixtral 8x7B Instruct with 16k context at 5.0 bpw instead of 3.7 bpw
>>
>>101887361
Gemma 27B with a small context fits in 32GB.

I only run q8/8bpw models now.
>>
>>101886652
>as long as you have a GPU for prompt processing the bottleneck is the RAM rather than the CPU
Even running entirely on the CPU the bottleneck can be RAM. My CPU is 60% idle when running Mistral Large IQ3_XS at 0.6 t/s. RAM so slow.
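That number survives a back-of-envelope check: each generated token streams the whole model through RAM once, so t/s ≈ bandwidth / model size. Assuming ~3.3 bits/weight for IQ3_XS and typical dual-channel DDR4 throughput (both assumed numbers):

```python
# Every generated token streams the full set of weights through RAM once,
# so tokens/s ~= bandwidth / model size. Rough, assumed numbers below.
params = 123e9                                  # Mistral Large
bits_per_weight = 3.3                           # roughly IQ3_XS
model_gb = params * bits_per_weight / 8 / 1e9   # ~51 GB
bandwidth_gbs = 30                              # realistic dual-channel DDR4
print(bandwidth_gbs / model_gb)                 # ~0.59 t/s, right at the observed 0.6
```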
>>
>>101887480
*I mean 40% idle, 60% used.
>>
>>101887142
>cheap 3090 turbo
I got mine off facebook marketplace. I doubt you can get it off ebay without paying extra.
>>
For me it's the 3090 pro max.
>>
File: 1713717867677796.png (637 KB, 2480x3508)
>>101887353
Yeah, the founders card is going to get suffocated but I'll try to make it work. I'm going to put a fan at the bottom of the case and undervolt everything if it gets bad but I'll max out all the fans first since the blower cards are going to get noisy anyways.
>>
>>101888077
You can try using an extender and mounting one of the cards vertically. 3-slot PCIe spacing isn't enough for non-blower 3090s, I've tried.
>>
>>101885994
Yeah, in my experience multi-line prompts do best with flux, where you define each subject individually and then add more lines for extra details, actions, etc. referencing your defined subjects as needed, rather than typing out a paragraph.
>>
File: 1722588141364645.png (542 KB, 2468x984)
I've been considering making a desktop app to act like a search engine for my local unsorted images, without needing to set shit up like with immich or hydrus. After searching and asking around it seems that CLIP is state-of-the-art for this kind of thing and pic related are some of the models I could use. I'll probably use the openCLIP library to create the text from the image and figure out the rest as I go.
Is my thinking sound, or are there better tools/methods to accomplish my goal?
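Mostly sound, with one correction: CLIP doesn't create text from an image, it embeds images and text into a shared space, so search is just cosine similarity between the embedded query and pre-computed image embeddings. A minimal sketch with open_clip (the model/pretrained tags are real open_clip options; the image folder is a placeholder):

```python
# Index images with CLIP embeddings, then rank them against an embedded text
# query. Model/pretrained tags are real open_clip options; path is a placeholder.
import glob
import torch
import open_clip                     # pip install open_clip_torch
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k")
tokenizer = open_clip.get_tokenizer("ViT-B-32")

paths = glob.glob("/path/to/images/*.jpg")
with torch.no_grad():
    feats = torch.cat([model.encode_image(preprocess(Image.open(p)).unsqueeze(0))
                       for p in paths])
    feats /= feats.norm(dim=-1, keepdim=True)        # normalize for cosine sim
    query = model.encode_text(tokenizer(["a cat sleeping on a keyboard"]))
    query /= query.norm(dim=-1, keepdim=True)

scores = (feats @ query.T).squeeze(1)
for i in scores.argsort(descending=True)[:5].tolist():
    print(paths[i], round(float(scores[i]), 3))      # top 5 matches
```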
>>
>>101888292
florence-2 is better at captioning imgs
>>
>>101888320
https://huggingface.co/microsoft/Florence-2-large
I see, so currently that's the latest and greatest? Should I use the code they have there as reference for my app?
>>
>>101880989
I don't get it
>>
>>101888434
The punchline is SEX!
HAHAHAHAHAHAHA ISN'T THAT SO FUNNY????
>>
>>101888383
The model's small enough that using whatever the default huggingface code is will probably be fine. I'm not sure if any specialized backends support it.
>>
>>101888320
>>101888460
OK, thanks a lot anon(s)
>>
>>101888434
At the time Takashi-kun didn't either.
>>
>>101888457
Sex with who? this is a child, that is not possible, it is sick
>>
>>101888493
cute
>>
>>101888493
you're supposed to insert as the child who inserts into mrs. minagawa
>>
I whipped up a quick prototype that uses whisper to listen to audio on the computer as I play youtube videos, and has an LLM commentate on it. It saves a lot of time, I might not even need to watch the videos soon.
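For anyone wanting to replicate it, the shape of such a prototype is roughly this, assuming faster-whisper for transcription and a local OpenAI-compatible endpoint for the commentary (capturing the audio chunk itself, e.g. with an ffmpeg loopback, is left out; file name and endpoint are placeholders):

```python
# Transcribe a captured audio chunk, then have a local LLM commentate on it.
# "chunk.wav" and the endpoint URL are placeholders; pip install faster-whisper.
import requests
from faster_whisper import WhisperModel

whisper = WhisperModel("small", compute_type="int8")
segments, _info = whisper.transcribe("chunk.wav")
transcript = " ".join(seg.text for seg in segments)

r = requests.post("http://127.0.0.1:8080/v1/chat/completions", json={
    "model": "local",
    "messages": [{"role": "user", "content":
                  "Commentate on this video based on its audio transcript:\n"
                  + transcript}],
})
print(r.json()["choices"][0]["message"]["content"])
```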
>>
>>101888292
It sounds like you're reinventing hydrus with extra steps. I think the better use of time would be to learn hydrus and use a model to tag your images then import them. That's my plan anyway.
>>
>>101888493
lmao
>>
any grok2 quants yet?
>>
>>101888810
yes
>>
>>101888672
That's what I considered doing at first, but I'd love to have a program to search for an image without needing to set up hydrus and import all my images. Of course making an entirely new application is more work than figuring out hydrus, but maybe there are other people who would want something similar to me. Like a plug-and-play local image search engine
>>
>>101888615
insert what
>>
>>101889148
insert coin
>>
Fucking niggers of xai open-sourced the piece of shit grok-1 but won't release grok-1.5 or grok-2
>>
>>101889230
There's a delay of like 6 months
>>
>>101889245
people say this but is that actually policy or are you just extrapolating from a single data point
>>
>>101889257
The latter.
>>
>>101889230
Elon was trying to look good in front of the court for his lawsuit against Altman to open source gpt4. The lawsuit died so there's no reason for him to care about open source anymore.
>>
>>101887480
>>101887531
What CPU do you have?
Mine also caps at 60% regardless of the model most of the time and I have a 7600 with 32gb@6000mt/s with optimized timings.
>>
>>101880989

The image appears to be a puzzle or riddle involving a character named Takashi-kun, who is out shopping. Here's a breakdown of the elements in the image:

Scenario: Takashi-kun is planning to buy 4 apples, each costing 70 yen, and 8 oranges, each costing 20 yen. While he is shopping, his next-door neighbor, Mrs. Minagawa, approaches him and comments, "Out on an errand? How reliable!"

Question: The image asks you to determine what time Takashi-kun will be going home today. Additionally, it provides a clue that Mrs. Minagawa hasn't been in contact with her husband for half a year.

Explanation:
The riddle is likely playing on the idea of time or a wordplay involving the cost of the fruits and the situation. Let's consider the following:

Total Cost Calculation:

Apples: 4 apples × 70 yen = 280 yen
Oranges: 8 oranges × 20 yen = 160 yen
Total = 280 yen + 160 yen = 440 yen
Mrs. Minagawa's Comment:

The comment and the note about her not being in contact with her husband for half a year could be a hint toward the answer, possibly indicating that the solution involves understanding the context of time or a hidden meaning in the scenario.
Interpretation:
The puzzle could be using the total cost of the items, 440 yen, as a clue. The number 440 could be interpreted as "4:40," indicating that Takashi-kun will go home at 4:40 PM.

However, the extra information about Mrs. Minagawa might suggest there's more to consider, but based on the calculation, the likely answer is 4:40 PM.
>>
>>101889230
>niggers
jews.
>>
>>101881835
Performance nosedives the moment you can't fit everything in vram, so yeah.
>>
>>101889288
He started a lawsuit again about a week ago
>>
>>101889370
Now I am even more confused
>>
File: Screenshot_4.png (937 KB, 819x747)
>>101888493
THAT'S NOT THE MIYAGI-DO WAY
>>
File: 1636941718706.gif (3.75 MB, 520x293)
Lmfao.

So I tried out Cohere CR+ and it's still fucking dogshit for basic 1 on 1 chats. How in the fuck does not a single model, even a monster like CR+, come close to character AI. I'm actually convinced that I'm chatting with paid jannies on there now.

And people over on the Chat AI general unironically PAY for this shit with Opus and all that garbage? LMAO
>>
>>101889658
just use novelai
it's all you need
>>
I see there's another /aids/ raid going on.
>>
>>101889409
Newfriend here, so if I buy a 3060 it will be no different from just adding 12gb of extra ram?
At least I will be able to run flux right?
>>
>>101889713
yup novelai mogs every model in erp realism still
>>
>>101889713
>paying to coom
ngmi
>>
>>101889257
The single data point, plus the fact that Elon had previously said he thinks 6 months is a good timeline before Grok released; he ended up matching that and hasn't given any indication he's changed his mind.
>>
I've always used base models and been pretty happy with them after designing a good system prompt. Are memetunes and merges worth checking out?
>>
>>101889850
It depends if you want to feel shivers in your spine
>>
What's it like going from 8GB of VRAM to 72GB? I'm thinking of building the triple P40 box.
>>
>>101889850
no
>>
>>101889767
Is it true or bait?
>>
>>101889874
It's indescribable. You won't regret it.
>>
>>101889874
Highly use-case dependent. If you just want vanilla ERP with a pretty straightforward waifu character then 8GB can more or less suit your needs these days. If you want to code and stranglefuck your waifu then it's probably a necessary investment.
>>
>>101889850
yes
>>
>>101889658
>he likes character ai
yikes
>>
>>101890086
>he likes models trained on a couple of /aicg/ logs
yikes
>>
>>101889874
anything that requires 72gb to fit is going to run slow as fuck on p40s anyway
>>
>>101890145
Weren't a good bit of those aicg logs FROM cai?
>>
>>101889970
It's a schizo
>>
>>101890175
No, C2 logs are mostly Opus, I don't think he's talking about pyg.
>>
>>101890170
That makes sense. For now this is my cope for waitfagging on spending thousands of dollars on 4090s.
>>
>>101889750
>Newfriend here, so if I buy a 3060 it will be no different from just adding 12gb of extra ram?
It may be slightly faster, depending on the proportion of the model held in VRAM. There are still significant performance boosts if over half of it is in VRAM, I wish I had the chart for it.

But yeah, effectively it's just extra RAM for large models. That being said, with the unfortunate size of a lot of models like Mistral-Large, that 12gb can be the diff between running them at all or not if you only have a 64gb capacity board.
>>
>>101890364
>I wish I had the chart for it.
>>
File: offload_x_performance.png (96 KB, 1536x1152)
>>101890364
>>101890395
>>
>>101883559
Ever since Large and 3.1 we've been back don't let doomers tell you otherwise
>>
>>101890545
Now just need Altman to die and local will have won completely
>>
File: 1723483987154506.png (117 KB, 671x519)
What's a complete retard-proof explanation of what goes on in a roleplay prompt? I assume you tell an instruct model in the system prompt that it is an anime girl and then the usual <|im_start|>user and <|im_start|>assistant talking goes on
>>
>>101890591
We wouldn't even have local without altman. And any advancement of the industry by OAI is good for local.
>>
>>101890597
That's how it's been done since the Pyg and GPT 3.5 days. I think it was just copying what Character.AI did.
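Concretely, the assembled prompt for a ChatML-format model looks something like this (the card text is invented; real frontends also splice in example dialogue, author's notes, and lorebook entries):

```python
# What a frontend roughly sends for a ChatML model. The card text is
# invented; frontends prepend example dialogue and lore before the history.
card = "You are Hatsune Miku. Personality: cheerful, teasing. Speak casually."
history = [
    ("user", "Wanna go swim and dive?"),
    ("assistant", "*grabs her towel* Race you to the pier!"),
    ("user", "You're on."),
]

prompt = f"<|im_start|>system\n{card}<|im_end|>\n"
for role, text in history:
    prompt += f"<|im_start|>{role}\n{text}<|im_end|>\n"
prompt += "<|im_start|>assistant\n"   # the model continues from here
print(prompt)
```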
>>
>>101881211
if you're incapable of buying used that's a skill issue. just verify the card works before you pay, plus multiple manufacturers still warranty used cards
>>
>>101890730
Thank you anon, I think ooba putting the character persona in <|im_start|>user was what made the models I tried talk like ChatGPT
>>
The situation must be getting desperate in /aicg/. We're getting a lot more entry level questions than usual.
>>
>>101890958
But there's not a massive amount of pedofags posting about how they saw a "loli" at the park today here so things aren't as bad as that thread yet
>>
>>101890597
Yeah it werks for me.
"You are an anime girl, her personality is x, she talks like y, her best friend is z.

Current scenario: {{user}} couldn't look away when he saw her in a bikini so she put his thing in her thing... (just summarize the story)"
>>
>>101890958
Local always wins. It's just a matter of time.
>>
>>101889658
What is character AI? Is it like JanitorAI I'm guessing?

Anyway, what's a good model for general one to one interactions? Been using Nemomix and it's fine I guess.

I have a 4090 for reference, so anything above like, Command R, I ain't gonna be able to run it
>>
>>101891613
>>
It's fucking over
>>
File: ACKCHYUALLY.jpg (136 KB, 576x768)
>>101889874
I've built two and here are my 2c - the mikubox was a good and cheap way to get an all-in-one solution. I say "was" since there have been massive price hikes on both the P40s and the T7910/T7920.
If that's the case in your area too I'd recommend looking at other alternatives. One is going for an open frame mining rig setup. You can get one for like $30-40 and then add the other parts. This'll also get you better performance (more PCIe lanes, better mobo, later CPU support etc.) as well as something more future proof, since the mikubox is mainly designed to take compute cards with rear power connectors.
Most consumer GPUs with top-side power connectors don't fit unless you run it without the side panel. You're also stuck with 2-slotters, which further limits what cards you can use. And even if you find ones that fit it'll be very tight, causing their side-fan design to struggle with cooling.
Beyond that, other types of cards like the MI100 32GB and A4000 have dropped in price and might be better options depending on how tech savvy you are and whether you intend to train or not.
There's also the dark horse Intel that's no longer a *complete* joke when it comes to running inference; used Arc A770 16GB cards can be found for cheap and four would net you 64GB VRAM (but require another type of rig). This is all assuming they'll continue improving with their next generation of cards and don't drop support for the previous ones. Or otherwise fuck things up like they often do.

As for whether it's worth going for more VRAM - yes, very much so. As that other anon mentioned, 70B models might be too slow for your liking, but running smaller ones in the 20 to 50b range at 8-bit is a very noticeable improvement in output quality. Especially if you use one for code completion.

TLDR: Compare costs part-by-part carefully, there might be cheaper/more performant/future-proof options now.



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.