/g/ - Technology


Thread archived.
You cannot reply anymore.




/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103045507 & >>103038380

►News
>(10/31) QTIP: Quantization with Trellises and Incoherence Processing: https://github.com/Cornell-RelaxML/qtip
>(10/31) Fish Agent V0.1 3B: Voice-to-Voice and TTS model: https://hf.co/fishaudio/fish-agent-v0.1-3b
>(10/31) Transluce open-sources AI investigation toolkit: https://github.com/TransluceAI/observatory
>(10/30) TokenFormer models with fully attention-based architecture: https://hf.co/Haiyang-W/TokenFormer-1-5B
>(10/30) MaskGCT: Zero-Shot TTS with Masked Generative Codec Transformer: https://hf.co/amphion/MaskGCT

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: 1707500550725825.jpg (134 KB, 1024x1024)
►Recent Highlights from the Previous Thread: >>103045507

--Cost-effective GPU options for a club inference rig:
>103052715 >103052772 >103052916 >103053194 >103053478 >103053504
--Transluce open-sources AI investigation toolkit:
>103047524 >103047575
--Techniques for generating comic panels using AI models:
>103051465 >103051506 >103051561
--Sovits quality issues and potential solutions:
>103052143 >103053933 >103054902 >103054946 >103054976 >103054990 >103055007 >103055039 >103055112 >103055123
--SmolLM2-1.7B-Instruct-GGUF performance discussion:
>103046664 >103046725 >103046877
--Recommended settings for EVA-Qwen-72b:
>103053678 >103053685 >103053853 >103053891
--Oasis model discussion and comparisons:
>103050901 >103051033 >103051044 >103051367 >103051464 >103052003 >103053693 >103055898 >103055913
--New QTIP quant method for Llama models:
>103053148 >103053432 >103053501 >103053622 >103053536 >103053578 >103053647 >103053603 >103053813
--KoboldAI Lite updates and troubleshooting discussion:
>103053393 >103053409 >103053496 >103053919 >103054012 >103054064 >103054133 >103054374 >103054027
--Hardware and quantization considerations for running AI models:
>103046482 >103046516 >103046693 >103046832 >103046668 >103046613 >103050826
--First tests of Ezo, an AI that speaks Japanese, show promise:
>103054961
--Discussion of model recommendations and VRAM requirements:
>103045564 >103056095 >103056118 >103046718 >103049829 >103050128 >103052315 >103050874 >103053655 >103053699
--Anon shares a high-quality Japanese voice actor model:
>103056589 >103056606 >103056644 >103056737
--Advancements in embodied AI and the potential for synthetic beings:
>103054157 >103054218 >103054325
--Miku (free space):
>103045564 >103046333 >103047988 >103049417 >103051299 >103054122 >103055225 >103055337 >103056077 >103056425 >103056693

►Recent Highlight Posts from the Previous Thread: >>103045519

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
File: 023a3def6f9.jpg (465 KB, 1024x1024)
--- A Measure of the Current Meta ---
> a suggestion of what to try from (You)

96GB VRAM
Qwen/Qwen2.5-72B-Instruct-Q8_0.gguf (aka the best of the best)
anthracite-org/magnum-v4-72b-gguf-Q8_0.gguf

64GB VRAM
Qwen/Qwen2.5-72B-Instruct-Q5_K_M.gguf
anthracite-org/magnum-v4-72b-gguf-Q5_K_M.gguf

48GB VRAM
Qwen/Qwen2.5-72B-Instruct-IQ4_XS.gguf
anthracite-org/magnum-v4-72b-gguf-IQ4_XS.gguf

24GB VRAM
Qwen/Qwen2.5-32B-Instruct-Q4_K_M.gguf
EVA-UNIT-01/EVA-Qwen2.5-32B-v0.1-Q4_K_M.gguf

16GB VRAM
Qwen/Qwen2.5-14B-Instruct-Q6_K.gguf
EVA-UNIT-01/EVA-Qwen2.5-14B-v0.1-Q6_K.gguf

12GB VRAM
Qwen/Qwen2.5-14B-Instruct-Q4_K_M.gguf
EVA-UNIT-01/EVA-Qwen2.5-14B-v0.1-Q4_K_M.gguf

8GB VRAM
mistralai/Mistral-Nemo-Instruct-2407-IQ4_XS.gguf
anthracite-org/magnum-v4-12b-IQ4_XS.gguf
TheDrummer/Rocinante-12B-v1.1-IQ4_XS.gguf

Potato
>>>/g/aicg

> fite me
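A quick sanity check on the pairings above: a GGUF's weight file is roughly parameter count × bits per weight / 8, with headroom needed on top for KV cache and context. A rough sketch — the bits-per-weight figures are approximate ballpark numbers, not exact values for any specific file:

```python
# Approximate effective bits per weight for common GGUF quants.
# Rough community ballpark figures, not exact for any given file.
BITS_PER_WEIGHT = {
    "Q8_0": 8.5,
    "Q6_K": 6.6,
    "Q5_K_M": 5.7,
    "Q4_K_M": 4.8,
    "IQ4_XS": 4.3,
}

def gguf_gib(params_billions: float, quant: str) -> float:
    """Approximate GGUF weight size in GiB (excludes KV cache/activations)."""
    return params_billions * 1e9 * BITS_PER_WEIGHT[quant] / 8 / 2**30

for params, quant, vram in [(72, "Q8_0", 96), (72, "Q5_K_M", 64),
                            (72, "IQ4_XS", 48), (32, "Q4_K_M", 24),
                            (14, "Q6_K", 16), (14, "Q4_K_M", 12)]:
    print(f"{params}B {quant}: ~{gguf_gib(params, quant):.1f} GiB "
          f"of {vram} GiB VRAM")
```

e.g. 72B at Q8_0 comes out to roughly 71 GiB of weights, which is why it's the 96GB pick: the remainder goes to KV cache and context.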
>>
File: file.png (67 KB, 1630x370)
You think Gemini 2.0 is gonna beat OpenAI? They're already very close
>>
>>103057373
yeah ok dude you could have just told me to fuck off when i asked and not be a dick about it.
>>
>>103057388
No lol it's a meme leaderboard
>>
>>103057417
>meme
as opposed to which non-meme leaderboard, exactly?
>>
File: 1711690590289518.jpg (93 KB, 800x600)
>>103057388
No.
>>
>>103057422
Because if you had used Gemini 1.5 Pro, you would know how retarded it is.
>>
>>103057388
i'm unironically expecting an exponentially significant breakthrough soon but not from google lmao
>>
>>103057427
That was not my question tho, i wanted a non meme benchmark or leaderboard
>>
>>103057441
Isn't that supposed to be the "live" thing that everyone is implementing now? Like ChatGPT Search, Perplexity, and now Google has introduced grounding into AI Studio, I think?

Did anyone try those? Are they useful for work or research, or just a meme?
>>
>>103057373
thank you babe
>>
File: ceba623027e1.jpg (196 KB, 622x673)
--- A Measure of the Current Meta ---
> a suggestion of what to try from last thread

96GB VRAM
mistralai/Mistral-Large-Instruct-2407 (aka Largestral)
mradermacher/Luminum-v0.1-123B-GGUF

64GB VRAM
bartowski/Llama-3.1-Nemotron-70B-Instruct-HF-GGUF/Llama-3.1-Nemotron-70B-Instruct-HF-Q5_K_L

48GB VRAM
bartowski/Llama-3.1-Nemotron-70B-Instruct-HF-GGUF
bartowski/EVA-Qwen2.5-72B-v0.0-GGUF

24GB VRAM
bartowski/c4ai-command-r-v01-GGUF/c4ai-command-r-v01-Q4_K_M.gguf
bartowski/gemma-2-27b-it-GGUF/gemma-2-27b-it-Q5_K_L.gguf
TheDrummer/Gemmasutra-Pro-27B-v1-GGUF

16GB VRAM
bartowski/Mistral-Small-22B-ArliAI-RPMax-v1.1-GGUF/Mistral-Small-22B-ArliAI-RPMax-v1.1-Q4_K_L.gguf

12GB VRAM
TheDrummer/UnslopNemo-12B-v3-GGUF/Rocinante-12B-v2g-Q5_K_M.gguf

8GB VRAM @ 30 GPU Layers (75% GPU offload)
Lewdiculous/MN-12B-Lyra-v4-GGUF-IQ-Imatrix/MN-12B-Lyra-v4-Q4_K_M-imat.gguf
mradermacher/Arcanum-12b-i1-GGUF/Arcanum-12b.i1-Q4_K_M.gguf

Potato
>>>/g/aicg
> or toppy 7b

Use:
koboldcpp
LM Studio
oobabooga/text-generation-webui
>>
>>103057388
So do you guys show your ass to the gay bear or?
>>
File: file.png (96 KB, 1465x892)
Ecker's doing it again: falling for yet another meme for his TTS model...
>>
>>103057373
at least create your own miku next time copy poster
>>
>>103057528
>using Bing to create Mikus in the local thread
>unironically shilling Lyra, the one with the anti-merge license
I think you're the one without shame, Mikufag.
>>
>>103057373
>nothing for 40xH100 anons
ngmi
>>
Imagine if, by like 2035, we combined all the algorithmic advances, quantization methods, and inference optimizations, and it meant you could run AGI on a Raspberry Pi 3B with 4GB.

Wouldn't it be very depressing to realize your old shitty laptop from 2010, the one you played TF2 and L4D2 on, had enough processing power and RAM to emulate a human-level intelligence?

I sometimes also wonder about a timeline where the Soviet Union never collapsed and instead adopted the internet in the 1990s and focused on computer technology as a new "space race". How would the current AI race look, and what would the implications for the world be?
>>
>>103057449
no i mean like either a huge jump in intelligence or a giant reduction in training/inference compute
>>
>>103057539
I'm not "shilling" anything; an anon in the last thread showed that that's what he used.

I actually went through the thread and put in the effort.

Whereas some people want to stop anyone trying to help, because of their selfish belief that knowledge should be hoarded and have a price tag put on it.

Well, fuck that.
>>
>>103057608
Oh
Really? Literally all I've read everywhere for like a year now is that it's plateauing.
>>
>>103057602
The Soviets actually had their own computers based on ternary logic. But they didn't have the population or money to last much longer than they did.
>>
File: file.png (172 KB, 693x767)
>>103057367
AyyMD OLMo, babby's first model
https://www.amd.com/en/developer/resources/technical-articles/introducing-the-first-amd-1b-language-model.html
https://huggingface.co/amd/AMD-OLMo-1B
just dropping this here idk what anyone would do with this information
>>
>>103057602
With how things are, humans will reach cat IQ faster than AI will reach human intelligence.
>>
>>103057620
They also had water-based analog computers that were more efficient than digital computers for the integral calculus needed for the space race, up until the mid-1970s. It's kind of insane that the US was trying to figure out how the Soviets were ahead in computing in the early 1970s, despite the CIA knowing their electronics were behind. It was a big mystery, and you had all kinds of theories about the Russians faking their shitty electronics to fool US spies. Only after the Soviet Union collapsed did it come out that they just had very weird water-based analog computers.

https://en.wikipedia.org/wiki/Water_integrator
>>
>AMD Ryzen AI 9 HX 370 w/ Radeon 890M
Are there industrial embedded motherboards with this CPU? It seems very fast for an iGPU, and it would be great if it could use 64GB of RAM.
>>
File: file.png (48 KB, 486x737)
>>103057637
WTF is that eos token
>>
>>103057690
wtf indeed
>>
File: 1730351457555265.png (103 KB, 765x384)
>>103057690
>>
>>103057690
oh, I guess that's a mistake, the eos is token 0.
>>
>>103057690
>it's real
What in the tarnation
>>
>>103057690
the first ever GLM (Glowing Language Model)
>>
>>103057442
It doesn't exist because of the inherent nature of leaderboards. The second a leaderboard is introduced in an environment like this is the second everyone starts trying to game the benchmarks instead of making better models. So, every benchmark becomes a meme benchmark by default.
>>
>>103057637
Is the bartowski quant broken? Why would they release this?
>>
Are QTIP quants good for you?
>>
File: IMG_0898.jpg (324 KB, 1320x414)
NuClaude is hell. I can’t live like this.
>>
>>103057900
You wanted the boring assistant thing?
>>
>>103057931
I don’t want to interact with something less autistic than me. It’s unsettling.
>>
>>103057900
I prefer NuClaude to the boring ChatGPT that only speaks in lists. However it seems they're trying to make ChatGPT "cool" too
>>
File: buy-a-fucking-ad-asshole.jpg (396 KB, 1664x2432)
>>103057367
--- A Measure of the Current Meta ---
> a suggestion of what to try from last thread

>196GB VRAM
Qwen/Qwen2.5-72B-Instruct BF16

>96GB VRAM
Qwen/Qwen2.5-72B-Instruct Q8_0

>64GB VRAM
Qwen/Qwen2.5-72B-Instruct Q5_K_M

>48GB VRAM
Qwen/Qwen2.5-72B-Instruct IQ4_XS

>24GB VRAM
Qwen/Qwen2.5-32B-Instruct Q4_K_M

>16GB VRAM
Qwen/Qwen2.5-14B-Instruct Q6_K

>12GB VRAM
Qwen/Qwen2.5-14B-Instruct Q4_K_M

>8GB VRAM
Qwen/Qwen2.5-7B-Instruct Q5_K_M

Potato
>>>/g/aicg
>or Qwen/Qwen2.5-0.5B IQ2_XXS

Use:
vLLM
llama.cpp
tabbyAPI
>>
https://github.com/lyogavin/airllm
>70B on 4GB GPU
wtf
>>
File: 1694655604976846.gif (1.24 MB, 480x366)
>>103057983
>>
>>103057983
High quality post and nice Miku
>>
File: AMD.png (40 KB, 817x912)
>>
>>103058013
It's like I'm back in the 2.7b pyg days.
>>
what model can I put in 32gb vram?
>>
>>103058032
The base seems to be GPTNeoX so it's very close to pyg
>>
>>103058013
I’m crying
>>
File: IMG_0904.jpg (386 KB, 1320x1469)
>>103058013
It’s time
>>
>>103058072
yeah nobody actually gives a fuck now.

>>103057367
/lmg/ can go die in a fire along with all the fucking retards who give a shit about AI

What's going to happen when a fucking AI takes all your job and you die of starvation?

Or when it fucking crashes the economy so money isn't worth a shit any more?

yeah have a good laugh now while you still can, lets see if you survive the next layoff wave.
>>
>>103058285
Nah. It's gonna be fine.
>>
File: sensible-chuckle.gif (992 KB, 250x250)
>>103058285
>>
Thanks to whoever recommended SoVITS for TTS, but it occasionally generates things like this. Is it supposed to be doing that?
https://vocaroo.com/1oj8DMm2i1CK
>>
>>103058285
good. goyim should starve.
>>
>>103058285
What layoff? We all work in AIML research here
>>
>>103058285
>muh job
>muh layoffs
I have almost half a brain so I milked this thing for all it’s worth from the beginning and never have to have a job again
Sucks to suck
>>
>>103058285
Hi, Sao.
>>
>>103058285
/lmg/ is so dead
>>
>>103058368
Improve your audio reference, match the tone of your prompt, play with the samplers
>>
>>103058368
Is that any good?
I have had some success with xtts2 but sometimes it also just outputs garbage.
>>
>>103058451
Grifting homos like you are the reason everything is shit now
>>
>>103058484
but it's just saying random things in Chinese
>>
>>103058285
Sorry man, unless humanity stagnates, if it ain't AI displacing people from their jobs, it'll be some other form of automation.
That's what we as a species have been doing ever since the industrial revolution.
>>
>>103058503
You got the right weights?
>>
>>103058495
Nah I’m one of the good ones actually
>>
>>103058503
nta. Not sure if this is your problem. Once when i was testing it i forgot to set the input language in the top-right, next to the reference text, and only set it in the output. It would spell out the output text before actually speaking it. I suppose you set both language dropdowns, but still worth a check. Sharing a screenshot of your settings *of when the gen failed* would be useful.
>>
>>103058523
i got it to speak english but it just says this about the weights
https://vocaroo.com/1dUOlsSOYDX6
>>
>>103058602
Ttsfags are extra retarded; can't be bothered to spoonfeed you
>>
File: pic.png (2.25 MB, 1869x1346)
>>103057367
fugg I love local AI
>>
File: icy4u.jpg (201 KB, 570x380)
>>103058632
https://github.com/danielgatis/rembg
>>
is this the new imggen thread?
>>103057367
nice gen
>>
>>103057637
I guess there's no rest for nala tonight.
>>
>>103058716
it's 1B, automatically ain't worth shit
>>
>>103058723
only Nala is qualified to make that assessment.
>>
>>103057999
Why is nobody talking about this? It's being totally ignored.
>>
>>103058732
>mom. i said the thing the other boys were saying... see? see? i'm like them!
>>
>>103057427
I never used it but Gemma 2 9b and 27b are very good for their sizes so I don’t think they would fuck up gemini after making that, right?
>>
File: Olmo nala.png (99 KB, 942x387)
Nala test for the SFT DPO version of Olmo... and yeah... (I'll try putting in the glowy bos/eos token; for now I'm just using a generic ChatML template.) So far it's about what you'd expect from a 1B model... of the Llama-1 generation. It can't sort out who is who.
>>
>>103058800
somebody should draw this in ms paint
>>
>>103058800
That's like trying to test a toddler on college stuff
>>
>>103057983
Fuck this bait post, Largestral destroys Qwen2.5.
>>
>>103058368
>https://vocaroo.com/1oj8DMm2i1CK
You still have it set to chinese, don't you
>>
>>103058800
Yeah the response is about the same when using the amd hackjob version of ChatML as it is just using a proper ChatML template.
>>
>>103058602
>i got it to speak english but it just says this about the weights
You need insanely clean reference audio or it sounds like shit. Either re-record in a silent environment with a good mic or run it through some noise reduction software. Then make sure the "Text for reference audio" is a letter-perfect transcript of what is being said.
>>
>>103057999
>>103058732
Much slower than CPU + RAM. All it does it run one unquanted layer at a time.
>>
>>103057999
>>103058887
NTA but 405B on 8GB vram seems like a neat proof of concept from a technical standpoint if nothing else. And fun for the world's most patient enthusiast lmao
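For anyone curious how the trick works: airllm-style inference streams the model one transformer layer at a time, so only a single layer's weights ever occupy memory at once. A toy sketch of the concept — the function names here are made up for illustration, this is not airllm's actual API:

```python
import random

def make_layer(dim: int):
    """Stand-in for reading one layer's weight matrix off disk on demand."""
    rng = random.Random(dim)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]

def apply_layer(weights, hidden):
    """One dense layer: hidden @ weights (no activation, for brevity)."""
    dim = len(hidden)
    return [sum(hidden[i] * weights[i][j] for i in range(dim))
            for j in range(dim)]

def streamed_forward(num_layers: int, hidden):
    # Only one layer's weights exist in memory at any moment; in the real
    # thing each load comes from disk/NVMe, which is why it's so slow.
    for _ in range(num_layers):
        weights = make_layer(len(hidden))  # "load" the layer
        hidden = apply_layer(weights, hidden)
        del weights                        # free before loading the next
    return hidden

print(streamed_forward(num_layers=4, hidden=[1.0, 0.5, -0.5, 0.25]))
```

The tradeoff is obvious: every token re-reads every layer from storage, so throughput is bounded by disk bandwidth instead of VRAM bandwidth.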
>>
>>103057484
The thread needs this post, except in English.
>>
>>103058905
If you're patient enough you could already do it on swap.
>>
>>103058873
You're right that the reference is unclean.
>>
>>103057484
For potatoes, just install LM Studio and download Mistral.

Idk why that's not in the pasta...
>>
>>103057999
Well, it's not like it's impossible. Just extremely slow.
>>
>>103058285
The elites always need slaves to do slave work
>>
>>103057572
I believe some anon here has 40xh100
>>
>none of these lists recommending Storybreaker-Ministral
That's how I know these lists are fake.
>>
>>103058920
I imagine this doesn't render the computer unusable while it's thinking like the swap method would, though
>>
>>103058822
>sunk cost fallacy
It's time to let go.
>>
>>103058965
nah I'm not a richfag, I only have 36gb vram
Largestral is just a better model, even at low quants
>>
>>103058013
bizarrely nostalgic
is AMD that far behind
>>
>>103058940
The elites need more kids to rape, so they'll at least keep some people around.
>>
>>103058976
Nah, it's even worse for people with low VRAM because of how exponentially worse they get below 4 bits.
>>
>>103058976
What quant are you running? Same VRAM. Even at IQ2_M it feels really slow for me.
>>
>>103058989
poor people can also rape kids
>>
>>103059013
IQ3_XXS, it gets too dumb below that. No model can survive dropping below 3 in my experience
I get just under 2 tokens/sec with 65 layers on the GPU and a 16-core 5950X
>>
Best Free LLM Proxy
https://api.pawan.krd/cosmosrp/v1
put a single space in the API key field
>>
File: MikuNotAgain.png (1016 KB, 1200x848)
>>103059078
>cloud shite
Wrong thread
>>
Speaking of cloudshit, while playing around with hailuo after discovering that they finally added image-to-video as well as official release outside of China I've noticed when it's not in queue it's way faster than Luma or Kling... which leads me to believe it's not an exceptionally gargantuan model... so maybe we'll get good local videogen at some point.
>>
>>103059078
Sooner or later that shit is going to implode
>>
>>103059119
It's quite obviously being funded by big money. It's clearly not stolen.
Day 1 that the key proxy repo was put up on HF the community manager took it down due to violating the rules, and then it was back up shortly thereafter without a further peep about it. They clearly exist for research purposes with a lot of money behind them. My guess is people are providing human training data for further censorship.
>>
>>103058013
Cactus
>>
>>103059078
what's the model?
his github claims 3.5-turbo but that's been deprecated off the API now, so it has to be something else
>>
>>103059141
I didn't say anything about stolen keys. Things like this glow like a supernova. Sooner or later it's going to implode in a far worse way than "oh no, he went to jail :(". People are going to die. I will not elaborate.
>>
>>103059256
>People are going to die. I will not elaborate.
did you learn to be this dramatic from sam altman
>>
>>103059141
https://x.com/pawanosmant/status/1743661803110908119
>>
>>103059323
"If it's free, you're the product"
Who falls for this shit?
>>
>>103059341
obviously but the deal is fair
>>
>>103059266
No it’s just a side effect of the medication
>>
>>103059323
oh so it's openai
not worth using for RP or cooming, even for free
>>
File: grift.png (135 KB, 612x549)
>>103059323
I'm not entirely sure he understands the words he's using.
>>
>>103059323
>sponsorship
lol he’s calling the basic bitch new user credits a sponsorship
If someone sent them a strongly worded email they’d probably purge his account
>>
How's the ST situation? Nothingburger? Forked? Migrated to something else?
Been away for about a month or so; I saw that the ST devs were trying to get more corpo-friendly by removing a bunch of stuff.
>>
>>103059388
>Nothingburger
yes
>>
File: 1723771660587148.png (14 KB, 694x632)
>buy AM4 mobo
>put in spare 5600GT and 128GB of RAM
>run larger models using vulkan
How bad could it be?
>>
>>103059441
>5600GT
Memory Support: DDR4
Rated Speed: 3200 MT/s
Memory Bus: Dual-channel
ouch
>>
>>103059450
does it really matter if you run the model on ram/cpu?
>>
>>103059458
for inferencing it's not that big of a hit
for processing? thousands of times slower
>>
>>103059458
picrel would be an acceptable spec for pure cpu inference if paired with a 24gb+ GPU for context processing
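The back-of-envelope behind that: token generation is memory-bandwidth-bound, since each token streams roughly the whole active weight set through the CPU, so an upper bound on speed is bandwidth divided by model size. The figures below are theoretical peaks; real throughput is lower:

```python
# Upper bound on CPU token generation: memory bandwidth / bytes per token.
# Bandwidth is the theoretical peak for the channel config; real-world
# numbers come in under this.
def tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    return bandwidth_gb_s / model_size_gb

# 2 channels * 3200 MT/s * 8 bytes per transfer = 51.2 GB/s peak
ddr4_dual = 2 * 3200e6 * 8 / 1e9
model_q4_70b = 40.0  # ~40 GB for a 70B Q4-class quant

print(f"~{tokens_per_sec(ddr4_dual, model_q4_70b):.1f} tokens/s upper bound")
```

So dual-channel DDR4-3200 tops out around 1.3 tokens/s on a 70B Q4 even before prompt processing, which is compute-bound and far worse without a GPU.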
>>
Is there a good local fine-tuned LLM that bypasses AI detection for assignments? I have to write a pretty long thing about a topic I don't care much about, but if I use GPT they're going to find out (the professor uses GPTZero).
>>
File: 1715062579295070.png (40 KB, 857x179)
Can't even make my own bot wet, damn it's fucking OVER
>>
>>103059520
no
>>
File: IMG_20241102_154912.jpg (129 KB, 765x605)
>>103059078
It's glowing like the fucking sun. I joined their Discord to ask the owner himself about the legality of the model, but this is the response.
>>
Has anyone tried using LLMs to practice/learn Japanese? I'm still a beginner, and I'm wondering if I can use an LLM to practice writing and reading sentences in a grammatically correct, but not stilted manner.

Maybe it's just a matter of having a good character card that instructs it to avoid using Kanji or other higher-level things?

Any recommendations are appreciated.
>>
>>103059616
SIR DO NOT REDEEM DISCORD LOGS SAR
>>
>>103060042
Ezo 72B is perfectly fluent, and after using it extensively over the last couple of days I'm confident it can be prompted to do what you want. You may want to combine it with the SoVITS TTS engine for listening practice.
>>
>>103059616
What a shady fucker. Anyone who uses that deserves what they get.
>>
>>103057983
Thanks Xi
>>
>>103059078
This picture looks AI generated
>>
>>103058942
Only for the weekend because his boss said it's ok.
>>
>>103060226
I imagine there's a bushel of fingers behind the camera. Also, he has no ears. How does he keep his glasses on?
>>
>>103050760
>>103050827
Would you all mind sharing settings/system prompt? The amount of times I’ve seen “impish grin” is bothering me more than I’d like.

>>103050826
Which 70B? I just tried the new RPmax at 5bpw and was not very impressed at all.
>>
Can burgers finally stop pretending democracy is real?
>>
>>103060081
Thanks for the suggestion. Have you tried the 32B version? I won't be able to run the 72B on a 3090.
>>
>>103057983
>Only Qwen 2.5 matter
is this a meme or is it true? kek
>>
>>103060666
as far as the chinese government is concerned, yes
>>
>>103060666
no qwen is a meme
use largestral instead
>>
>>103060911
USE MY ANUS INSTEAD
>>
File: ChatBIT.png (515 KB, 751x1000)
Local models have been WEAPONIZED

>Nov 1 (Reuters) - Top Chinese research institutions linked to the People's Liberation Army have used Meta's publicly available Llama model to develop an AI tool for potential military applications, according to three academic papers and analysts.
>In a June paper reviewed by Reuters, six Chinese researchers from three institutions, including two under the People's Liberation Army's (PLA) leading research body, the Academy of Military Science (AMS), detailed how they had used an early version of Meta's Llama as a base for what it calls "ChatBIT".
>The researchers used an earlier Llama 13B large language model (LLM) from Meta (META.O), opens new tab, incorporating their own parameters to construct a military-focused AI tool to gather and process intelligence, and offer accurate and reliable information for operational decision-making.
>ChatBIT was fine-tuned and "optimised for dialogue and question-answering tasks in the military field", the paper said. It was found to outperform some other AI models that were roughly 90% as capable as OpenAI's powerful ChatGPT-4. The researchers didn't elaborate on how they defined performance or specify whether the AI model had been put into service.
>>
File: lightyear.jpg (435 KB, 2048x2048)
>>103060964
>Llama 13B
>accurate and reliable information for operational decision-making
>>
>>103057373
>anthracite-org/magnum-v4-72b-gguf-IQ4_XS.gguf
Better than miqu? I could finally retire it, it's been serving me for over a year
>>
>>103060964
>llama 13B
what kind of retard worked on this psyop?
>>
>>103058800
Holy fuck.
>>
>>103060964
>considering the information you have provided me with, I would suggest to launch nukes to stop the capitalist threat
>the next step should be to launch the nukes
>finally, after you have accessed the situation and asked for the US' consent, you should launch the nukes
>>
>>103058800
this is the worst I've seen a model perform on the Nala test
why can't AMD do anything right?
>>
File: 1714372708048886.jpg (33 KB, 600x632)
>realized that ollama models are by default q4 quantized
ollama bros... I don't feel so good
>>
>>103061042
Doesn't matter. It's easy to use. It's the Linux to llama.cpp's GNU.
>>
>>103061042
It makes sense.
Less bandwidth cost for LLM tourists trying baby's first local.
Q4 isn't completely lobotomized and runs fast.
Also,
>Exam 1
This response is more than correct and should earn 11 points.
>>
>>103059518
mindbroken
>>
>>103061078
He forgot to describe
>>
>>103060964
This is a Chinese psyop. Its aim was to drive the U.S. to impose heavy regulations on the machine learning field, effectively crippling research capabilities.
>>
>>103061078
It's not even q4_K_M though if I remember correctly, it's q4_0.
>>
>>103061191
Obviously. The question is whether they'll fall for it, egged on by OpenAI hoping for a regulatory moat.
>>
>>103061202
If it works, it works.
>>
>>103061202
If you're using ollama you shouldn't have any standard so it's fine
>>
>>103061368
I mean, on their YouTube channel they literally say that you should always run the smallest quant possible since it's not worth waiting a few extra seconds.
>>
>>103061388
I started my first local model with ollama after watching one of the videos... and switched over to kobold after half a day.
>>
>>103061388
>ollama
>YouTube
are you just completely incapable of independent thought?
>>
>>103061158
There is no describing that which speaks for itself.
What, is this one of those classes where they want you to do
//Find out if list is empty.
//Get lengthOfList.
final int lengthOfList = list.size();
//Test if lengthOfList is that of an empty list.
if(lengthOfList==0){
//Print true if true.
println("true");
}else{
//Print false if false.
println("false");
}

instead of a one liner?

>>103061202
I think so, at least for older models back when I let Ollama be my baby's first for about a week and then went Kobold and never looked back, except I think I still use one Ollama model file because it hasn't been awful and what works works.
Well, really I tried Ooba before that but that was when 1B and 2B were SOTA instead of whatever weird thing brought those designations back this week.

>>103061406
For a normie, using a local LLM at all is "independent" of the main stream, who just say "hey siri google alexa, send food."
>>
>>103061042
You can download different quants in their website.
It is a bit hidden but it’s there.
>>
>>103061040
>why can't AMD do anything right?
Honestly while it's perplexing why they would bother releasing this in the current landscape- I kind of get where they are coming from.
They obviously wanted to start from scratch themselves, internally, so they started with the basics. Training a small model from scratch. If they made this thing... 2 years ago maybe... it would be alright. "Hey cool AMD did an AI"
But they should have kept it to themselves until they had something more like Nemo. And they could have cited this precursory model in a paper or something or made it available for shits and giggles. But yeah. Unless you're presenting new novel architecture there's no reason to even bother releasing something like this in 2024.9
>>
>>103061570
That was the moment when I realized that instead of going to their site I could go to HF and use what the cool kids on /lmg/ were using.
And that's how I became a cool kid, too.

>>103061584
AMD is Nvidia's cousin. They only need to make a token effort, enough of a technicality of competition in the market, to keep monopoly charges at bay. If anything, the only AI they should earnestly invest in is game-performance-related, so they can become Nvidia's flip side: picking up all the gamers who can't get Nvidia cards because AI cleaned out the top shelf of the market, and who will take AMD's second-rate offerings because they're there.
>>
Will a 100B BitNet model actually fit in 24GB, or does extra fat come with it, like the KV cache, embeddings, or attention buffers, that makes it not quite fit?
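A sketch of the arithmetic. The weights alone fit easily at 1.58 bits, but an fp16 KV cache at long context can eat the rest; the layer/head/context figures below are hypothetical but plausible, since no 100B BitNet release exists to measure:

```python
# Weights: 100B parameters at 1.58 bits each.
params = 100e9
weights_gib = params * 1.58 / 8 / 2**30

# KV cache: assumed 100 layers, GQA with 8 KV heads of 128 dims,
# fp16 (2 bytes) per value, K and V both cached, 32k context.
# All made-up-but-plausible figures for illustration.
layers, kv_heads, head_dim, ctx = 100, 8, 128, 32768
kv_gib = layers * 2 * kv_heads * head_dim * ctx * 2 / 2**30

print(f"weights ~{weights_gib:.1f} GiB, 32k fp16 KV cache ~{kv_gib:.1f} GiB")
```

Under these assumptions the weights come to roughly 18 GiB, so they fit in 24GB, but a full 32k fp16 KV cache adds another ~12 GiB; you'd need a shorter context or a quantized KV cache to stay inside the card.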
>>
File: m4max.png (28 KB, 900x276)
It's so bad out there for AI hobbyists right now, it's crazy. You have to choose between a designer shitbox and a rig of heaters, both with a 400% markup.
>>
>>103061584
>should have kept it to themselves
Because it ruined your day to see a nothing mentioned? You're only hearing about it because someone saw a random blog post and said "look they did a thing".
>>
>>103062045
When is it getting better?
>>
>>103062084
Did it work? are you a real woman now?
>>
>>103062087
Not anytime soon. But you're already familiar with LLMs; you're supposed to be making money off them to buy better hardware.
>>
>>103062045
>1.4kw for the GPUs alone
I power-limit mine to 200W each and they still retain about 75-80% of stock performance for both training and inference.
A quad 3090 rig (4090s if you're a richfag, P40s if you're a poorfag with the technical skills to make them work) built on a server platform is the standard for going big at home. Anything else is contrarian nonsense. A Mac can't train and has absolutely no upgrade/expansion path; it's only an option if you're okay with living in the pod and eating the bugs.
>>
>>103058285
You're blaming societal issues on a bunch of mathematics.
>>
>>103060666
It's true.
>>
>>103058285
>What's going to happen when a fucking AI takes all your job and you die of starvation?
I know how to feed myself and live off the land.
>Or when it fucking crashes the economy so money isn't worth a shit any more?
The west is the great satan and I have been waiting all my life for it to collapse.
>yeah have a good laugh now while you still can, lets see if you survive the next layoff wave.
The people who cling to toil and materialism will have the worst time. Stop projecting your fear onto others. Not everyone is a pussy-ass bitch like you who would just lay down and starve when push came to shove.
>>
>>103058011
Quit samefagging nigger
>>103060666
Zoomer discovered gatekeeping, idk why no one ridicules him for this.
>>
>>103062087
I am currently assisting a company in designing RISC-V processors for parallel workloads (by weighing in on which features would be useful for machine learning applications).
Presumably there are other companies doing the same thing so
>two more years
>>
I still don't get how this shit works. I put in words, then my GPU does millions of calculations and makes more words coherently, and now my dick is spraying semen out. It makes no sense.
>>
For other people that are using nemotron, are you having issues with it constantly trying to put things into bold and using weird formatting? I dunno why but it's constantly using brackets and double asterisks. Neither of those things are anywhere in my sysprompt or cards.
>>
>>103062371
damn, on one hand it's really cool, on the other... seriously? There's no one who knows how to do this, so they had to ask for help from a physics student who has AI as a hobby?
>>
>>103062410
Not only are there not that many, those that are available are expensive as all fuck. The only thing more valuable than GPU clusters is experienced personnel.
>>
>>103062191
Given responses like >>103060666 I approve of his post.
>>
>>103060603
I haven't tried it. That 30B class tends to be cursed. I'd personally take the speed hit and run partial offload of a Q5.
>>
>>103062395
Yes, Nemotron was tuned to 'think' things through using bullet points and lists so it'll always try to implement those in its replies.
>>
>>103062951
Isn't that what "stepped thinking" addon does in ST? Wonder if it works with Nemotron.
>>
>>103057367
the real picture is a classic
>>
>>103059493
I see
>>
Why are vision models so bad? Even top proprietary ones on lmsys arena can't OCR a paragraph or random words without making mistakes. They always hallucinate new ones or skip some.
>>
>>103061289
Anthropic and OpenAI are playing such a stupid game amping up the "risks" to 11 and asking every freaking time for "regulation".
The day this shit explodes in their face I will laugh, even if it impacts me.
>>
>>103063172
OCR was already a thing before AI and "better OCR" doesn't sound that interesting to investors
>>
>>103063172
>can't OCR a paragraph or random words
OCR is a solved problem with classical algorithms. Use the vision models for their novel abilities, e.g. inferring what people are doing in a scene or telling you about composition, etc.
>>
>>103063222
Yep, using the correct tool for the job instead of dumping everything on transformer models
>>
>>103062045
>M4 Max supports up to 128GB of fast unified memory
Would the M4 max actually be usable/worth it for local LLM?
>>
>>103063268
>Would the M4 max actually be usable/worth it for local LLM?
It remains to be seen whether there is any dedicated hardware on the M4 that can compete with a GPU for context processing. The prices of the Mac stuff are high enough to go another route that doesn't kneecap you every time the backend needs to process context for some reason.
If you could pick up an older Mac Studio with a 192GB M2 Ultra for a reasonable price, it might make a good RPC backend in conjunction with a regular PC GPU rig, but that's the only scenario where I think the hermetically sealed Apple monoliths are worthwhile.
>>
>>103063328
>older mac studio with 192gb m2 ultra for a reasonable price
Yeah I've looked into this, prices are prohibitive for this stuff to this day sadly.
>>
>>103062126
>P40
It's not like that anymore. These days, it is hugely overpriced e-waste.
>>
how to use local models to "humanize" text to write my phd thesis on gender studies
>>
>>103062371
>>103062421
I think Tenstorrent is doing 600 RISC-V cores with vector extensions, and it's basically a cheapo machine learning setup. You can buy one of their cards brand new for somewhere between $500 and $1000, with 24GB of VRAM. It's not worth it relative to a used 3090 since you'd also need to write all the software you need for it, but it's the future!
>>
>>103063372
>Yeah I've looked into this, prices are prohibitive for this stuff to this day sadly.
on ebay for sure. there are some that pop up on eg. facebook marketplace from time to time. There's a 128gb one near me for under $4k. Still not worth it imo, but it might be to someone else vs picking up a few more 48gb gpus if their box is already getting maxed out.
>>
>>103063389
Yeah it's unfortunate.
And it doesn't support bnb quantization which = no qlora
but if you absolutely have to have a GPU server for running LLMs and you're an utter poorfag it's basically what you get.
>>
>>103063403
My kids use our local AI rig to help with schoolwork, and telling it to essentially "Write like a 10th grader midwit" seems to get acceptable output the teacher doesn't catch.
>>
>>103063204
>>103063222
Lol what a total meme technology. Completely fucking useless for doing anything productive.
>>
>>103063477
>Already cheating in 10th grade
Grim
>>
>>103063499
Retard
>>
>>103063499
>a rope?
>I can't use a rope to cook my dinner
>total meme technology.
>>
>>103063510
Silence, boomer. It's more important for them to learn how to use AI than to practice writing papers. How often does manual math without a calculator come in handy for you?
>>
>>103063521
>>103063522
You're coping. Now I totally get why ggerganov doesn't want to add this stupid meme. Who even needs it? Why not just have an image recognition model separately, why is there need to couple it to llm?
>>
>>103063433
It's literally overpriced and no longer affordable.
>>
>>103063572
Because you can't read beyond one sentence, retard >>103063222
>>
>>103063577
What the fuck I just looked it up holy fuck.
It's more than half the cost of a fucking 3090 now.
>>
File: tenthgradecheater.png (222 KB, 884x791)
>>103063510
it's mostly good for engaging with the material outside of class (or pure busywork)
picrel
>>
>>103063551
>How often does manual math without calculator come in handy for you?
A lot, midwit. Would you not learn to walk and to talk if there was a machine to do it for you?
>>
>>103063596
>Why not just have an image recognition model separately, why is there need to couple it to llm?
>>
>>103063625
>Would you not learn to walk and to talk if there was a machine to do it for yourself?
A 5 minute stroll through any walmart filled with diabetics on mobility scooters will tell you that plenty of people do.
>>
>>103063611
I was curious so I checked the price of a used 3090, it's the exact same as when I bought my second one 2 years ago.
Kind of insane.
>>
>>103063791
Yeah they sort of edge up and down a bit but have stayed roughly the same...which given how old they are getting might as well be a price increase. But it's still at least somewhat reasonable for what you are getting.
>>
Companies who claim they're pro-open source should give me the hardware to run their models too
>>
cactus
>>
>>103062045
I find it crazy that we are almost in 2025, and nothing still beats buying a gaming GPU to do this hobby for most individual users.
>>
>>103063873
Yeah, it's been zero fucks given for hobbyists on all hardware fronts, unfortunately. I know they don't want their enterprise/workstation market going for lower-priced consumer products... but then just kneecap it on compute and give it lots of VRAM. Make essentially a 3060 with 48GB VRAM. Problem solved. Too little compute to be worth a damn to any professional client, but it gets a lot of hobbyists much deeper into the hobby for a reasonable price.
>>
>>103063873
That's what you get when you have one company producing GPUs. No, AMD and Intel don't count.
>>
>>103063896
>No, AMD and Intel don't count.
They absolutely do count in the sense that they've done literally nothing to compete for the hobbyist niche which Nvidia has left wide fucking open. They are anti-competitive cunts.
>>
>>103063873
the chinese fabs will solve this
>>
>>103063892
Surprised Nvidia doesn't go this direction, actually.
Selling a 48GB card so that people can start using CUDA as hobbyists as the next generation of people working on LLM will come from them.
The microsoft or even apple strategy to spread their OSes to students basically.
>>
>>103063896
>That's what you get when you have one company producing GPUs. No, AMD and Intel don't count.
and they don't count specifically because we are at the moment in time where you need the bleeding-edge tech in order to do the neat thing at all. Shades of late-90s Intel. It will all be commoditized soon. It's too useful not to be.
>>
>>103063925
how so
>>
>>103063864
cactus
>>
what kind of t/s am i looking at serving nemo 12b on a 3060?
i want to serve it for a small project
>>
File: hotpotcar.png (2.55 MB, 2045x1369)
>>103063941
in the same way the chinese market bears making weirdo EVs, it'll bear making weirdo GPUs (once the fabs are up and running)
>>
>>103063892
If they proceed with that design, how will they justify offering such low VRAM on their new graphics cards?
>>
>>103063955
30+
>>
>>103063988
They could have 3 options for 3 different markets:
>top of the line sota enterprise shit
>gaymin with decked out features and optimized for raytracing or whatever
>hobbyist stuff that's slightly faster than macbooks but with a ton of vram
>>
>>103064000
is that an estimate from bandwidth or real world performance?
if that's the actual performance that'd be very much sufficient
>>
>>103064014
Nvidia was always greedy with VRAM, so it is unlikely that they'll change their ways now. Remember the 970 with 3.5GB that was advertised as 4GB?
>>
>>103063988
>>103064014
I imagine they'll start to make headless vram rich inference engines that aren't economical for training at scale vs their high-end stuff once the right part of the price/performance curve opens up. I don't think there's a way to do it right now without kneecapping their enterprise stuff.
>>
>>103064021
Real hardware, linux:
3060 Nemo Q5_K_S 33 t/s
3080 59 t/s
6800xt 32 t/s
>>
>>103064090
damn
thanks
>>
>>103063966
weird doesn't mean good. it just means they'll churn out a lot of shit GPUs that no one in their right mind would buy
>>
Looks like llama 4 might be IT. Pretty sure OpenAI's / Anthropic's big secret is just grokking, aka throwing an absolute ton of compute at it till it overfits, then keeping training until it inexplicably starts generalizing again. Llama 4 is apparently going to get 100x more compute than llama 3. So, like, 1,500 trillion tokens?
>>
Started using qwen2.5-14b-instruct instead of vntl llama 8b and the quality is much better.
But sometimes it still gets some stuff wrong.
In this sentence, Ril is the one that sent the apology, but it keeps getting it wrong. I've tried all kinds of prompts but it never gets it right.

Can anyone suggest another model or prompt?
>>
>>103063966
There will be no software support for chinkshit. They'll try to make the drivers compatible with CUDA for marketing purposes, but the drivers will be full of bugs, and if you email them about that you'll get replies in chinese runes
>>
>>103064185
>llama 4
That'll come out in Q1 2025.
>>
>>103064185
You can optimize your compression algorithm as much as you want. At the end of the day if you compress filtered midwit garbage then you'll get filtered midwit models.
>>
>>103064185
With such ample computing power, they're able to run various experiments, like, you know, BitNet maybe?
>>
>>103064198
that's already better than AMD then
>>
>>103064211
I used to think that until I explored just what stuff claude 3.5 sonnet knows. It has utter trash sites and stuff baked in like literotica / fanfiction.net and the like and it still is the best model. Training is what matters.
>>
>>103064198
If they have their own fabs, there's a non-zero chance they could smuggle out the 4090 design. The Taiwanese are quite corrupt.
>>
>>103064021
I can confirm, but use exl2 and avoid llama.cpp when you can fit the entire model in VRAM; the prompt processing is basically instantaneous.
>>
>>103064271
That's why Claude beats everything else by miles, because they train on everything for rare tokens. OpenAI and Meta will never get anything good if they keep training on ScaleAI garbage. Llama3 8B didn't know what a greentext was, the 70B version did but I suspect it drew from blog posts and knowyourmeme instead of 4chan, which means they filtered this domain out of their training data.
>>
>>103064188
Assuming you can't run 20B+, I guess I would suggest vntl llama3 8b, but the older one, in case you haven't tried it yet.
>>
>>103064358
I've tried this one and it was really bad
>>
>>103064438
Yeah, take a look at the one without 202409 in the name.
>>
>>103064349
Nah, 70B 3.1 knew some stuff better than 3.0 did, showing it's just a problem of not enough training imo. It needs to "overfit" more to better retain the more trivial stuff.
>>
>>103064349
>That's why Claude beats everything else by miles
Alright, where is our local edition of Claude?
>>
>>103064504
Was it trained or distilled from 405b?
>>
>>103064504
Maybe. I tested 3.1 8b and it could make greentexts again.
>>
>>103064522
If you have ever tried training a model, then you would know they seem to forget stuff they learned earlier; then at a certain point it reemerges, often better / more accurate than it was before.
>>
>>103053148
Has anyone tried it? How well does it work?
https://huggingface.co/relaxml/Llama-3.1-405B-Instruct-QTIP-2Bit/discussions/1
>>
>>103064447
Seems to translate correctly, but refuses to do more than one line at once even in chat mode. Qwen only got it wrong when I had those two lines consecutive with one another, so I couldn't get a proper test.
>>
>>103064555
I don't know but nylon on feet looks good
>>
>>103064544
I think it's basically the "neurons" changing and losing connections they once had and later reforming them in a different / better way / place
>>
>>103064555
Still uploading it looks like.
>>
>>103064557
Translating more than one line at a time is generally a bad idea unless you're using a big model iirc
>>
>>103064583
https://huggingface.co/relaxml/Llama-3.1-405B-Instruct-QTIP-2Bit/tree/c37474cce555fe60ded7da1ea254ef19da13bcd1
>>
>>103064555
how challenging would it be to generate an image like this
>>
>>103064220
They did say llama 4 was supposed to be not only better but faster? Not sure if that means just quantization aware training or what though.
>>
>>103064637
If flux wasn't retarded for this stuff it wouldn't be impossible
>>
>>103064637
There is Ass Stacking (Pony) lora for that
>>
>>103064637
Noobaixl, stack of butts or something tag on danbooru
>>
>>103064686
Most likely quantization aware + layer skip + something else, but they could at least try BitNet
>>
File: file.png (52 KB, 871x492)
https://github.com/EikaMikiku/SillyVoice
Did a thing before voice-voice models take off.
Maybe someone will find it useful.
>>
File: 1730491728028790.gif (1.12 MB, 224x224)
just having a small 3060 server to serve nemo, drawing 40 watts at idle, would be ~15€/month here in germany (28.8 kWh at 50 cents per kWh)
no wonder all the industry is leaving
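The math checks out; a quick sketch with the figures from the post (the 30-day month is an assumption):

```python
# Idle cost of a 40W server at German electricity prices (0.50 €/kWh).
idle_watts = 40
hours_per_month = 24 * 30  # assuming a 30-day month

kwh_per_month = idle_watts / 1000 * hours_per_month  # 28.8 kWh
cost_eur = kwh_per_month * 0.50                      # ~14.40 €, i.e. the ~15€ above
print(f"{kwh_per_month:.1f} kWh -> {cost_eur:.2f} EUR/month")
```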
>>
>>103064727
>GPU: 0% 38°C 6W VRAM: 9.9GB 62%
You can have half of that, 6800xt draws only 6-8W on idle
>>
>>103064687
Are there still no Flux NSFW loras?
>>
>>103064727
Holy shit, the energy price situation in Germany is way worse than I thought. How people are still pushing for more renewables instead of nuclear is beyond me.
I'd pay 7€ where I live.
>>
shilling for monstral at 5bpw. it beats largestral, all the finetunes, and miqu. i don't know why people are saying it has issues following prompts; it seems to stay on track up to 32k.
>>
File: 1424215745865.png (31 KB, 716x302)
>>103064757
just having my cpu on idle costs me 6€/month
>>
>>103064763
There are some but it's not amazing nor ground breaking.
Maybe it will change with the new sd3.5l and sd3.5m. I hope it does, pony was fine but I want natural language prompting to be a thing, like dalle did.
>>
>>103064819
Use this:
https://civitai.com/models/833294?modelVersionId=998979

And new pony is apparently training soon and will be natural language captions as well.
>>
i'm salivating over 1-bit QTIP 405B
>>
>>103064842
>The creator of this asset requires you to be logged in to download it
Why do we tolerate this kind of racism on the internet?
>>
>>103064842
>https://civitai.com/models/833294?modelVersionId=998979
There are so many sdxl finetunes lol. I want that for newer bigger models tbdesu.

>And new pony is apparently training soon and will be natural language captions as well.
Won't be based on sd3.5 nor flux afaik, so who knows what the result will be.
>>
>>103064881
This one is legit next level though. It just came out, and it's better than pony and novelai v3. Just try complicated stuff with it that pony couldn't do / needed a lora for.
>>
did anyone manage to snag any turin engineering samples when they hit ebay briefly?
>>
>>103064727
My electricity costs between 45 and 65 cents per kWh. My homelab averages 100W and costs me almost $30/month, and it doesn't even have a GPU in it.
I'm in a major city in the US btw.
>>
>>103064894
>better than novelai v3
OK, that piques my interest. I never saw any sdxl model able to even touch it.
>>
>>103064875
https://temp-mail.org/
>>
>>103064900
wtf I thought the US had the cheapest energy prices
is it California maybe?
>>
>>103064894
what are the recommended cfg/steps/sampler?
>>
Just the qtip
>>
>>103063919
There is no hobbyist niche ... a couple thousand people simply don't matter.
>>
>>103064698
>>103064696
>>103064687
we live in an age of hyper optimization
>>
>>103064916
New beta sampler in comfyui is nice. Not sure what it might be called in forge / reforge. Usual 4-7 cfg. Most samplers are 30-50 steps for decent results.
>>
>>103064842
why is civitai completely killing my chromium
>>
File: 1710004668627709.png (211 KB, 749x650)
>>103063001
Indeed. It is worth paying such homage to.
[pretenditworkspoiler]Even though I never watched the film it came from.[/spoiler]
>>
Is the MGS2 colonel voice available somewhere? I cannot find the model online.
>>
Is there no easy diffusion equivalent for TTS?

I saw that XTTS is pretty good for voice cloning, but I can't exactly figure out how to install it.
>>
>>103065158
Here's 2 options.
https://github.com/effusiveperiscope/GPT-SoVITS

https://github.com/fishaudio/fish-speech/blob/main/Start_Agent.md
>>
>>103065169
There's also https://huggingface.co/SWivid/F5-TTS
>>
File: 4bc8f6c0-8262.jpg (6 KB, 241x209)
>>103065169
Thank you anon. God bless
>>
>>103064915
Yep California, gas is expensive as fuck here too.
>>
So is QTIP really better than IQ quants?
>>
>>103065206
then yeah makes sense
Cali if the most european like in its tax structure
>>
>>103065213
maybe
>>
>>103064959
>a couple thousand people
There's 8 billion people on the planet. 4 billion if you exclude people living in abject poverty. a 1% niche hobby would have 4 million participants.
Let's say 1% of people are computer enthusiasts and AI is a 10% niche within the computer enthusiast space. You're looking at hundreds of thousands of people. That's why P40s got memed up to half the price of a 3090 because there's a lot of people trying to get their hands on them. Thankfully most of them are too poor to go for 3090s otherwise that market would get wrecked too.
>>
>>103065276
oops I fucked up that math.
1% niche would be 40 million participants
So millions of people potentially competing for e-waste computer parts right now.
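Redoing the estimate with the corrected base (numbers straight from the posts above; the percentages are the same guesses):

```python
# 4 billion people after excluding abject poverty, per the post above.
population = 4_000_000_000

# Direct 1% niche, as corrected: 40 million, not 4 million.
direct_niche = population // 100

# Two-stage guess: 1% computer enthusiasts, 10% of those into AI.
two_stage_niche = population // 100 // 10
print(direct_niche, two_stage_niche)
```

Even the conservative two-stage version still lands at millions of potential buyers, which is plenty to move the used-GPU market.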
>>
>>103065283
I also heard rumors of Chinese companies hoarding 3090s (and 4090s but there are less of these used).
>>
File: 1730138368937693.png (1.37 MB, 1024x1024)
>>103065276
The availability of affordable GPUs may lead to increased competition with cloud-based AI services, potentially reducing the demand for H100s.
>>
File: pepe-laugh.jpg (6 KB, 231x218)
>>103058013
SOVL
>>
>>103064975
>New beta sampler in comfyui is nice
what's this called? I actually broke down and took the noodle pill to get a good flux workflow going.
>>
>>103065380
beta
>>
>>103065118
what movie?
>>
>>103065425
Three Days by Sharunas Bartas
>>
>>103065155
>MGS voice
just use any TTS with a short audio clip. Here's a super low-effort one made with one random 5-second MGS wav from a soundboard and the pretrained model gpt-sovits ships with.
https://vocaroo.com/1nK4tXqqlCRO
>>
File: IronMiku.png (1.71 MB, 832x1216)
>>103065065
because your computer is WEAK
(and because their webdev is shit and disrespectful of their users)
>>
>>103064555
I wanted to try a 70B at 2-bit a long time ago, but of course it required some dependency I couldn't install. I suspect this is the same.
>>
>>103065633
https://github.com/Cornell-RelaxML/qtip
>>
Nemotron vs largestral?
Also, how is the now taken down wizardlm holding up?
>>
>>103065667
Go back to /r/localllama, shill.
>>
>>103065653
Cool story bro, but just search HF for QuIP quants. You can see nobody used this shit, probably because of what I said.
>>
>>103065716
sir you forgot to take your pills
>>
>>103065667
Get back to /r/eddit
>>
>>103065718
https://huggingface.co/collections/relaxml/qtip-quantized-models-66fa253ad3186746f4b62803
>>
>>103061671
>>103061671
>>103061671
New thread
>>
>>103065667
I personally hate nemotron because it likes to format everything as a list. Largestral doesn't have that problem. Wizard is outdated and dumb.
>>
>>103065948
It does not do that if you tell it to RP. Or if you use one of the finetunes. Or you can embrace it and tell it to plan its response first.
>>
>>103065891
oh right. newfags don't know that the baker is a mentally ill schizo.
>>
>>103065494
Thanks, looks great. Didn't test yet
>>
https://www.pcworld.com/article/2504035/security-flaws-found-in-all-nvidia-geforce-gpus-update-drivers-asap.html
Update your drivers. Don't mind the -15% in t/s throughput.
>>
any idea where i can get a local model that does something similar to chatpdf? it essentially takes a pdf and produces a bunch of questions, summaries, and exercises based on the pdf, which tends to be an ebook.

are there any models on hf that can do something like that or better?
>>
>>103066291
Go to reddit newfriend.
>>
>>103066291
SillyTavern has a feature called Data Bank.
It's a little rough, but it can be used for that, although you'd be better off turning the PDF into cleaned raw text files.
As for the models? Anything you can run with a large context. Stuff like nemo, for example.
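Under the hood that kind of setup is just retrieval + prompting. A toy sketch of the idea (the word-overlap scoring here is a stand-in; real tools use embedding similarity, and the doc/query strings are made up):

```python
# Toy RAG: chunk the cleaned text, pick the chunk that best matches
# the question, and prepend it to the LLM prompt.

def chunk(text, size=200):
    """Split text into chunks of `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(chunks, query):
    """Return the chunk sharing the most words with the query."""
    q = set(query.lower().split())
    return max(chunks, key=lambda c: len(q & set(c.lower().split())))

doc = ("Chapter 1 covers sorting algorithms. "
       "Chapter 2 covers graph traversal and shortest paths.")
best = retrieve(chunk(doc, size=6), "which chapter covers graph traversal")
prompt = f"Context:\n{best}\n\nWrite three exercise questions about the context."
```

You then send `prompt` to whatever model you're running; the retrieval step is what keeps a whole ebook from blowing past the context window.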
>>
>>103066306
>>103066330
ah so it's a matter of front-end functionality, okay thank you. I've been using deepseek v2.5 as a programming tutor so I assume that should suffice, right?
>>
>>103066291
Holy retards. Imagine a normalfag hitting you with "Eh bro I just saw the last tesla video, is there a local robot that can do my dishes like that for free? Robotics? Nah I don't want to learn all that stuff, just give me the end product thanks."
>>
>>103066354
If you are running it with a large enough context window, sure.
Might want to look for a dedicated RAG solution. I remember trying the jan.ai one and it was okay, I guess.
>>
>>103066358
It's not his fault local bros are not developing tools for their LLMs.
>>
What models are best for erotic roleplay? I can't find a half decent one.

LLaMa 3.x:
- Can follow structure of roleplay
- Can remain sensical and coherent
- Can NOT stray from "safe" programming

SmolLM2 1.7b:
- Can follow structure of roleplay
- Can NOT remain sensical and coherent
- Can stray from "safe" programming

Unholy v2 13b:
- Can NOT follow structure of roleplay
- Can remain sensical and coherent
- Can stray from "safe" programming
>>
>>103066402
You need a 70B model
>>
>>103066402
Mistral Small is the best small model.
>>
>>103066402
Try Behemoth-123B-v1.1
>>
>>103066402
>- Can NOT stray from "safe" programming
Prompt issue.
>>
>>103066358
got it, thanks
>>103066375
thank you, i'll give this a try. i tried silly tavern a while ago and was completely filtered (especially the API key part). this site seems to dumb it down for retards like myself.
>>
>>103066415
I can't handle 1 tok/s
>>103066435
no it's not, faggot. there's no magic prompt that unfucks llama 3.x
>>
File: 1714976103618856.png (270 KB, 1717x1517)
>>103066487
Explain this image.
>>
>>103066487
>I can't handle 1 tok/s
Try 30Bs. Not as good but the next best option
>>
>>103066494
looks pretty cringe, desu
>>
>>103066170
>privilege escalation
Nothingburger unless you rent out your hardware.
>>
Why does it do this
Fucking hella good translation, it just fucking forgets to keep translating
>>
>>103066620
(Translator's note: keikaku means plan)
>>
File: 1711668129691197.png (1.21 MB, 2160x2160)
>>103066795
>>103066795
>>103066795
New thread
>>
>>103066620
Let me guess, that's Qwen
>>
>>103066871
yes, works great until it decides not to
>>
>>103066620
Use something that lets you see the token probabilities; it might have been a low-probability token, and you can adjust min_p/top_k for that.
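For reference, min_p as usually defined keeps only tokens whose probability is at least min_p times the top token's probability. A toy sketch (not any backend's actual code; the distribution below is made up):

```python
# Toy min_p filter: prune tokens far below the most likely one, which
# is how a stray low-probability derail token gets removed.

def min_p_filter(probs, min_p=0.05):
    """Keep tokens with p >= min_p * max(p)."""
    threshold = min_p * max(probs.values())
    return {tok: p for tok, p in probs.items() if p >= threshold}

# Hypothetical next-token distribution mid-translation.
probs = {"translation": 0.62, "the": 0.30, "keikaku": 0.005}
kept = min_p_filter(probs, min_p=0.05)  # drops "keikaku"
```

top_k is the blunter version of the same idea: sort by probability and keep only the k highest before sampling.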



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.