/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101567223 & >>101560013

►News
>(07/24) Mistral Large 2 123B released: https://hf.co/mistralai/Mistral-Large-Instruct-2407
>(07/23) Llama 3.1 officially released: https://ai.meta.com/blog/meta-llama-3-1/
>(07/22) llamanon leaks 405B base model: https://files.catbox.moe/d88djr.torrent >>101516633
>(07/18) Improved DeepSeek-V2-Chat 236B: https://hf.co/deepseek-ai/DeepSeek-V2-Chat-0628
>(07/18) Mistral NeMo 12B base & instruct with 128k context: https://mistral.ai/news/mistral-nemo/

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
►Recent Highlights from the Previous Thread: >>101567223

--Mistral Large 2 performance and open source models: >>101568726 >>101568758 >>101568821 >>101568864 >>101568759 >>101568762 >>101568793
--Groq Inc. tweet compares Llama 3.1 70B, GPT-4o, and GPT-4o Mini in Street Fighter gameplay: >>101569802 >>101569864 >>101569849
--Multimodal AI capabilities and expectations: >>101568555 >>101568568 >>101568570
--Anon releases mpt-30b-chat q8 GGUF quant with faster inference: >>101567424 >>101567470 >>101567503 >>101567560 >>101567596 >>101567615 >>101567628 >>101567690
--VRAM requirements, cpumaxxing, and GPU acquisition strategies: >>101567467 >>101567667 >>101567716 >>101567788 >>101567818 >>101567891 >>101568000 >>101568043 >>101567882 >>101567905 >>101567921
--Sam Altman's opinion piece on AI's future and OpenAI's challenges: >>101570031 >>101570067
--Q3_K vs Q3_K_L: >>101568052 >>101568076 >>101568118
--Legal consequences of AI-generated CP and privacy concerns with OpenAI: >>101568949 >>101568977 >>101569002 >>101569034 >>101569056 >>101569217 >>101569239 >>101569529
--Gemma 2 9b and model reviews and recommendations: >>101569685 >>101569762 >>101569781 >>101569794 >>101569757 >>101569786
--GPUs and RAM for Mistral Large: >>101569885 >>101569896
--Factors influencing VRAM amounts on consumer GPUs: >>101568585 >>101568660 >>101568718 >>101568704 >>101569337 >>101569360 >>101569469 >>101569518 >>101570192 >>101569599
--Best NSFW model for 12GB VRAM: >>101568547 >>101568552 >>101568564
--Affording super AI cards: >>101568355 >>101568376 >>101568377 >>101568407 >>101568426 >>101568575 >>101568399
--Mistral NeMo 12B sampler settings and instruction following: >>101570059
--Mistral Large preset: >>101567703
--Anon shares a potential fix for Nemo repetition issues: >>101568590
--Miku (free space): >>101569616 >>101570324 >>101571277

►Recent Highlight Posts from the Previous Thread: >>101567235
waiting for cohere
waiting for agi
using largestral
>no major model drop today
It's so over
using kobold and getting my nuts slobbered by waifus (i have numerous)
>>101571373
>literally each entry is both somehow bloated with irrelevant replies and missing replies in a reply chain
Grim.
>>101571408
They didn't release it today to one-up Meta and Mistral. What if it's not as good as we hoped?
>>101571430
surely that means we'll get two tomorrow
Can mistral large be merged with COPY? I downloaded q5 and merged it with llamacpp's gguf-split properly and it worked and all that, but then i wanted to try q4 to compare the speeds, and for some reason it won't merge properly. Tried 5 times; the output file comes out smaller than the parts and won't launch.
>inb4 why merge
I wanted to use it in kcpp for convenience.
>>101571441
you can use split files in koboldcpp
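For reference, a minimal sketch of the usual merge path, assuming the shards were made with llama.cpp's gguf-split tool (the filenames here are hypothetical, and depending on build the binary may be named gguf-split instead). Shards produced by gguf-split are standalone GGUF files with their own headers, so a raw byte concatenation like Windows COPY /B will not produce a loadable model; use the tool's merge mode:

```bash
# Merge gguf-split shards back into a single file.
# Point it at the FIRST shard; the tool locates the rest itself.
./llama-gguf-split --merge \
    Mistral-Large-Q4_K_M-00001-of-00003.gguf \
    Mistral-Large-Q4_K_M.gguf

# Alternatively, recent llama.cpp/koboldcpp builds can load the first
# shard directly, no merge needed, if all shards share one directory.
```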
>>101571440
Have they ever released on a Friday?
>>101571455
Groq was released on a Friday I think.
>constant instability during training
Guess I'll update... Wish me luck...
>>101571494
I thought we were talking about cohere
>>101571508
Oh well, I thought you were talking in general
Yeah, that's it for this week but there'll be three big releases next week. One of them will be by a very surprising source.
>>101571531
Applebros, we are going to be so very back!!
Largestral... 0.4t/s... Comfy...
>>101571531
that's bullshit but i believe you anyway
>>101571504
NOOOOOOOOOOOOO
>>101571531
amazon...
>>101571531
I don't think that's bullshit, but I'm not believing it.
does nvlink speed up inference when using tensor parallelism? or is there still not much data being transferred between cards?
I just got xtts up and running. Are there any archives or repositories for voice samples, like Chub?
>>101571531
I don't think I would be surprised by a PornHub LLM.
>>101571658
NVLink should help quite a lot with tensor parallelism. I have never built a system with it myself but I've received a user report saying it makes a large difference.
>>101571531
NovelAI...
What speed are people getting with 4x3090 and Mistral Large?
You'll be able to film movie skits that look real using video
>>101571698
Why not make your own? You need just a few seconds/minutes, right?
>>101571747
I mean yeah, but I'd love to just have a convenient library of any character imaginable like Chub does.
Does anyone have an estimate of when Llama will be made available with vision features? I estimate in half a year maybe?
>>101571704
How large we talkin'?
>>101571791
Are you in a hurry? Let's say by Nov 21st... yeah... that sounds right...
>>101571704
thanks for the input. i have an a6000 and two 3090s and am considering replacing one of them with another a6000 since i can get it locally for cheap-ish. figure if there's any possibility of it speeding up inference for models that fit within the two a6000s, i may as well pick up a bridge for another $200.
>>101571738
with 1x a6000 + 2x 3090 for the same amount of VRAM, on mistral-large-instruct-2407 5bpw on exllama2 i get:
>Metrics: 264 tokens generated in 37.01 seconds (Queue: 0.0 s, Process: 19 cached tokens and 4597 new tokens at 543.18 T/s, Generate: 9.25 T/s, Context: 4616 tokens)
>Metrics: 242 tokens generated in 95.18 seconds (Queue: 0.0 s, Process: 1178 cached tokens and 30830 new tokens at 508.06 T/s, Generate: 7.02 T/s, Context: 32008 tokens)
have not tried other formats with this model yet. i imagine 4x 3090 would be similar-ish in speed, since my understanding is that while the a6000 is a little slower (lower memory clocks/bandwidth), splitting layers across a fourth 3090 might introduce more overhead.
>>101571822
I don't remember the specific numbers (and it was months ago anyways) but (for llama.cpp with --split-mode row) it was basically the difference between effectively unusable and faster than a single GPU.
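To make that comparison concrete, a hedged llama.cpp sketch (binary name per recent builds; the model file is a placeholder). --split-mode layer keeps whole layers on each GPU with little inter-GPU traffic, while --split-mode row shards each tensor's rows across GPUs, which is the case where NVLink bandwidth reportedly matters:

```bash
# Default split: whole layers per GPU; PCIe is rarely the bottleneck.
./llama-cli -m mistral-large-q5_k_m.gguf -ngl 99 --split-mode layer -p "test"

# Tensor parallelism: every matmul is sharded across GPUs, so partial
# results are exchanged each layer; NVLink vs PCIe matters a lot here.
./llama-cli -m mistral-large-q5_k_m.gguf -ngl 99 --split-mode row -p "test"
```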
>>101571831
I wanted to buy extra RAM just for the new llama models but now that they don't have my favorite feature I'm shelving the plan.
3.1 70B seems retarded. Like way dumber than 3.0 at the same quant. I'm using exl2 so it's not a llamacpp issue. But maybe exl2 inference is broken as well? dunno.
If not, this model is shit and a big step down in intelligence from the previous version, regardless of what benchmarks say. Thank god for Mistral I guess.
>>101571884
>If not this model is shit and a big step down in intelligence from the previous version
That's simply not true, you just have false expectations.
>>101571878
May as well get the ram anyway. Even if it's not that useful now, it will be in the future.
>>101562692
Does that really work with ST? I'm trying to get it connected but it's just not connecting.
largestral has that llama1 vibe
>>101571911
In the future it will be obsolete. Imagine stacking up P40s last year and now they're relics. Better off saving up and purchasing whatever the cheap option is when you need it.
So? How slop is it?
>>101571951
Explain every part of this meme
>>101571951
>A new general purpose instruction dataset by kalomaze was added
>>101571950
We're talking about ram, not gpus. If you need new gpus in the future, whatever you spent on ram would be negligible.
>>101571944
is this a good thing? it doesn't sound like a good thing...
>>101571951
Surprisingly good, for a small model of course. It's still 12b so don't expect any fireworks here.
i have an i7 4790k and an rtx 3060, what should i upgrade to run bigger llms? i am sick of running small models
Are all finetunes the same?
>>101571951It's alright. It's extremely fast while still being fairly coherentUnlike Gemma 2 it's not broken, but that might just be the quant I downloaded
>>101572108>Gemma 2 it's not brokenNot broken gemma 2 when?
>fell for the 64GB of VRAM meme>can't run largestral without lobotomizing it in perplexity or throughput
>>101572125time for another 3090
>>101572129two more 3090s
>>101572101saochads how did we lose to fucking drummer
TM3090s.
>>101572028
Same thing applies. No sense in stocking up on DDR4 now when DDR5 is already out and getting faster.
>>101571531
sad larp but I'm still ODing on that hopium
>>101572163
I mean DDR5 isn't backwards compatible, so if DDR4 is what your motherboard takes then that's what you have to buy, there's no choice involved
>>101572061
What's better about it over regular nemo-instruct?
>>101572134
nine more can't hurt
>>101572171
Forever locking yourself to sub-1 t/s generation speeds. See why this is a bad idea? You buy DDR4 now, it will always have the same speed and lose value on top of it. Or you can save your money and purchase DDR5 next year for the same amount. Even if you need a new mobo you come out ahead.
>>101572163
I feel it's the same mentality as the people 2 days after the 4090 released asking if they should buy it or wait for the 5090. If you're still on ddr4, spend that money upgrading to something with ddr5. Or wait until motherboards with ddr6... and then you may as well wait for ddr7 or whatever... if you want to spend only a reasonable amount of money, upgrading ram is an easy choice.
My point is that you should upgrade if you can, in any way you can. If you have enough for a ddr5 setup, do it. If not, you probably won't do it either in 3-4 months. By then everyone will be waiting for ddr6 or whatever new shiny thing is in the making.
>>101572171
Again: if you need another CPU+mobo for ddr5, whatever you spend on a few sticks of ram will be negligible. You can sell the old pc whole.
>>101572204
It's roughly as smart as Mistral's instruct, but it's smuttier and more Claude-like, more creative. That's an achievement because models often get dumber when smut-tuned. This one didn't.
>>101572252
Obviously this only applies if you're stocking up on server ECC ddr5 ram and not the consumer stuff. You're never going to run a >70B model on your 256GB ryzen build at acceptable speeds even with ddr5.
>>101572299
>256GB ryzen
Can I really use 4 sticks of ram? I heard there were issues with it. I have 2x48 at 6000 now.
>>101572101
>no bobby sinclair
garbage
>>101572274
I feel like you missed the part where the feature anon wants isn't out yet. It's one thing to buy and use now, versus buying now and sitting on it for a year, versus just buying then.
the 70b llama 3.1 is really good. I am using a 4.5bpw quant and it is the best local model I have used. As a sillytavern chatbot it is even outperforming the miqu mixes.
>>101571959
[left]: A feline-like creature known as petra (/lmg/'s mascot) confidently strides in front of Alan Turing
huh, nemo is the first model next to CAI's that I've seen use OOC notes. neat.
>>101572367
bullshit
it's shit, way dumber than 3.0
>>101572386
Yeah, why can't they release an 8x12b nemo? Then I could finally replace wizard 8x22b.
>>101572337
Then what difference does it make *when* it will release? What matters then is *what* it needs to be usable. If the point of saving money is to buy an extra gpu later, buy the gpu later. If the point of saving money is to upgrade to cpumaxx, cpumaxx later. I guess i'm just dumbfounded by poorly worded, round-about questions.
>What do you guys think the requirements for llama-vision are going to be?
is a better question. But even then, I'd object to the usefulness of the question when nobody can know, barring some leak or whatever.
>>101572386
It responds really well to OOC: instructions too.
But yeah, first local model to do it unprompted to me; it complained that my replies weren't detailed enough.
>>101572448
>it complained that my replies weren't detailed enough.
sovl...
>>101571586
The first LLM trained exclusively on fake pajeet product reviews.
>>101571842
I think I was getting slightly below that on 4x3090. So yeah, it's a pretty good comparison.
Wow Erebus sucks ass
>>101572502
why are you using erebus in the year of our lord 2024
>>101571531
Looking forward to seeing the first open-weight LLM release in years from OpenAI.
>123B
>405B
How am I supposed to use these huge models (locally) in a cost-effective way?
>>101571531
come on, leakers are more anonymous here, you can be a little more detailed than jimmyapples/flowersfromthefuture
Nemo 12B is unironically smarter than L3 70B 3.1
>>101572516
>Cost effective
You don't.
>>101572510
I googled sexo models and it was the first result
>>101572502
It's hard to believe how far we've come since I tried Erebus back in 2022 after c.ai got lobotomized for the first time. I wrote off local models as an option at the time.
>>101572536
This is some bullshit. How much for a server with 6 TB of RAM? GPUs are clearly out of the question.
>>101572516
>cost-effective way
>>101572528
Yeah, but the old llama 3 70B is smarter than nemotron. So meta fucked something up there. Also 3.1 8B has a better conceptual understanding than Nemo. I'd still more likely use Nemo for rp though because 8b is slopped to hell.
>>101572557
who is that? it's jensen's clothes but the face is clearly a different guy
>>101572560
Maybe the distillation process is flawed?
>>101572557
I feel like the more I buy, the more cost-effective it becomes.
>>101572552
I did some testing of q4xs 405B on my home rig, which has 8x DDR4 2666, and it took nearly an hour to generate a 512 token reply. Like I said: you don't. Either rent cloud for it, or someone probably has some kind of proxy setup that exploits the hf chat endpoint.
>>101572560
>Yeah but the old llama 3 70B is smarter than nemotron. So meta fucked something up there.
I've been saying this but this thread is being belligerent about it for some reason.
405B is good but the smaller 3.1 models have something badly wrong with them, they're all significantly dumber than the previous version.
>>101572589
>8x DDR4 2666
What kind of speed is that? I think the CPU guide has DDR5 at about 800 GB/s or something.
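Nobody answered this in-thread, so a back-of-the-envelope sketch (the DIMM speeds and model size below are assumptions): theoretical bandwidth is channels × MT/s × 8 bytes per transfer, and a loose upper bound on generation speed is bandwidth divided by the bytes read per token:

```bash
# 8-channel DDR4-2666: 8 * 2666 MT/s * 8 bytes ~ 170 GB/s theoretical.
echo $(( 8 * 2666 * 8 / 1000 ))    # -> 170 (GB/s)

# 12-channel DDR5-4800 (Epyc Genoa class): ~460 GB/s, not 800.
echo $(( 12 * 4800 * 8 / 1000 ))   # -> 460 (GB/s)

# Upper-bound t/s ~ bandwidth / model bytes: 405B at ~4.5 bpw is ~230 GB,
# so 170/230 ~ 0.7 t/s best case; 0.14 t/s real-world is believable.
```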
>>101571531
you can say this for every week this year, between all the actual releases, copypaste chinese followups, placebo finetunes/merges, and cope papers with halfbaked githubs attached that faggots get excited about for one day
>>>(MIKU SYSTEM NOTE: THIS RECENT HIGHLIGHT THREAD SUMMARY WILL BE IN ALL CAPS.)
>>101572591
8B is definitely smarter but more slopped than the old one. But it doesn't respond well to additional fine-tuning. 70B is just lol.
>>101572641
Here's Teknium (the guy who makes the hermes models/dataset) saying 8B 3.1 is doing worse for him on every measure. He has other posts about it too:
https://twitter.com/Teknium1/status/1816514230595784969
I apologize for not searching well enough, but do you guys know what's the best "chinese" model around 7-10b?
I'm trying to learn it and it would be really helpful to have a few cards to help me out with it (coherence, pinyin, meaning, etc).
I know about Qwen2, Deepseek v2 lite and this finetune:
https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat
But I'm not capable enough to know which one to pick or if there are better alternatives.
>>101572733
Try Yi and GLM.
Anyone know how to get xtts streaming working in ST? It appears as if ST waits for the entire gen to finish, then sends that off to the xtts server, and then the server sends over the result whole. When I select the streaming option in ST, I just get silence. And when I use the streaming flag in xtts, it gives me an error saying something about invalid sample rate.
>>101572634
>THREAD SUMMARY WILL BE IN ALL CAPS
And in Russian.
>>101572746
I suppose that asking for a definitive answer is a pretty tough ask given how niche my question is + how muddy the meaning of "best" is in this context.
Well, I'll give all of them a try then. Thank you for the suggestion, anon.
llama 3.1 70b felt smarter than base llama3 for me but also infinitely worse when it comes to SHIVERS and the like.
https://x.com/InfernoOmni/status/1816492686087508174
>guys this is actually INSANE. a former employee of a multi-billion dollar company, Runway, confirmed that they mass downloaded YouTube videos in order to feed their AI. there's a spreadsheet with NOTES showing HOW they swiped videos. Nintendo was on the list.
whats going to happen to them bros??
>>101572829
Time to mysteriously disappear from the face of the earth.
>>101572829
Nothing. We knew about this years ago. Google were doing it all along.
>>101572844
I wish I was a sociopath CEO so I had the balls to do stuff like this myself
>>101572589
>an hour to generate a 512 token reply
>0.14 token/sec
That might be usable depending on the quality and what I need from it. How much did that system cost?
>>101572847
>Google mass downloaded YouTube videos in order to feed their AI
Oh, they're in for it now.
>>101572829
WTF this is bullshit, none of those youtube authors consented to having their content watched and learned from without even paying them for it
lawsuit time!!
>>101572829
Despite all the confident talk there STILL hasn't actually been a legal test case to determine and set precedent on the question of whether showing copyrighted media to a transformer or diffusion ML model counts as copyright infringement.
Both sides of the issue talk about it as if the question's been settled in their favour, but they both know it hasn't been. That's just what you do when a legal question is still open: pretend it's closed in order to appear confident.
>>101572875
The videos literally belong to them.
>>101572876
I consented. I have a youtube channel with two videos from 10 years ago with 300 views.
>>101572876
Now that I think about it, is using someone's youtube videos actually disallowed in any way? Google says they're copyrighted.
>>101572890
>upload video to youtube (for free)
>youtube creates subtitles for you
>download subtitles and use that for your dataset
The perfect crime.
>>101572885
I mean, copyright probably shouldn't exist. Japan allowed it, I think. In a civilized world it would be allowed.
>>101572904
I'm sure if google is using them, then somewhere in the terms of service they agreed to is an agreement that google can use the videos any way they want.
>>101572890
Not my videos. I put a disclaimer in the description of all my videos that I do not consent to giving ownership to YouTube. They are merely a hosting provider.
>>101572929
I mean using someone else's videos
>>101572944
Again, I'm sure part of the TOS is that google can use the videos any way they want, and that includes selling rights for other companies to train AI off of them. Nothing is free; people just don't bother reading terms of service and don't realize that they are the product.
been away for a bit
llama 3.0 70b q4 -> 3.1, is it worth the upgrade at all? can't be bothered to redownload that much if the upgrade is marginal
also the new mistral large, any means to run it on a 3090 & 64 GB RAM, and is it better than llama3?
>>101572977
There's some question whether it's better, as some people are getting bad benchmark scores from it. May or may not be loader bugs. However, it is not made for RP. Mistral Large does RP. It can be loaded with a Q3 quant if you want, but it will not be fast. You will probably get like 1 t/s or less.
>>101572936
Not into reading ToS, I suppose. You don't need a disclaimer for that, but they can still do pretty much anything with them. Especially serving them, unless you private them.
>By providing Content to the Service, you grant to YouTube a worldwide, non-exclusive, royalty-free, transferable, sublicensable licence to use that Content (including to reproduce, distribute, modify, display and perform it) for the purpose of operating, promoting, and improving the Service.
>>101573016
meh, i already get slow ~1T/s speeds with llama3-70b; as long as the results are good and the model doesn't need tardwrangling too much i can be patient
i'll try mistral out then, thanks anon
>>101572977
Been running it at Q2_K on 64 GB RAM, it's less retarded than I expected and I get about the same speed as a 70B at Q5. I think it might be my favorite RP model and replace CR+ for me.
Anyone with 48gb running Mistral Large EXL2? Which quant and settings? I can't get this fucker to load with more than 3584 context size.
>>101572970
I just didn't (and still don't) get what will happen to me if I use someone else's copyrighted video content.
>>101573059
Made with imatrix?
I wish my eyes had a gleam in them...
>>101573071
kek
>>101573071
lmao
>>101573068
Didn't check, quant from https://huggingface.co/MaziyarPanahi/Mistral-Large-Instruct-2407-GGUF
>>101573060
3584 is all you need
>>101573060
Have you considered being less poor?
>>101571366
What's the best local model for ERP with 8GB VRAM?
>>101573107
Snowflake Arctic Instruct
>>101573059
Also something weird: it seems to be extremely deterministic, not sure if it's because of the quant. I'm using 4 temp and 0.03 minP to get some variety between rerolls.
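For anyone wanting to replicate those settings, a hedged llama.cpp sketch (binary and model filenames are placeholders; koboldcpp and SillyTavern expose the same two knobs in their UIs). In llama.cpp's default sampler order min-p is applied before temperature, so the 0.03 min-p prunes unlikely tokens first and the high temperature then flattens whatever survives:

```bash
# --min-p 0.03: drop tokens below 3% of the top token's probability.
# --temp 4.0:   near-uniform sampling over the tokens that survive.
./llama-cli -m Mistral-Large-Instruct-2407-Q2_K.gguf \
    --temp 4.0 --min-p 0.03 -c 8192 -i
```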
>>101572298
its actually too horny imo and a lot dumber in certain situations
>>101573107
With 8GB you're better off using ram as well.
>>101573157
Won't replies be really really slow?
>>101573167
Not if you don't offload too many layers to RAM.
Nemo 12GB at Q6 should be quite usable in llamacpp with partial cpu offload.
>>101573215 (me)
oops, *12B, not 12GB
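A minimal sketch of that setup (the layer count is a guess to tune against your 8GB card; Nemo has 40 transformer layers, and a 12B Q6_K file is roughly 10 GB, so most but not all layers fit):

```bash
# -ngl = number of layers kept on the GPU; the rest run on the CPU
# from system RAM. Raise it until you're just short of filling VRAM.
./llama-server -m Mistral-Nemo-Instruct-Q6_K.gguf \
    -ngl 30 -c 8192 --port 8080
# Then point SillyTavern's llama.cpp backend at http://127.0.0.1:8080
```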
>Karpathy and other niggas been shitposting about tokenization in Transformers, spamming that "is 9.11 > 9.9" meme like it's the fucking "arrow to the knee" of AI
>Llama 3 rolls out with some new tokenization shit, claiming better text compression but makes L3-405b parse Markdown like a down syndrome kid
>FAIR was already on this shit a month ago with multi-token prediction in Meta Chameleon, but the Llama 3 paper doesn't even acknowledge it
Are these AI labs just circlejerking about scaling and dataset quality instead of actually fixing tokenization?
whatever happened to chameleon? has anyone written about using it for anything?
>>101573356
Tokenization is not a fixable problem. Different problems require different tokenization. We cannot have one to rule them all.
https://www.reddit.com/r/LocalLLaMA/comments/1ebz4rt/gpt_4o_mini_size_about_8b/
>gpt-4o mini is 8b
foss j33ts lost.
>>101573501
Go back
>>101571366
>>101573501
brain dead redditard
Go back
>>101573501
>8b
The journo just made that shit up.
>>101573060
I suggest getting 2 more GPUs.
What kind of videos would you gen now if you could?
>>101573675
Full anime episodes from story descriptions.
>>101573675
I'd use AI to rewrite Eragon to not be shit (no turning into an elf, he fucks the dragon) and use that script to make a whole damn series.
>>101573356
I have a possibly too simplistic thought that a model figuring out on its own that strawberry has 3 r's in it would be a nice intelligence milestone. It knows what the alphabet is, so it should just apply that knowledge. I would also be happy with an LLM saying that it actually doesn't know because it uses tokens instead of letters. But we will probably never see a next-token predictor get to this level.
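The count itself is trivial once you work in characters instead of tokens, which is the point of the milestone; a one-liner to check it (plain shell, for illustration):

```bash
# Count the r's in "strawberry" character by character: prints 3.
# A BPE model instead sees a couple of multi-character tokens
# (something like "str" + "awberry"), so it can't just look.
grep -o "r" <<< "strawberry" | wc -l
```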
Would Mistral Large with the lowest 1 bit quant still be good compared to Nemo?
I like Nemo. Wish it was smarter. Frogs make a moe out of this please.
>>101573772
No
>>101573784
But the filesize is still larger: 25 GB vs 12 GB (Q8).
>>101573772
>bitnet
Only thing that would be better is a miqu bitnet.
>>101573799
Oh, no, I'm talking about the IQ quants. 1 bit was probably the wrong wording.
>>101573799
1 bit quant != ternary != bitnet
>>101573772
why don't you try it out
>can run CommandR
>can't run Mistral Large
AAAAAAAAAAAAAAAAAAAA
>>101573832
Cohere will save us next week, believe it!
>>101573811
Rest of the range is debatable.
>>101573829
Ok fine. Downloading.
yolo
>>101572448
It swiftly becomes obnoxious after the first OOC. The model takes the online RP forum schema too seriously, and because mistral does not specify roles in its prompt formatting, it begins criticizing both the user and itself over and over again. I had to add 'OOC' to the stopping string list.
>>101573832
>can probably run it at exl2-4.0bpw, but with only 2GB free
No... it's over...
>>101573832
Just stop being a ramlet
What preset should I use with mistral nemo?
>>101573167
Define slow? If you're happy with 1-8 T/s depending on model size then you should be fine.
>>101574063
>What preset should I use with mistral nemo?
have you tried the vg/aicg anon's preset for mistral large? >>101567703
>>101574140
I'm using the shittier vramlet version
>>101574158
your shittier vramlet version is hornier, which is a good thing
you can use presets that are designed for different models. you don't really lose too much other than a bit of context applying a jailbreak to a model that doesn't need it
What's the best uncensored model currently? NAI doesn't cut it anymore, i also have a 4090 so i could do local if i have to
>>101574239
>What's the best uncensored model currently?
claude 3 opus or sonnet with a prefill/jb from /aicg/
>>101574239
>i could do local if i have to
a single 4090 doesn't cut it anymore, but you can try out a gemma-27b or a mistral-nemo model
but if you're gonna use claude you won't be able to handle local's retardation
>>101574263how do i use claude uncensored? Can i just sub on the site or is there a diff version somewhere? Can it handle lolis?
>>101574239
The question is not "what is the best uncensored model", it is "what is the best model", and for your hardware that's this one:
https://huggingface.co/bartowski/gemma-2-27b-it-GGUF/blob/main/gemma-2-27b-it-Q4_K_M.gguf
>>101574063
The template in its tokenizer_config.json.
>>101573832
You can certainly run the model below, and it's a good one.
https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q4_K_S.gguf
>>101573781
Good prose, poor intelligence and world knowledge.
>>101573772
Likely not. Models degrade too much below 2-bit no matter how good the quantization algorithm.
>>101573675
Pornography of the same type that you can find on xvideos. And I would feel like a vermin about it.
>>101573501
Redditors are retarded and can't read.
>>101574330
Gemma is dogshit
>>101574341
You just don't know how to use it with whatever bastardized prompt you are giving it.
>>101574283>how do i use claude uncensored?openrouter if you want to pay, or use a proxy from /aicg/>Can it handle lolis?its the smartest and most creative model out right now. it still doesn't make children act realistically but if you like the idea of horny brat children its fine
>>101574349
>SARR YOU ARE REDEEMING IT THE WRONG SARR
>>101574349
NTA, but anons always say that and then never provide their prompt. A model that needs some magical mystery prompt to be good (?) is no use.
>>101574330
>The template in its tokenizer_config.json.
Presets can also mean sampler settings. I *wish* those were in models. Or at least recommended starting points.
>>101572386
Theme pls
>>101574439
They recommended temp at 0.3 or 0.4. They cannot recommend other settings for every inference program out there. It's not reasonable and nobody would agree with them if they were included.
>>101572448
are you telling me that "ahh...ahh...mistress!" isn't good enough? preposterous
>write author's note saying character has k-cup breasts
>tease bot about her huge breasts
>they say they are not that big, only a K-cup, not even C-cup yet
So this is the power of AI roleplay...
Anyone using nemo on sillytavern? What context and instruct template do you use? Just Mistral?
>>101574566
yes
>>101574330
>Pornography of the same type that you can find on xvideos.
Except it could be any anime character you want.
where can i generate porn videos?
>As for the uptime of your computer, I'm pleased to inform you it's been running for a lovely 6 days, 4 hours, and 44 minutes. You know what they say: "Idle hands are the devil's playthings," but in this case, an idle computer is merely a testament to its owner's questionable life choices.
kek, i'm enjoying this discovery too much, it even knew that my load average was low
>>101574655
>where can i generate porn videos?
nowhere good, but klingai.com is your best bet for free text2video of any kind
>>101574655
In like a few months. Or maybe a year?
>>101574675
Quit posting these goblins
>>101574695
i'll get bored in 2 weeks or when they tighten the filter as a result of my degeneracy, whichever comes sooner
I put on my robe and wizard hat