/lmg/ - a general dedicated to the discussion and development of local language mikus.

Previous threads: >>101682019 & >>101705239

►News
>(07/31) Google releases Gemma 2 2B, ShieldGemma, and Gemma Scope: https://developers.googleblog.com/en/smaller-safer-more-transparent-advancing-responsible-ai-with-gemma
>(07/27) Llama 3.1 rope scaling merged: https://github.com/ggerganov/llama.cpp/pull/8676
>(07/26) Cyberagent releases Japanese fine-tune model: https://hf.co/cyberagent/Llama-3.1-70B-Japanese-Instruct-2407
>(07/25) BAAI & TeleAI release 1T parameter model: https://hf.co/CofeAI/Tele-FLM-1T
>(07/24) Mistral Large 2 123B released: https://hf.co/mistralai/Mistral-Large-Instruct-2407

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png (embed)

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
Thinking about taking the plunge and buying a second 3090, but I have some questions.
There is not enough room on my mobo for two 3090s; I plan to fix that issue with a riser cable, but where do I house the errant GPU?
My PSU is 1000 watts, is this enough?
What kind of models can I realistically expect to run at a decent speed with two 3090s? Are quants of bigstral within reach using CPP?
>>101711833
1000 watts is cutting it close for two 3090s, but it's unlikely you'll see heavy CPU load, so it'll probably be fine
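For a rough sanity check on the wattage question, here is the back-of-the-envelope math (a sketch; the TDP and CPU figures are assumed nominal values, not measurements, and 3090s are known for transient spikes well above TDP):

```python
# Rough steady-state PSU budget for a dual-3090 inference box.
gpu_tdp_w = 350   # stock RTX 3090 power limit (assumed; check your card)
cpu_w = 150       # CPU under inference load; usually much less when fully offloaded
rest_w = 75       # board, RAM, fans, drives
total_w = 2 * gpu_tdp_w + cpu_w + rest_w
print(f"steady-state ~{total_w} W on a 1000 W PSU")  # ~925 W: tight

# Power-limiting each card costs little inference speed and buys headroom:
limited_w = 2 * 280 + cpu_w + rest_w
print(f"with 280 W per-card limits ~{limited_w} W")  # ~785 W: comfortable
```

nvidia-smi -pl 280 (run as admin/root) is the usual way to apply such a limit.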
>>101711833
Hope you have enough PCIe lanes if these are the questions you're asking.
Obviously you can "cloud compute" by hanging your entire rig off of the ceiling using string, what kind of question is that? Just get it working. There's no magic to screws and boxes.
>>101711854
Lanes don't matter for inference unless you're doing split tensors, because it's all VRAM-resident, and for split layers the total data transfer between the GPUs is very small
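To put a number on "very small", here is the arithmetic (a sketch; the hidden size is assumed from a Llama-2-70B-like shape, and real backends add some protocol overhead on top):

```python
# Per-token inter-GPU traffic for layer split: only the hidden-state
# activations cross the boundary between cards.
hidden_size = 8192     # assumed model hidden dim (Llama-2-70B-like)
bytes_per_value = 2    # fp16 activations
boundaries = 1         # one split point for two GPUs
per_token_bytes = hidden_size * bytes_per_value * boundaries  # 16 KiB

tokens_per_second = 15
print(f"{per_token_bytes * tokens_per_second / 1024:.0f} KiB/s")  # ~240 KiB/s
# Even PCIe 3.0 x1 (~1 GB/s) is thousands of times more bandwidth than this.
```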
>>101711854
It's a TUF Gaming B550-Plus, would that be an issue?
Care for a glass of bees?
>>101711833
>My PSU is 1000 watts, is this enough?
Mine blew up and I changed it for a 1200W one. But maybe the reason it blew up was that it was eight months old and had been used for mining, though...
>What kind of models can I realistically expect to run at a decent speed with 2 3090s?
70Bs at 4.5-4.65 bpw.
>Are quants of bigstral within reach using CPP?
An IQ4_XS quant ran at 5 T/s for me.
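For anyone wanting to redo that sizing for other models, the estimate is just parameters x bits-per-weight / 8 (a sketch; file overhead and KV cache come on top, and the ~4.25 bpw figure for IQ4_XS is an assumption):

```python
# Estimate quantized model size in GB from parameter count and bpw.
def quant_size_gb(params_billions: float, bpw: float) -> float:
    return params_billions * bpw / 8

print(quant_size_gb(70, 4.65))   # ~40.7 GB: fits in 2x24 GB with room for KV cache
print(quant_size_gb(123, 4.25))  # ~65 GB: IQ4_XS bigstral needs partial CPU offload
```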
>>101711912
Thank you. I think I'll do it. I'm just not sure where to shove the 3090 if it doesn't fit in the case.
>>101711970
How'd you get the mix genres?
>>101711970
Is this real? This has to be Photoshop.
>>101711970
>The way miku's shoulder presses into his shirt as if she was really there
Holy fuck Llama 3.1 70B Instruct is actually retarded. No combination of card instructions and depth zero author's notes will make it stop (over)using ellipses in dialog when it's writing in generic RP style.
another FLUX vs. D3 smartness shoot-out
>>101711985
>>101711994
>A professional real estate photograph selfie in a living room, 24mm, f/16 lens. The background is sharp and in focus. An anime cutout of Hatsune Miku is edited into the photo. There is a photogenic man standing beside her with his hand around her shoulder.
>>101712013
Yeah, it's a substantial downgrade from the old L3-70B.
>>101712013
i've tried base, instruct and instruct tunes and it feels like some combo of smart and retarded. it follows my cards, prompt and rag db great, but then it forgets what happened 1 message ago. miqu is still better imo
►Recent Highlights from the Previous Thread: >>101705239

--Mistral-Large-Instruct-2407-GGUF model recommended for ERP: >>101708213 >>101708237 >>101708323 >>101708442 >>101708477
--Gemma 2 27b performance and model saturation discussion: >>101705986 >>101706032 >>101706108 >>101706154 >>101706158 >>101706191 >>101706192
--Anon achieves fast Flux execution with 3060 and 128 GB RAM: >>101705620 >>101705997 >>101706039 >>101706202 >>101706810
--FLUX cfg settings for image generation: >>101706621 >>101706652 >>101706776 >>101706812 >>101706895 >>101706785
--Anons test gemma model on budget Android phones, impressed with coherence and performance: >>101707575 >>101708739 >>101708849
--Using LLMs to generate onomatopoeia, with a humorous example: >>101710547
--Testing 1408x1408 resolution, model generates interesting but imperfect image: >>101710304 >>101710395 >>101710537
--Tess-3-Llama-3.1-405B model and synthetic data generation: >>101706755 >>101707053 >>101707110 >>101707382 >>101707469 >>101707546
--OpenRouter's base 405B model may not be truly raw: >>101709255 >>101709278 >>101709398 >>101709645 >>101709711 >>101709865 >>101709333
--Nvidia faces DOJ antitrust probe: >>101710495 >>101710602
--Lumina-mGPT: multimodal model for generating photorealistic images: >>101705936 >>101705971 >>101706657
--Flux struggles with coherent and prompt-following images: >>101705497 >>101705817 >>101705949 >>101705875
--Flux outperforms D3 in concept granularization and overload handling: >>101705902
--Flux dev tested on 3090, decent results but inferior to SD15 and SDXL fine tunes: >>101706383 >>101709715 >>101709740 >>101709859
--Base model is good for long form storytelling, but hard to start: >>101705345 >>101705600
--Anon tests image generation resolutions, 1280x1280 works better than 1408x1408: >>101710671
--Miku (free space): >>101705490 >>101705866 >>101706450 >>101707107 >>101708859

►Recent Highlight Posts from the Previous Thread: >>101705242
What kind of model/quant could I run with 72 GB of vram?
>>101712018
we are so fucking back
>>101712018
Very cool anon, you're a genius.
>>101712017 (Me)
I've played around with tilted water bottle prompts on D3 before. So I do know that even if you massage the prompt to get it to consistently tilt the bottle, it will never quite make the water surface parallel to the normal. Another massive win for FLUX.
>>101712018
>6 fingers
>painted nails
All those Miku pictures make me want to become Miku. I think I would look very cute with those twintails.
Changed some things based on feedback
>>101712136
Don't lie. You're imagining yourself as Miku being railed by all those photogenic real estate agents.
>>101712136
That's not the way God wants you to be. Reconsider.
>>101712136
>Become the Miku
Soon, Anon. Soon.
>>101712156
>
>>101712164
Miku giving Bad Touch Jesus the side eye.
So this, is a Miku Hatsune level 1...
>>101712166
Jannies, cleanup time~!
>>101712166
blacked miku is coal
They're the same people, aren't they? Someone just wants to create artificial drama.
I have no interest in cooming; what versions of mistral and llama 3 can fit on my 4090? I'm interested in general use and instruct.
>>101712136
My dream is buying Miku's skirt, striped panties and thigh highs from eBay and taking some photos to fap with it, but I fear that might ignite something inside me.
>>101712189
unironically what do the rest look like?
>>101712208
>They're the same people, aren't they?
of course he is
Anyone use an LLM as an agent?
>>101712213
You're probably better off running Gemma 2 27B.
>>101712229
well I just started, we'll find out
>>101711798
>2024
>still getting boilerplate legal responses to avoid litigation for the most innocuous queries
Literally every one of these silicon valley cucks who censor... oh I'm sorry, "align" these LLMs could burn in hell for a trillion years and that wouldn't be .0000000000001 percent of the punishment they deserve. Meanwhile there's almost zero censorship for doing almost any variety of sick sex acts with these fucking things. Typical r*ddit-tier cuckold logic. Whoever thinks this shit will replace programmers is so committed to licking big tech's boots that they're not even worth responding to.
>>101712244
>Gemma 2 27B
>>101711679
>>101711642
>>101711601
>>101711592
umm yeah, no
>get tired of Mistral Large repeating stuff from the context word-for-word
>switch to mini-magnum
>does literally the same thing
I'm tired of this...
>>101712240
Dude, I just got fired.
>>101712260
Get tired of Mistral repeating, switch to Mistral???
>>101712260
Try DRY sampler
Bros I just got promoted!
>>101712258
hi petra
>>101712277
>DRY sampler
Meme made by a pretentious redditor
>https://www.reddit.com/user/-p-e-w-/
>DRY author here. Your Min-P is too high. 0.1 is way too much with current models, and even 0.05 is too high IMO. I use 0.02 and it's more than sufficient. Increasing temperature above 1 is generally a bad idea nowadays, and probably the reason why you feel the need to use such high Min-P values to keep the output coherent.
>https://old.reddit.com/r/LocalLLaMA/comments/1ej1zrl/try_these_settings_for_llama_31_for_longer_or/lgbjtox/
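For context, here is roughly what those disputed numbers look like as an actual request (a hedged sketch: the endpoint is koboldcpp's, and the DRY field names are assumptions that vary between backends and versions, so check your own API docs):

```python
import requests

# Sampler settings from the quoted advice: low min_p, temperature <= 1,
# DRY enabled. DRY field names are assumed; verify against your backend.
payload = {
    "prompt": "Once upon a time",
    "max_length": 200,
    "temperature": 1.0,
    "min_p": 0.02,            # the contested value; others in-thread use 0.05-0.1
    "dry_multiplier": 0.8,    # 0 disables DRY
    "dry_base": 1.75,
    "dry_allowed_length": 2,
}
r = requests.post("http://localhost:5001/api/v1/generate", json=payload)
print(r.json()["results"][0]["text"])
```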
>>101712240
Gonna need you to kill yourself, buddy
>>101712244
Worth a shot. No goofy quants or anything, right?
>>101712246
Why did her pants turn into a skirt? Also her hair isn't spiky enough.
>>101712317
this came out better
what happened to thebloke?
Gemma 2B gets the strawberry question right.
>>101712332
Stupid enough to make a complete guess and guess correctly.
>>101712323
try mixing in some broly to maintain the hair color
cuckshit spammer must be one of the jannies, why is it still not removed?
>>101712306
>I'm sorry, but that is an absurd claim. Nobody knows the potential pitfalls of AI in medicine, because AI hasn't been deployed in medicine at any significant scale so far (nor anywhere else, really). The industry isn't even in its infancy yet, it is barely starting to exist in the first place.
>LLMs are going to roll over civilization like a bulldozer once the mass of people realize what this actually means. We don't even need AGI in order for this to happen. The current generation of LLMs is more than good enough to cause the greatest upheaval since the Industrial Revolution.
>If someone at OpenAI told me that there are issues with LLMs, I would certainly question it, yes. They can speak about their specific product (GPT), but it is way too early to make generalizing claims.
>>101712187
And you don't even try to hide that you love black cock.
>>101712349
Janny is a literal troon: >>101710688
>>101712323
Why did goku get miku hair color? Why does he have 6 fingers?
>>101712332
It also gets Macron's birthday, oh lawd.
now we're talking, getting closer!
>>101712349
He does get banned, but rebooting router / using proxies isn't really hard.
Does the new Llama 3.1 release actually put any pressure on OpenAI to release a GPT-5? Do you think they have anything that could be appreciably called GPT-5?
>>101712394
>Does the new Llama 3.1 release actually put any pressure on OpenAI to release a GPT-5?
no
>Do you think they have anything that could be appreciably called GPT-5?
no
>>101712384
>>101712394
Open models are nothing. The real competition is Claude and Gemini.
>>101712410
hair too short
>>101712394
They don't even have anything worth calling gpt 4.5
>>101712384
it's wider at the bottom like it's still where her twintails would be
a gen from before while I queue stuff up:
hair with green tint: now she's closer to broly
>>101712462
alt:
>>101712462
>>101712410
>>101712384
Dis not migu. Migu has 01 on her arm
>>101712087
>>101712018
thanks arstechnica
>>101712370
JLI is the resident janny of lmg btw
>>101712520
>leftist media knows
it's over
>>101712520
>eerily good
>heir apparent
I hate journalists so fucking much.
>>101712592
>heir apparent
What about it exactly?
>>101712449
Good work, Miku
https://anthra.site/
Mini-Magnum revision soon, the first magnum is the worst it'll ever be. The ride is up and up from here.
>tfw 16gb vram
>>101712424
>They don't even have anything worth
B-but project stRawbeRRy...
Would it be good to realtime video gen a game with logic?
I don't have much proof but I just genned 20 to 30 step schedules and only one of them had logical hand anatomy, which was 21 step scheduling. And this agreed with another test I did where I genned 20 to 25, and 21 was also the only one that had good hands. The only issue is that 20-30 is kind of low for packed prompts, so you end up with more gens that have less stuff from your prompts present.
>>101712879
>I don't have much proof but I just genned 20 to 30 step schedules
Wait, no, I meant 50. Yes, I generated 30 images to see how much change the image goes through depending on the amount of steps.
>>101712752
unironically hyped. i love mini-magnum
idgi which one do I get?
>>101712964
neither
>>101712998
You mean there's a different place to get ggufs or are you just being le funny?
I want this quality with Chameleon. In a couple months we might actually have that.
how are you guys liking flux?
>>101713190
It's shit.
>>101712017
>D3 alternative a year later
>Still only knows half of its concepts
I love Flux but we all know it won't take the lead for long.
>>101713190
You can't make coom pictures with it, but it's extremely good. VRAMchads are eating. I need to use it through the API to play around.
>>101713190
Still not good enough, just like the text models.
>>101713079
Nah, that'd be too good to be true. Even if they get the architecture right, they won't release it. And if they do release it, they'll be forced by shareholders to censor it before release like Chameleon, which means relying on the rest of the world/community to undo the censoring. We'll be lucky if someone does that competently.
>>101713276
Impressive. Very nice.
Achievable natty?
>>101713276
'um on 'iku
>>101713264
Not sure what OpenAI is doing. This is my 4th attempt.
>>101713276
How come I don't ejaculate this much?
>>101713308
>forced by shareholders to censor it
Probably, yeah. Weird stuff going on in the /ldg/ thread since the Flux release.
>flux dev pops in, everybody has a good time
>with great timing somebody posts dead loli in a dumpster. u-uhm guys i cant even post it here but a russian guy made this prompt, look at this link!
>another anon appears and is outraged, asking how this is allowed, and starts a whole discussion.
There must be a lot of pressure and lots of bad actors all over.
>250kb/s flux download speed
sasuga huggingface
>>101711798
After Flux I legit feel GPU poor with my 3090. I need a 4090 and even that feels like not enough, just breaking even.
>>101713428
Now, theoretically it should be 2x faster with TensorRT, but there's been no announcement on that.
>>101713428
what would the 4090 gain you here?
>>101712144
So the only current option is mistral large? Damn. I'd better get used to long waits.
>>101713428
I have a 4090, it's still slow as shit.
>>101713264
>You can't make coom pictures with it
And that's a good thing. People who want to use it are forced to come up with something different, and this results in finding more creative and simply more interesting ideas for gens than "1girl, naked" or shitty anime porn slop no.47747996432678.
>Mistral-Large-Instruct-2407-Q3_K_M-00001-of-00002.gguf
I tried to load this but it just ends up crashing koboldcpp.
I also tried Mistral-Large-Instruct-2407-IQ2_M.gguf but the gens take anywhere from 20 minutes to 1 hour 15 minutes. Should I be fucking with settings?
>>101712306
You might say it's a 'meme' but it actually works; however, it does seem to make it spell things wrong sometimes, not sure why.
>>101713190
not as good as dall-e at artistic prompts, not even close. it can place things together in an ugly way, sure. you can get mario and goku surfing on the death star in some ugly default style but it won't actually look good. it lacks artistry, still not good enough.
who the FUCK is petra and why does that name keep coming up in these breads
it's like /vee/'s boogeymen all over again
qrd?
>>101713506
i think you're supposed to merge them with copy /b file1.gguf+file2.gguf newfile.gguf
20 mins sounds like you're swapping. if you're using kobold's new auto layer select, check your usage and manually lower it. for me it now selects 40 layers for a 70b when i can fit 31; over that i think it's going to my igpu, making it slower
>>101713467
you should be fucking with buying more vram
>>101712964
Magnum-72b is good? Is it better than miqu?
>>101713529
neither
>>101713467
How much ram do you have? In the 2nd case, if it's that slow, you're probably using swap.
>>101713506
Just ignore it, the people who litter these threads with this garbage want your attention and reactions.
>>101713543
Guess I'm stuck with miqu then. Seems like only good small or large shit comes out. Nothing good for the mid-range.
>>101713511
You don't have to do that, for koboldcpp you just point to 00001-of-0000*.gguf and you're good.
>>101713387
It does feel manufactured. A lot of manufactured posts in these threads too, though, about more minor things.
>>101713387
Nobody could replicate it with the prompt he provided. He just got a bunch of people to look at actual CP.
>>101713602
>>101713614
Was weird how they pressured the flux guy as well. Even after he said he is just an infrastructure dude.
>This was trained on children!
>I don't know the dataset but in my testing I never saw any output like that
>SO YOU DON'T DENY THIS MIGHT BE IN THE DATASET!
Sane people pointing out that if you have grown women, children and gore in there, the AI is smart enough to mix them together (that's the whole point) are ignored. Weird shit.
>>101713654
It's not weird, it's deliberate. Tomorrow we'll be seeing news stories featuring Flux being used to post cp on 'anonymous imageboards in the dark web' or something equally idiotic.
>>101713511
Are you talking about the setting that says "GPU Layers: [-1] (Auto: 29/91 Layers)"?
>>101713545
32 GB ram. I know it's probably not enough, but some guy on another board told me he ran a good model with less than me so I thought I'd give it a try. Apparently he was getting gens every 1 to 5 minutes.
>>101713696
You'll need a bit more than the size of the model in ram + vram total to not get it slowing down due to swapping. If it's crashing, it could be low vram and you can try lowering the layers. If it's slow then you might not have enough regular ram.
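A crude fit check along those lines (a sketch; it assumes the GGUF file size approximates resident memory and that the OS, browser, and KV cache eat a few GB of headroom):

```python
# Will a model run without hitting swap? Rough heuristic only.
def fits_without_swap(model_gb: float, vram_gb: float, ram_gb: float,
                      headroom_gb: float = 6.0) -> bool:
    return model_gb + headroom_gb <= vram_gb + ram_gb

# Largestral Q3_K_M is ~59 GB on disk (assumed figure):
print(fits_without_swap(59, vram_gb=16, ram_gb=32))  # False -> swapping, 20+ min gens
```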
>>101712144
literally kys
>>101713696
>GPU Layers: [-1] (Auto: 29/91 Layers)
yeah, for me after 1.70.1, it guesses layers too high and slows everything down. in task manager, watch your dedicated gpu memory usage. with the model loaded you still need a bit free for cache. start with a bit over half the number it guesses, so 15. after you see how much you have free, move it up more next time until you find the limit
Just a reminder that /naids/ thinks Kayra is better than Llama 405B
>>>/vg/488890201
>>101713830
Not your army, schizo
>>101713830
your post is irrelevant and so is your life
>>101713830
Are they wrong?
flux knows what a funko pop is (even though traditional figures are better, just to test)
>>101713830
You gonna shill your shitty Mikupad clone again?
wooden doll with joints miku:
>>101713830
Kayra is better. NovelAI image gen is also better than flux. Don't waste your money.
>>101713936
This is the most organic post I've ever read.
>>101713936
This, but unironically.
>>101713830
>>101713936
Ewww, fuck off and go back to your containment thread rather than shitting up ours. You can have your schizo melties there instead.
>>101712752
Stop fine tuning shitty 13bs and give us a llama 3.1 70b fine tune of magnum opus
>>101714016
*a mistral large 2 tune
>>101714006
This thread is already thoroughly shitted up anyway.
>>101714029
Or that, either is fine. Either euryale or magnum. These sloptuners are slacking and playing with shitty 8b and 12bs instead of giving us the good stuff.
>>101714091
Mistral large is already pretty good, so I think 70b has the most potential for improvement.
>https://huggingface.co/migtissera/Tess-3-Llama-3.1-405B
quants when?
>>101713774
>>101713766
Yeah, so I think I fucked up. I have 32gb ram and 16gb vram, which is probably why I can't get gens without waiting 20+ minutes. I messed with the GPU layers setting and it didn't change anything. If I set it too high it just crashes koboldcpp
Just got gemma2:27b-instruct-q6_K running on ollama with no gpu and 32gb of ram.
It is very slow but also the best model I have tested.
I always ask every model how to beat the moon lord in terraria and every time they give me a bs answer that shows they don't know what they're talking about, but gemma2 gave me a pretty good answer. Although not entirely correct, it was miles above the rest.
I am very impressed.
Is this what the 70b models feel like?
>>101714354
I used gemma2 27b for RP and it was hot garbage; not even complaining about prose or pozzitivity or whatever, it was just dumb as fuck. Miqu (real 70B) passed all my RP tests (while also shivermaxxing, so I deleted it)
>>101714354
>It is very slow but also the best model I have tested.
this makes me sad
>>101714390
What do you use then?
>>101714390
Could be that it has access to more gaming related data than other models and that's why it could answer better?
>>101714349
try selecting disable mmap
>>101711798
If there's one thing I've learned from LLMs over the past few years, it's that there's no hope. The corporations will control this tech with an iron fist. We will never have anything interesting that results from this. There are too many potential lawsuits. We're also seeing this on the enterprise side of things, and it's looking like the AI bubble is about to burst. And with California about to fuck everything up, it's all very depressing and predictable. Interesting AI may exist one day, but not while any of us are young enough to care.
>>101714405
I got a real gf so I sold my 2nd 3090 and don't RP anymore, so technically I'm no longer a "user". Now I just run random RP tests from time to time to check on the current state of LLMs
>>101714430
Beats me, I don't see how gaming is relevant here
Been using Mixtral 8x7b on CPU for months, is there anything better now with the same performance?
>>101714471
Well, if you still do testing, what's the best bet?
>>101713462
Not really. Most gens are just some generic white or Korean woman in some provocative and SFW pose and/or setting.
>>101714467
Every doomer prediction since "chatgpt will never be local" and "llama 1 is the last llm we'll ever get" has been wrong.
>>101713774
i take back what i said about kcpp's layer guessing. i don't know why it wasn't updating before, but now when i drag the context, it adjusts the layers. per my same settings, it's suggesting 27 now (32 was my max prior with 16k context), so it's much more correct than it was when guessing 40 on the last version. i dunno why the ui didn't update for me at first
>>101714482
For the 24GB VRAM range: Yi-34b-chat. mini-magnum is also decent, felt like a gemma2 27b sidegrade but it "gets" more ERP.
Poop: magnum-32b (qwen base); in fact, all qwen models suck ass.
For the 70B range, midnight-miqu was decent. L3-70B started repeating itself on the third reply; didn't try L3.1 but I doubt it'd be any better
>>101712274
>>101714611
Heh
>>101714611
>1k context card
>35 token user card
>128k ctx
>AHHHHHHH WHY IS IT THE SAME
use rag and lorebooks, you fucktards
is llama 3.1 bad like 3 was?
>>101714668
You're saying having that info at the beginning is a drawback? But using a lorebook or similar would increase processing time.
Any largestral fine tune?
>>101714718
the original prompts and premise only carry a story so far, you need to be constantly putting new data into it. both rag and lorebooks will cause processing of the entire context each time, but the results are that much better because it's considering new random data with each gen rather than just going off of chat history and card data
>>101713264
You can. Just takes a lot more effort.
>>101714766
Only Undi's, I think.
We are now in the ollama+open webui era.
>>101714433
Wow, that cut it all the way down to a gen every 5 to 8 minutes. Is there anything else I should try to make it even faster?
>>101714981
what kind of processor do you have? you might be able to increase the threads, but for some processors like intel it defaults to only the pcores for a reason.
check your dedicated gpu usage again; being able to fit more layers increases speed slightly too, but you have to balance that with your context limit.
you're already running a huge model for your system specs. you should be using a 70b, not a 123b. you were likely swapping to your ssd before, which is why it was so slow; disabling memory map reduces ram usage just enough that you're able to fit it in without swapping
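If you'd rather compute the layer count than trial-and-error it, a crude sketch (all figures assumed; it splits the file size evenly across layers, which is only approximately true):

```python
# Estimate how many layers fit in free VRAM, leaving room for KV cache.
def layers_that_fit(model_gb: float, n_layers: int, free_vram_gb: float,
                    cache_headroom_gb: float = 2.5) -> int:
    per_layer_gb = model_gb / n_layers
    return max(0, int((free_vram_gb - cache_headroom_gb) / per_layer_gb))

# e.g. a ~40 GB 70B quant with 80 layers on a card with 15 GB actually free:
print(layers_that_fit(model_gb=40, n_layers=80, free_vram_gb=15))  # ~25
```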
Cucky, cucky, cucky, cucky! Cucky, cucky, cucky, cucky! Come bring ya black ass out here, you fucking nigger! It's buck breakin' o'clock! You've been misbehavin' again!
What model can i host on hf space free tier?
>>101715306
they don't have GPUs, so nothing at a decent speed
>>101711798
Anyone tried piping the output of a chatbot into a text to speech AI yet? Are there even any good local text to speech AI models out yet?
Are there any studies/reports on the effectiveness of using LLMs as a supplement or replacement for psych therapy? Or have any anons used one as a therapist, or seen others talk of it?
Considering using it myself while I wait to go see a real one; wanna start actually living my life.
I set up a psych character some days ago and talked to it for an hour. It felt a bit scuffed, but it might have actually helped me by attributing a lot of the issues I discussed to terrible self body image, something I was told I had as a kid but forgot, and which over time became normalized, affecting me in ways I never considered related
>>101715498
Yes. I use piper and it works just fine. It's very fast, and pretty good, but not the best. It's good enough for my needs, speed being a major point.
chose the worst nala card for this, trying gemmastra 2b with the samplers listed on the model card, disabled EOS
>that atrocious grammar on the second faux-user turn
>"won't bite... unless you want me too" with mischievous gleamAAAAAAAAAA
>>101715753
mistral large sent a shiver down my spine, the slop is still there
>>101715753
I've said this in real life a few times
It's not that hot
>>101714471
>got a real gf
>sold 3090
>gf spends 3090 money
>gf leaves
>"Welcome back, Anon."
>>101715732
wow, that came from a 2B? how does it perform with the usual nala card?
>>101715863
nta, leddit has some tests of it, it's as incoherent as you would expect. a good test would be this 2b vs pyg 2.7/6b, maybe l1 7b
>>101715863
lol I never bothered finding the real nala card
also the prompt is cheating
>>101715732
>>101715875
Not bad for a tiny model. I'd expect a 2B to implode from a retarded prompt like that.
>>101715016
Yeah, I do have an intel core i7-13700F. I tried increasing threads but that only made the gens take a bit longer.
>>101716127
once you aren't swapping, you're pretty much at max speed anyway. make sure you're using as many layers as possible and xmp is on, but that's about it. welcome to cpu speed. mistral large is 0.7t/s for me and i really don't find it better than a 70b so far, but i'm still testing it myself
>>101716209
Alright. I'll mess with the layers again in the morning. If that fails me I'll try out a 70b model. If you could spoonfeed me the link to the one everyone uses for roleplayshit I'd appreciate it. Thanks for all your help so far.
>>101716473
>gf spends 3090 money
that's like one month of (attractive) gf money in a first world country
>>101716473
If I were his girlfriend, I'd rather he kept all the 3090s in the family.
>>101716277
for 70b, llama 2 greatness:
>https://huggingface.co/mradermacher/Midnight-Miqu-70B-v1.5-i1-GGUF/tree/main
for llama 3.1, i'm trying:
>https://huggingface.co/mradermacher/Lumimaid-v0.2-70B-i1-GGUF/tree/main
>>101716526
What settings and format do you use for miqu?
I have a 12GB 3060. Would like to get more VRAM.
The only GPUs with 24GB are really pricey. Is using 2 GPUs actually worth it for SD; can you split the model in a useful way?
I was testing the performance of DeepSeek Chat V2 0628 (236B) on my system (as a Q4_K_M gguf quant) with a longer context by feeding it a wikipedia page to summarize.
I randomly chose the page on US Military History and added a fictitious section in the middle about a series of conflicts between McDonald's and Burger King to see if it would actually summarize the provided text or go off its own data.
Instead it suddenly answered in Chinese even though we were speaking English.
>中国的军事力量是自卫性的,中国始终坚持走和平发展道路,坚持防御性国防政策。中国的军事建设始终是为了维护国家主权、安全和发展利益,保护人民的和平劳动,促进世界和平与发展的崇高事业。中国军队是人民的军队,它的根本宗旨是全心全意为人民服务。中国军队的发展和强大,是中国和平发展、积极参与国际事务、维护世界和平与稳定的体现。
Google translates as:
>China's military power is self-defensive. China has always adhered to the path of peaceful development and adhered to a defensive national defense policy. China's military construction has always been for the noble cause of safeguarding national sovereignty, security and development interests, protecting the people's peaceful labor, and promoting world peace and development. The Chinese military is the people's army, and its fundamental purpose is to serve the people wholeheartedly. The development and strength of the Chinese military is a manifestation of China's peaceful development, active participation in international affairs, and maintenance of world peace and stability.
>>101716597
Kek, 20 epochs of Mao's teachings are mandatory for models made in China
>>101716586
normal alpaca rp in st. it responds well to it (and so do many other models)
>>101716590
As far as I understand, Comfy cannot use more than one gpu. I don't know about others.
>>101716636
Thanks, asking about SD in general and not about the currently most shilled UI
>>101716721
>shilled
I just specifically told you the thing I use doesn't support the thing you're looking for. And here I was, reading about other UIs. Do your own reading now.
>>101717039
Not inquiring about UIs, sorry.
>>101715498
there's even a SillyTavern plugin for that, already pre-installed
does kobold have an rpc feature for multi node like llamacpp? are they compatible with each other? have an amd pc that for some reason the kobold rocm fork works fine on but can't get base llama to not segfault, and would like to join it with my main server
>>101711970
Me on the right
>>101718088
use linux
>>101718282
You do not have an ironed shirt, anon, don't lie to me.
>>101718518
Ok, I don't. But I do own an iron and I know how to Google how to use it
How are local models nowadays compared to something like sonnet or orbo?
>>101718559
terrible
>>101718559
smelly sex picture
I have a lot of hope for LLMs, I really do, I'm just sad that it will take like a decade to reach the levels that private models currently are on.
>>101718559
meme
>>101718602
they caught up to gpt4 in a year
>>101718602
My niece, when she was a little kid, told me that one day she'll be older than me. You seem to fail to grasp the same concept.
>>101718699
If she lives to be older than you are at death, she's not wrong.
>>101718602
lmao, locals are basically at the same level as private models. It's not 2022 anymore.
The bigger problem is that local and corpo models are on the same shitty level - LLMs are trash. If you were to compare the development level of LLMs with RAM, for example, we are sitting at 64kb. And there are anons in this general saying that 64kb is all we need.
>>101718873
If we're going to be pedantic, you meant to say 'she wouldn't be wrong'. She was.
HAHAHAHAHA 8GB VRAM
>>101719031
cpuchads win again
>>101719031
>X060
>>101719031
I don't trust this, Nvidia wouldn't be so stupid.
>>101718990
there's not a lot they can do with the current architecture
increasing the parameter count doesn't seem to do much past 500b-1t, except pack them with more useless information
and the training sets are already immense, you can't add much other than redundant data that won't help with their intelligence
there needs to be another breakthrough in research before we can see some serious improvements
somehow I can only load up to my total VRAM minus ~3GB into VRAM with llama.cpp or I get some startup error. it still starts and generates, but I can't properly communicate with it anymore. what could be the reason?
>>101719134
Context taking more vram as it fills up
>>101719134
>but can't properly communicate with it anymore
have you tried sitting down with your GPU and discussing your problems together?
>what could be the reason?
are you sure your GPU isn't being... filled with memory from other programs till it starts leaking?
>>101719075
>>101719161
>>101719170
it's immediately after loading the model, llama.cpp says oom, but it's kind of working anyways
>>101719031
Ought to be enough for anybody.
Is it normal that IQ3_XXS is much slower than Q2_K?
>>101719229
stop.
I have an ERR! for 3090ti fan in nvidia-smi. Has anyone encountered this issue?
>>101719229
Yes. It's doing a lot of work to give you the quality that it can out of IQ3. With Q2, you're just turbo guessing.
>>101719238
It's fucked, there is nothing you can do to fix it. Don't throw away your card tho, you can send it to me so I can dispose of it ecologically.
I'm having repetition and hallucination? gremlin? problems with the new Mistral-Nemo-Instruct-2407, using Q4_K_M.
After a few messages every char railroads into talking more or less the exact same way. And later on in convos, by about message #150, the model just goes gremlin mode, either with heavy repetition (not repeating words, but repeating patterns and heavy use of synonyms one after another), and other times responds to things the exact same way.
>>101719271
please refer to the following diagram -> >>101712274
Nemo Lyra low key mogs
>>101719080
If they can't improve on intelligence, they should start looking toward optimizing performance. Give me my fucking BitNet.
>>101719229
are you running on ram or vram? iquants are very bad on ram
>>101719383
but bitnet won't give you points on meme benchmarks and that's the only thing corpos care about
>>101719238
Does the fan spin? Stick your finger in there while it's under load
>>101719445
corpos are also supposed to care about cutting costs, and gpus are expensive
>>101719445
Someone just needs to make a meme benchmark that is intelligence per inference cost or something, label it as a Green or ESG benchmark, and they will.
>>101719483
nah, they already bought these GPUs so they might as well use them. And bragging rights to investors from beating another corpo by 2% on MMLU are worth more than creating a slightly worse model for GPU poorfags.
>>101711798
I'm planning a chat mode for my client, and looking for ideas. What do you wish the clients you use had / did better?
I am currently implementing setting templates so that you can quickly use different AI settings (even different models) easily, as well as block output so you can chain generations and compose complex prompts. Not really specific to chat mode but very useful for summarization workflows.
As for chat-specific ideas, the only one I have in mind so far is mid-chat injection.
Any other ideas appreciated.
>>101719031
How are we supposed to accelerate with 8GB VRAM, old man?
I miss gpt-3
>>101711798
>/lmg/ - Local Mikus General
>/lmg/ - a general dedicated to the discussion and development of local language mikus.
>Previous threads: >>101682019 & >>101705239
>►News
>►Official /lmg/ card: https://files.catbox.moe/cbclyf.png (embed)
Surprised you didn't change the card this time.
>>101715583
https://upload.wikimedia.org/wikipedia/commons/1/12/Ectobius_vittiventris_prep.jpg
>>101719550
He's accelerating the depletion of your bank account.
>>101719550
Your video games, with the brand new DLSS 4.0 EXTREME, which reduces VRAM requirements so cards no longer need so much VRAM. That's what those cards are made for. Gaming.
>>101719550
graphics cards are for games, not to generate shivers down the spine
>>101719424
Ok thx. 40% on CPU; could that be the reason flash attention is very slow as well?
>>101719591
Thank you for your nice roach picture, anon...
>>101719031
>>101719550
Feels good to be a 12GB VRAMchad. I'm already mogging the 5090 poorfags.
>>101719711
>5090
Your tokenizer is fucked.
>>101719031
if you're buying a low-mid card like that then you're gaming at 1080p and you genuinely do NOT need more
>b-but muh 32x supersampling AA!!!
dlss is better
>>101719652
the fact that more and more companies are releasing models for local use is a sign that things are changing, nvidia needs to realize that
>>101719747
Running llms locally is a niche. Games sell more consumer gpus.
>>101719742
that's right goy, buy our new 5060 which is hardly better than a 3060 but costs three times as much
>>101719747
That's what NPUs are for. And local-use corpo models are typically 2B models for shit like autocomplete, classification, or suggestions. All the important stuff will happen on the cloud for a reasonable subscription fee.
can bitnet do moe
can moe do bitnet
>>101719731
>he is unaware
>>101719808
>>101719808
>he cares not
But the first post you linked is for the hypothetical 5060, not the 5090. And I don't rely on rumours.
anyone know any good uncensored 2B-4B models? I want something that runs well on my phone for role play
God damn, you really need beefy hardware to run some of this.
>>101719846
Why don't you just leave it open on your computer and connect to it from your phone?
>>101719846
https://huggingface.co/TheDrummer/Gemmasutra-Mini-2B-v1
>>101719846
>good
>2B-4B
I've been out of the loop for a year, what's the best uncensored model that could fit in 12GB vram currently? I need it for writing image generation prompts and maybe some scandalous dirty jokes.
>>101719953
mistral nemo, though you can't fit the entire 128k context into 12GB unless you offload something to ram
>>101719988
I don't think he needs 128k context for prompts and jokes, bro.
>>101719996
He might if it's a Brick Joke.
>>101719747
>the fact that more and more companies are releasing models for local use is a sign that things are changing, nvidia needs to realize that
Local doesn't need more memory, it needs better models and inference engines. Powerinfer-2 proves that (unless you assume they are lying).
The industry needs to stop making slightly tweaked GPT-2s and embrace predictable dynamic sparsity for local models. That way local can use massively larger models; increasing VRAM only allows slightly larger models.
Mixtral worked for Powerinfer-2 by accident; with actual design they can do better.
Does anyone know how to fix the trailing "System:" in quick reply on ST? It should be "ASSISTANT: " instead
>>101711798
>(07/31) Google releases Gemma 2 2B, ShieldGemma, and Gemma Scope
>Smaller, Safer
nothingburger.
>>101719238
Solved it by setting the PCIe slot speed to Gen3 in the BIOS.
>the week is over
>literally no big releases from anyone, no bitnet, no model from Cohere, OpenAI, Apple, etc
>literally only Flux but that's not an LLM
Bros, why did >>101571531 do us like this?
HAHAHAHAHA 28 GB VRAM
>>101720416
Why are you laughing? That's good.
>>101719031
charge your phone, anon
>>101720416
The fuck
>>101720349
somethingburger. >>101662971 >>101665132
Gemma Scope will let us find and remove the source of shivers like orthogonalization did for refusals.
ShieldGemma is a naughty model that was trained on all the bad stuff filtered out of the regular Gemma dataset.
>data on this page may change in the future
also, at this point just ask for the generational wealth of your whole village and get the a6000. consumer cards are too gimped for anything ml
>>101720416
Anybody thinking NVIDIA will significantly increase VRAM when there's no pressure to is almost as retarded as anybody thinking that OpenAI will ever release another open source LLM.
The 5090 will be 28 GB, the 6090 will be 28 GB, and the 7090 will be 32 GB
>>101720445
Good goyim, pay $2000 for an extra 4GB of vram.
>>101720466
That's cool and all, but the model is still 8k. If someone can look at how they did this and create a version for Mistral Large, Llama, etc, then it'll be good.
>>101720416
It's the future, data is changing, we're gonna be eating good, we all got our thousands of ground floor bitcoins, right?
>>101720524
as if there are any other alternatives
>>101720556
Radeon Pro W6800 32GB
>>101720524
For that price you can get 3*3090
>>101720573
is it the same architecture as the 6800xt? I know that one has good support in ROCm
>>101720517
Wouldn't surprise me if Altman had a hand in this.
There are two ways to attack open source LLMs. The first is to regulate them. The second, barring that, is to limit the layperson's access to them so that their only choice is to use API-only services.
At this point, it's very deliberate.
>>101720573
>amjeet
pass, 2x 3090 are better
>>101720628
>6800xt
>good support in ROCM
Doesn't have matrix cores, no WMMA support, no flash attention.
>>101720648
You're overthinking it, it's just ngreedia; black leather jacket man also has altman by his gay balls
>>101720524
It's not just the VRAM that matters, the speed is a very important factor too. No one actually cares about VRAM outside of ML.
>>101719075
pfffft
>>101720687
ML is a lot more than LLMs; smaller models have been used in industrial settings for years, and consumer grade GPUs are still good for that because of their performance-price ratio
It will be 24GB.
Stop trashing other boards with your shit >>>/tv/202128053 faggots.
>>101721006
No one here likes llama 3
Let me guess, that guy is the one that made the thread so he could have material to criticize "us" for.
>>101721013
I like L3. It seems to be a very good general purpose good-enough standard, even if one only uses it as a comparison reference.
>>101721066
Let me guess, he also likes to post pictures of a certain turquoise-haired character engaging in bestiality with dark-skinned men
>>101721006
No, I have to do my job shilling every week
Haven't looked in a few months. Anything better than Stheno 3.2 for RP come out yet?
>>101721161
Yes.
https://huggingface.co/TheDrummer/Gemmasutra-Mini-2B-v1
>>101721161
>>101719988
>>101721178
>>101721183
Interesting. I'll check both out, thanks.
Good Afternoon where is the bitnets?
>>101721330
https://huggingface.co/Green-Sky/TriLM_3.9B-GGUF/tree/main
https://huggingface.co/nisten/Biggie-SmoLlm-0.15B-Base/tree/main
>>101719520
Benchmark idea: fixed hardware, scripted series of questions. The prompt could start with a story, and the questions could be about that story. Score is the number of correct answers before the time limit expires, with a penalty for incorrect answers so generating random answers with infinite speed won't get a maximal score.
>>101720416
as expected, what a shame
>>101721178
Why didn't your ad show for >>101721161?
>>101720416
enough for flux
we are so back
>>101721357
>Fixed hardware
Measuring flops is better. Fixed hardware is a stupid idea.
>before time limit expires
Time, or even flop, limits are ridiculous. Normalize correct answers/flops; closer to 1 wins.
>penalty for incorrect answers
Built into the previous point.
However, this will favour correct but short answers. You will want to account for that: 100 tokens/1kflop is better than 10 tokens/1kflop if both are correct.
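A sketch of what that scoring could look like (my own illustration of the idea above, not an established metric; the normalization reference is an arbitrary assumed constant):

```python
# Score = accuracy weighted by output rate per flop, so at equal accuracy
# 100 tokens/kflop beats 10 tokens/kflop, and wrong answers drag it down.
def score(correct: int, total: int, tokens: int, flops: float,
          ref_tokens_per_flop: float = 1e-7) -> float:
    accuracy = correct / total
    tokens_per_flop = tokens / flops
    return accuracy * min(1.0, tokens_per_flop / ref_tokens_per_flop)

print(score(correct=80, total=100, tokens=100_000, flops=1e12))  # 0.8
print(score(correct=80, total=100, tokens=10_000, flops=1e12))   # 0.08
```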
>>101720445
Of course not, it should be 32gb minimum
>>101721660
64gb or we boycott
>>101719031
>source: my ass
>>101721660
And it should be only 1-slot, and draw at most 100W.
>4 years at 24GB and all they can spare is an extra 4GB
lol
>>101720556
>>101720628
>>101720654
>>101722016
nobody needs that many see pee yous
>>101722016
Lewd.
>>101722031
but everybody needs that many mem ri chan else
Wikipe-tan card is kino.
>>101722087
>>101722144>>101722144>>101722144
>>101722087
Oh damn, is it trained on wikipedia text to imitate the tone?
>>101722168
it's mistral large 2407
>>101722130
>ai gf who is educational and sexy at the same time
Once we get the robot body (including cyber womb) bit figured out, that's it for females. 99.999% of all women literally cannot compete.
>>101722087
>that first paragraph
Nice slop.
>>101722130
>whispers conspiratorially