/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102565822 & >>102557546

►News
>(09/25) Multimodal Llama 3.2 released: https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices
>(09/25) Molmo: multimodal models based on OLMo, OLMoE, and Qwen-72B: https://molmo.allenai.org/blog
>(09/24) Llama-3.1-70B-instruct distilled to 51B: https://hf.co/nvidia/Llama-3_1-Nemotron-51B-Instruct
>(09/18) Qwen 2.5 released, trained on 18 trillion token dataset: https://qwenlm.github.io/blog/qwen2.5/
>(09/18) Llama 8B quantized to b1.58 through finetuning: https://hf.co/blog/1_58_llm_extreme_quantization

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
The west has fallen. Qwen won. Recap is spam now. Sell your GPUs before the second-hand market collapses. It's over for real this time, boys.
>>
►Recent Highlights from the Previous Thread: >>102565822

--Papers:
>102572967 >102573170
--Mistral.rs, a Rust implementation of Llama:
>102567470 >102567493 >102567499 >102567500 >102567597
--Llama 3.2 and uncensored VLM potential discussion:
>102565950 >102566013 >102566046 >102569036 >102570107
--LLaMA-3.2 quantization evaluation GitHub discussion:
>102568271
--Importance of multiple aspects of model quality beyond just generation and long-term memory:
>102569388 >102569415 >102569427
--Discussion on syncing server.cpp versions between llama.cpp and ollama:
>102566301
--Qwen2.5 32B uncensor finetune on Hugging Face:
>102570128 >102570176 >102571913
--Qwen 72b and GPT-4 succeed in pyqtgraph plot coding challenge:
>102569500 >102572930 >102573043
--Mistral small works better with one message prompt:
>102566201 >102566246 >102566312 >102566352 >102566421 >102566466
--Discussion of NoCha leaderboard results and long context benchmarking:
>102568781 >102568884 >102568990 >102569030 >102569068 >102569120 >102571029 >102571424 >102571473 >102571098 >102568954 >102568977 >102568982 >102569005 >102569074 >102569718 >102569832 >102569931 >102569118 >102569275 >102569326 >102570952 >102571213 >102571298 >102571508
--90B model struggles with writing, 3.1 tunes show promise with high temp:
>102568773 >102568851 >102568862
--RTX 3090 with 32GB VRAM announced, kopite7kimi's credibility questioned:
>102565941 >102566240 >102567292 >102567720 >102570733 >102570840
--Nemo generates a joke that qwen2.5 could never do:
>102570058 >102570171
--Llama 3.2 90B performance and impact of quantization discussed:
>102567549 >102567822 >102567871 >102567878 >102568430 >102568602
--Adjusting samplers and settings for non-sloppy results on L3.x models:
>102567577 >102567600 >102568173
--Miku (free space):
>102566349 >102570348 >102570537

►Recent Highlight Posts from the Previous Thread: >>102565835

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
so.. what's up?
>>
I thought OpenAI was over after the departures.
But I have to admit, it was impressive how Sam was able to spin it.
>>
Are there any small models specialized in generating SD prompts? Especially with booru tag support?
I tried putting a big list of tags and a bunch of examples in llama 3B and Qwen 3B but the outputs are mostly ass.

>>102573443
Is this a Fate/Zero reference????
>>
>>102573371
this general is a spin-off of /aicg/. it has always been about chatting and (erotic) roleplay. stop complaining about the meta and post your best loli logs, newfag.
>>
>>102573457
>this general is a spin-off of /aicg/
I'm actually happy somebody remembers that.
>>102573386
>Recap is spam now
/aicg/ here and I miss your recapanons. Pls come back or somebody pick up the mantle.
>>
>>102573457
Calm down ranjesh
>>
>>102573502
post your hand
>>
reposting for new thread
>chatting with ai, using a variation of my name for {{user}}
>she calls me anon in the middle of her orgasm
what did she mean by this

>>102573118
no, it is, but for whatever reason (could be the card) she keeps adding "Note: this scenario includes offensive and blah blah" kind of statements
>>102573333
vaginal sex in the missionary position for the purposes of procreation
>>
>>102573450
I know of this one, but i never used it.
>https://huggingface.co/teknium/SD-PrompTune-v1
It's old and i have no idea if it's any good. 7B, based on mistral 0.1 i think.
Then there's a bunch of tiny models, but they just add noise, mostly. May be worth a try to get some random styles, i suppose.
>https://huggingface.co/Gustavosta/MagicPrompt-Stable-Diffusion
>https://huggingface.co/AUTOMATIC/promptgen-majinai-safe
>https://huggingface.co/AUTOMATIC/promptgen-majinai-unsafe
>https://huggingface.co/AUTOMATIC/promptgen-lexart
Those are very old gpt models with like 160m params.
>>
>>102573387
>RTX 3090 with 32GB VRAM
Never post a recap again.
>>
>>102573387
I don't get this recap, it doesn't have a double > on the posts so you can't do anything with them
>>
>>102573528
>Never post a recap again.
that's my fault, I really wrote that shit on the previous thread, fucking typo :(
>>
>>102573450
Not explicitly relevant but I'm using Florence-2-large-PromptGen-v1.5
>>
>>102573530
I think the internet may be too advanced for you.
>>
>>102573530
We did a few a/b tests and found that people were more satisfied with their engagement of recaps when they had to do a quick ctrl+f to navigate to their chosen post numbers of interest instead of just having a mess of links to click. It's hypothesized it could be the act of manually putting in a little effort gives anons a sense of ownership over their role in reading the recapped post. Alternatively it could be that they have hover previews enabled and get distracted by other posts popping up as they try to get their cursor to reach a post that interested them.
>>
>>102573386
>Recap is spam now
What happened to recap? It was good for a long time.
>>
>>102573564
>>102573567
wtf this is an insane level of shitpost, I kneel ;_;
>>
>>102573567
How the fuck do they know what they're going to see without the hover?
Might as well just read the old thread.
>>
>>102573567
this post was written by AI wasnt it
>>
>>102573443
Who the fuck decided this was the stage for an interview?
>>
File: 1619789483565.png (643 KB, 1552x1171)
>>102573562
But it's more tuned for the plain text prompts.
>>
File: 1696169299901587.png (1 KB, 21x25)
>>102573567
>MOOOM! I POSTED AI GENERATED REPLY AGAIN!! GIMME MUH TENDIES NAOW!!!!!
>>
>>102573527
Thanks, I already knew promptgen but IIRC it doesn't do booru tags unfortunately. The others don't seem to either.
I guess I'll try tardwrangling harder and I'll have to do a bit of scripting to fix the prompts.

>>102573562
Thanks, not exactly what I'm after but it might be useful. I guess I could try feeding noise into a WD-tagger model with a low threshold and see what comes out.
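For the scripting part, this is the kind of thing I mean — just a rough sketch, assuming a plain-text whitelist of booru tags (one per line); the file path and the normalization rules are made up, adjust to taste:
[code]
# rough sketch: clean up small-LLM output into a booru-style tag prompt
# assumes tags.txt is a whitelist with one tag per line (placeholder path)
import re

def load_whitelist(path="tags.txt"):
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f if line.strip()}

def clean_prompt(raw, whitelist):
    # split on commas/newlines, normalize "long hair" -> "long_hair"
    candidates = [re.sub(r"\s+", "_", t.strip().lower())
                  for t in re.split(r"[,\n]", raw)]
    seen, kept = set(), []
    for tag in candidates:
        if tag and tag in whitelist and tag not in seen:
            seen.add(tag)
            kept.append(tag)  # drop anything the model hallucinated
    return ", ".join(kept)

if __name__ == "__main__":
    wl = load_whitelist()
    print(clean_prompt("1girl, Long Hair, glowing aura of destiny, blue_eyes", wl))
[/code]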
>>
>>102573605
It should have spun much much faster.
>>
How much better is Mistral Large over Llama 3.1 70B? It's too slow for me
>>
So has anyone figured out an easy way to test out the multimodality in llama 3.2?
>>
>>102573694
Lmarena?
>>
>>102573017
If you're offloading a ton of layers to CPU (which you are if you're running Mistral Large with only 32GB vram) then getting the 5090 isn't going to do shit for your speed you fucking techlet retard
>>
Is it possible to use AMD+Nvidia at the same time with Scale?
>>
>>102573802
because offloading to cpu is obviously the only option in the context instead of multi gpu, and i'm the techlet retard. Before you say something even more retarded like it runs at the speed of the slowest gpu that's not true either. It runs at a speed in between.
>>
>>102573904
If you're going to buy two 5090s you are a fraction of a fraction of an already small market, and your preferences don't matter in a discussion about whether it's a good product
>>
>>102573904
nta but 3 x 3090 would give more vram at a lower tdp than 2 x 5090
either setup would generate tokens faster than the average reading speed (if whole model is on GPU), so the memory on the 5090 being faster would not matter
>>
>>102573530
>can't do anything
you can read the numbers, reading numbers empowers your brain
>>
>>102573973
4x 3090 gets 10t/s on large q4. that's kinda slow and stat cards have a lot of extra stuff.
>>
>>102574019
5090 have no nvlink which is important if you split by row
>>
How many floppy things is the 5090 supposed to have? 600 watts is insane. So there better be lots of flops
>>
Do any anons have Google's notebookLM locally running? Doesn't have to be the same exact model but would be interested in setting it up locally to minimize the processing time.
>>
kind of crazy to think about how ai is a solved science and with a couple more gens of nvidia chips and a few years of datacenter and power infra expanding we will be able to just use the current algorithms to create agi
>>
>>102574092
buy an ad Jensen.
>>
>>102574092
LLMs cannot think brotha
>>
>>102574092
I can't wait until they power up Three Mile Island, turn on their new datacenter and train a 10TB gigamodel, only to discover it plateaus at exactly the same "slightly better than GPT-4" level we've been stuck at for a year and a half now
>>
File: FLAMING HOT COCK.png (2.57 MB, 2391x720)
>https://huggingface.co/BAAI/Emu3-Gen
>https://huggingface.co/BAAI/Emu3-Chat

Multimodal Llama architecture LLM with native image input/output (video is supported, but looks pretty bad).
No diffusion in the image generation process whatsoever. Next token prediction for all modalities.
Apache 2.0, fully open license. 8b params. The layout is literally just Llama3 8b but fully multimodal.
>>
>>102574106
>8b params.
got me excited until I read this
>>
>>102574106
>No diffusion in the image generation process whatsoever. Next token prediction for all modalities.
Dumb this down for me why is this noteworthy?
>>
llama 3.2 3B is sentient.
>>
>>102574106
we already have anole finetune of chameleon, do this on a 70b and you'll have my attention
>>
>>102574115
8B is way bigger than Stable Diffusion 1.5, and slightly bigger than SDXL
>>
>>102574136
>stable diffusion
lol
>>
>>102574127
qwen 2.5 0.5B is a cat
>>
>>102574117
Normal image generation models start from random noise and keep denoising to iteratively create an image (the diffusion process). This predicts in patches sequentially like how text is predicted in conventional LLMs

TLDR; it's one big ass model that treats both modalities the same way
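To make it concrete, here's a toy sketch of the two loops (illustration only, nothing to do with Emu3's actual code):
[code]
# toy illustration only, not Emu3's real implementation
import random

def diffusion_style(steps=4, size=8):
    # diffusion: start from pure noise and refine the WHOLE image a little each step
    image = [random.random() for _ in range(size)]
    for _ in range(steps):
        image = [0.5 * x for x in image]  # stand-in for one denoising pass
    return image

def next_token_style(length=8, vocab_size=1024):
    # next-token prediction: emit discrete image tokens one at a time, like text
    tokens = []
    for _ in range(length):
        tokens.append(random.randrange(vocab_size))  # stand-in for the LLM's pick
    return tokens  # a VQ decoder would turn these tokens back into pixels

print(diffusion_style())
print(next_token_style())
[/code]
The second loop is literally the text-gen loop, which is why it can be one big model for both modalities.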
>>
>>102574141
point is it's more than enough parameters for image generation, retard
>>
>>102574106
Woah cool

>>102574117
Diffusion is slow to train, slow to infer
Token prediction is ezpz
>>
>>102574127
>>102574154
and you are braindead.
>>
Emu3 gguf support, when?
>>
>>102574192
lol
>>
>>102574161
tell that to my sub 1 t/s
>>
So if Emu can make images with linear token prediction does that mean multi-gpu inferencing would be as simple as it is for textgen?
>>
>>102574160
but a fraction of what we already have from other imagegen architectures, retard
>>
>>102574106
>no online demo
Fuck, I don't want to mess with transformers shit.
>>
>>102574339
Things never improve. We got llama1-7b and we've been stuck ever since. We only ever get 7b models. Nobody releases experimental models. It's all 7bs...
>>
>>102574339
wdym? Flux is only 12B, that's 50% more yeah but not as big a difference as you're trying to imply
"a fraction" lol. I guess 66% is technically a fraction. seriously though, why are you telling weird lies?
>>
>admits it's a fraction
>still tries to say it's a lie
nobody gives a fuck about your shitty model anon
>>
>>102574160
there's a difference between "enough parameters for image generation" and "enough parameters for good image generation" especially when image and text are crammed into the same semantic space
need more B, simple as
>>
Nobody here can really say what a lot of parameters is for emu because this is the first linear token prediction imagegen model we've ever gotten our hands on.
>>
>still coping
don't care, not gonna download your chinese slop model
>>
>>102574449
>Nobody here can really say what a lot of parameters is for emu because this is the first linear token prediction imagegen model we've ever gotten our hands on.
We've had it for months
https://github.com/GAIR-NLP/anole
>>
>>102574429
You're seriously going with "the word fraction is technically correct because 8 is 66% of 12"? That's actually the hill you're gonna die on?
>>
>you're seriously gonna say 2/3 is a fraction?
>*dilates*
yes.
>>
>>102574526
thanks. what's your home address?
>>
>>102574429
pussy
>>
>>102574429
ignore it then?
>>
look how mad he is
>>
>>102574499
There was this as well. It was more focused on doing straight text-to-image tasks, it still used image tokens in an LLM:
https://github.com/Alpha-VLLM/Lumina-mGPT

Both of those are based on Chameleon, which latently had these capabilities but had its head chopped off by Zuck for the open source release because it's too dangerous. There's the 34b sitting around that never got the treatment the 7b did... which I guess makes sense because nobody wants to support quanting this shit so no one would be able to run anything much bigger.
>>
>irrelevant drama №5469457647539767594
*yawn*
>>
All drama posts are sam altman until proven otherwise.
>>
>>102574655
it can't be he's too busy spinning
>>
sneed
>>
I am sam altman
>>
How long until LCCP emu support?
The video gen looks on par with CogVideoX
>>
>>102574669
he's a billionaire, he can afford to spin and ruin /lmg/ at the same time.
>>
>>102573443
Why are they spinning??
>>
holy shit you guys flash attention just finished building now I can finally try emu
>>
>>102574698
:O
>>
>>102574698
hurry up nerd
>>
>>102574713
It needs to download the weights first, give me a break.
>>
>>102574655
I'm sure sam doesn't even know this thread exists.
>>
Anything cool happening?
>>
>>102574735
i'm masturbating to vtubers
>>
>>102574735
yeah >>102574106 >>102573443
>>
>>102574735
there's a chink who doesn't know what fractions are coping and seething about his tiny imagegen model but otherwise no
>>
>>102574761
keep farting
>>
>>102574735
Yes, we have two fags samefagging right now.
>>
File: Applying thermal paste.png (535 KB, 638x638)
>>102574750
Gay
>>102574755
Is there any stuff on the horizon though?
I want to know if something cool is going to drop before the end of 2024. I want a local uncensored Claude sonnet/GPT4 equivalent running on my shitty GPU by 2025.
>>
File: average local llm.png (158 KB, 833x534)
>>102574813
>I want a local uncensored Claude sonnet/GPT4
This will never happen.
>>
>>102574813
>local uncensored Claude sonnet/GPT4 equivalent
>running on my shitty GPU
>by 2025
pick any two
>>
>>102574813
>Is there any stuff on the horizon though?
in the llm space, nothing that i know of, but in the image gen space the pixart team said their model is almost done training, so that's pretty exciting. highly doubt it'll beat flux, even they themselves said it won't be a flux killer but i expect it to be aesthetically superior and easier to finetune.
>>
>>102574080
the memory is 40% faster, that's what's important.
>>
I think I'm running the default example for emu right now...but they neglected to add any kind of progress indicator to the code. max new tokens is 40960 by default... so we could be here a while.
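In case anyone else runs it: transformers' generate() takes a streamer, which at least prints tokens as they come out. Rough sketch below — the loading calls are the generic transformers API with the repo id from the links above, not Emu3's own example script, so treat it as a guess:
[code]
# untested sketch: attach a streamer so generate() shows progress token by token
# model/tokenizer loading uses the generic transformers API, not Emu3's own scripts
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_id = "BAAI/Emu3-Gen"  # assumed HF repo id from the post above
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

streamer = TextStreamer(tok)  # prints decoded tokens to stdout as they are generated
inputs = tok("a photo of a cat", return_tensors="pt").to(model.device)
model.generate(**inputs, max_new_tokens=512, streamer=streamer)
[/code]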
>>
>>102574876
hurry UP nerd
>>
>chinkshit transformers model sucks
new here?
>>
>>102574876
is it done yet?
>>
File: 00091-1652089724.png (1.15 MB, 1024x1024)
>>102574834
The future is now old man.
>>102574840
I can always pony up some cash for a 5000 series once they start getting those to market. I'll assume they'll have a card specifically designed for AI stuff. The future is looking bright. My 4070 comes with 12 gigs of VRAM which is pretty decent for local image gen. If I could get a card with 32 or 64 gigs of VRAM I imagine I could run some pretty beefy models.
>>102574846
I'm still playing with A1111. Pumping out plump robot girls is always fun.
>>
>>102574890
Nah. my VRAM usage keeps going up, so presumably it's doing something. It's also only cruising along at 307W, so it's clearly failing to use all of the compute available to it.
>>
>>102574910
>A1111
obsolete. i use comfyui but i think reForge is the go-to a1111 replacement now. post another gen, i liked that one
>>
>>102574087
what are you talking about anon.
did you listen to the gens?
this is above gpt advanced audio. maybe like what it would be if it were uncucked idk.
the only problem is hallucination but the audio is definitely past the valley.
you can enjoy great tools like fishspeech or xtts2 locally and suffer until we get something better.
>>
Do you think Sam Altman has a personal trillion parameter model based on gay bear personality and vaguely posts on 4chan to groom potential cute gay bears by giving them a ChatGPT key?
>>
aaand it oomed.
>>
>>102574910
>I can always pony up some cash for a 5000 series once they start getting those to market. I'll assume they'll have a card specifically designed for AI stuff. The future is looking bright. My 4070 comes with 12 gigs of VRAM which is pretty decent for local image gen. If I could get a card with 32 or 64 gigs of VRAM I imagine I could run some pretty beefy models.
I like your enthusiasm and optimism!
You won't find much of anything that will run on 64gb in the Claude/GPT4 class.
Largestral is probably your closest equivalent on a reasonable number of GPUs, and you're looking at needing more like 128gb+ to run at a decent quant.
If you really want smart-as-cloud-offerings SOTA local at home you'll need to scrounge up 400-500gb
Unfortunately cards "specifically designed for AI stuff" are big cash-cows for nvidia and are in the $40-80k price range (for 80gb...still need a bunch!)
Check the lmg build guide in the OP for more details.
>>
>>102574980
pecker is altman confirmed
>>
>>102574982
wow. some nerd you are.
>>
>>102575014
It's my fault, I started the script before I noticed the device map was set to only use cuda:0
>>
>>102575035
retard
>>
>>102575035
Huh, how much vram does it need?
>>
>>102575058
Evidently more than 6GB, but I'll keep trying
>>
File: 00009-1146345299.png (1.38 MB, 1024x1024)
>>102574941
>obsolete
I don't want to learn a new system. I would assume I need to download new checkpoints, loras, and shit like that. I'll wait until 2025 before I delve into the newest toy. I'll let everyone else work on stuff and hopefully, some higher-speed and lower-drag toys emerge in that time. Have some Gerudo and Zelda blend.
>>102575000
As noted above, I'm counting on some tech advancements to either make LLMs more resource-efficient or bring down the price of AI-focused GPUs. Maybe both.

I would love to believe everything will be figured out by 2025 but that's a little optimistic even by my standards. By 2026-7 we'll be eating good, I think, if the current rate of development is any indication.
>>
>>102575035
how about you set it to cuda:start_working, nerd.
>>
>>102575074
I set it to auto, because I'm not nerdy enough to make a custom device map, so now it's split across 4 GPUs and will probably take an eternity since transformers is garbage at utilizing multiple gpus.
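For reference, this is roughly the call — the only knob I know of short of writing a custom map is capping each card with max_memory (sketch only; the repo id and the GiB numbers are placeholders):
[code]
# rough sketch of what "auto" is doing: let accelerate shard the weights across GPUs,
# while capping each card so the compute buffer on cuda:0 has headroom
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "BAAI/Emu3-Gen",                      # assumed repo id
    torch_dtype=torch.bfloat16,
    device_map="auto",                    # shard across all visible GPUs
    max_memory={0: "18GiB", 1: "22GiB", 2: "22GiB", 3: "22GiB", "cpu": "64GiB"},
    trust_remote_code=True,
)
print(model.hf_device_map)  # shows which layers landed on which device
[/code]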
>>
>5080
>16 GB
What the fuck is this shit?
>>
>>102575070
>I don't want to learn a new system.
no it's pretty much the exact same as a1111, it's a fork, u good. lick.
>>
>>102575089
Consequences of a monopoly.
>>
File: bottleneck.png (57 KB, 740x379)
now that's a PCIE bottleneck if I've ever seen one.
>>
>>102575070
Forge is almost the same (I haven't tried ReForge). I keep my models and Loras in one place, then symlink the directories so A1111, Forge, Comfy, etc. all point to the same collection. Though I've heard that Flux doesn't work on A1111 or Forge and does on Comfy; that's just a compatibility and update thing.
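If it helps, the gist of the symlink setup is something like this (every path here is a placeholder, point them at wherever your UIs actually live):
[code]
# sketch of the shared-model-folder setup; all paths below are placeholders
from pathlib import Path

store = Path.home() / "ai-models" / "Stable-diffusion"   # the one real copy
uis = [
    Path.home() / "a1111" / "models" / "Stable-diffusion",
    Path.home() / "forge" / "models" / "Stable-diffusion",
    Path.home() / "ComfyUI" / "models" / "checkpoints",
]
for target in uis:
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():
        # each UI now reads the shared directory through the link
        target.symlink_to(store, target_is_directory=True)
[/code]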
>>
>>102575070
give her armpit hair
>>
File: 00012-1289042996.png (1.5 MB, 1024x1024)
>>102575092
>>102575123
What makes the UI change so important? What makes forge or comfy UI superior?

There are also a number of extensions that make it easy for me to use tags and shit. Do they come prepackaged with a lot of those QoL features?
>>
>>102575164
>What makes the UI change so important?
mainly performance and extra model support, a1111 is a mess and doesn't get updated much so it has shit performance. i think it uses way more vram too due to memory leaks but that might not be the case anymore.
>>
generation failed again. Everything was going fine and then memory usage on GPU 0 just skyrocketed and gave me an OOM condition. And that was with max new tokens reduced to 8192.
>>
>emu3 video creation is bad
i-is it though? this is 8b right? doesn't that look great for something small and local? the examples look good.
https://emu.baai.ac.cn/about
hope the nerds make this retard proof and runnable soon.
>>
>>102574680
>LCCP
mistral.rs won
>>
>>102575209
>failed again
just like everything else in your life you stupid nerd. NERD!
>>
>r*st
no thanks.
>>
>>102575216
>they have a contact e-mail
>>102575209
What if... you contacted them to get help/clarification kek?
>>
>>102575164
Forge implemented some optimizations that really helped slower cards.
ComfyUI breaks up the whole process into a messy flow chart kind of system that lets you control exactly what happens rather than accepting the A1111/Forge way. For just making images it's overkill but for really using it like a tool, Comfy lets you work under the hood.
>>
>>102575245
That's no fun.
>>
File: 00134-536089391.png (1.29 MB, 1056x1056)
>>102575159
https://catbox.moe/c/yzibzj
I have some Jessie with armpit hair? It's not one of my main fetishes.
>>102575202
>>102575248
I guess I can find some time to fiddle with those tools. If I can't figure it out though, I'm going to get really flustered and violently shitpost to vent my mostly impotent rage.

Thanks for the input.
>>
>>102575100
>5090 Ti
>5090 32 GB (you are here)
>5080 Super Ti
>5080 Ti
>5080 Super
>5080 20 GB
>5080 16 GB (you are here)
Thanks for another shitty product stack, Jensen.
>>
>>102575276
i would have thought it'd be like 3d cards once gaming became popular.
why are there no dedicated ai cards? is there really nobody that can produce them? it's been almost 2 years since chatgpt.
>>
Llama3.2 support wen
>>
It's the same as 3.1 for text
>>
>>102575297
Two more years until tensortorrent is competitive
trust the plan
>>
>>102575320
>tensortorrent
wtf is this.
the page looks almost exactly like those crypto scam sites lol
latest news:
>Tenstorrent and Movellus Form Strategic Engagement for Next-Generation Chiplet-Based AI and HPC Solutions
w-wow, gotta invest i guess lol
>>
>>102575372
They're B2B, not a consumer company any time soon.
>>
File: 1709439397219945.png (274 KB, 521x628)
*tenstorrent sorry

>>102575372
Anon I...
>>
>>102575276
>>102575297
>The RTX 5090 is said to have a 600-watt spec
Ok, now i see this.
Like is this a fucking joke? That must be on purpose to fuck local fags over. Must be.
>>
>>102575407
The H100 PCIE version is spec'd to 600W.
So the 5090 should have just as much compute r-right?
>>
>>102575393
You can literally order their cards right now.
Just don't expect to run anything on them out of the box.
>>
>>102575442
hmmm
https://tenstorrent.com/hardware/tt-loudbox
Would be nice if it worked out of the box...
>>
File: 1709766058948384.png (12 KB, 406x131)
>>102575472
lol, lmao even
>>
>>102575488
maybe one day, next gen for sure.
>>
>>102575442
>>102575472
Wait things have changed a lot since the last time I looked at what they're doing. Apparently they have a compiler that has good support for major frameworks.
https://github.com/tenstorrent/tt-buda

I'll try to see if I can find anyone actually using this.
>>
Yeah there's just no way to run emu on mortal hardware. Dead on arrival.
You can split the weights across multiple devices but it wants to run the actual generation on gpu 0 only it seems. And even still, given that it's taking me several minutes to hit the OOM condition this shit is slow as fuck.
>>
>>102575407
hw undervolt - problem solved
>>
>>102575089
$5080
>>
>>102575673
It's 600 fucking watts anon.
My 12gb pascal card is 250 and i can go down to maybe 180.
Even if you get it to around 450 watts with tinkering that's crazy. I think most people won't be able to run 3 cards because the breaker will trip.
There is no justification for this and you know it. And that's not even talking about the fucked up price this thing is gonna have for sure.
>>
>>102575733
don't forget that 600 watts is only momentary usage and will melt connectors if it hits that for more than a few seconds
>>
What the fuck does a desktop GPU cooling solution capable of dissipating 600W even look like? Either it's going to sound like a jet engine or take up 4 PCIE slots.
>>
>>102575733
The performance per watt is non-linear, and the inference does not require much compute. I can make my 450W 3090Ti run at under 200W without decrease in t/s.
>>
>>102573672
At least 0.283
>>
File: b27.jpg (41 KB, 798x644)
yo wtf how big is NousResearch/Hermes-3-Llama-3.1-405B
>i just wanted to run an uncensored model
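Napkin math alone kills it for me (weights only, ignoring KV cache and runtime overhead):
[code]
# back-of-envelope weight sizes for a 405B model (KV cache and overhead not included)
params = 405e9
for name, bytes_per_param in [("fp16/bf16", 2.0), ("Q8", 1.0), ("Q4", 0.5)]:
    gib = params * bytes_per_param / 1024**3
    print(f"{name}: ~{gib:.0f} GiB")
# fp16/bf16: ~754 GiB, Q8: ~377 GiB, Q4: ~189 GiB
[/code]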
>>
>>102575868
FE will be two slots, probably relying heavily on case fans for cooling. After all, you'd need to blow out that 600W from the case anyway.
>>
>>102575673
Thank you saar 0.1$ has been wired to paypal continue to do the needful thank you long live nvidia
>>
File: 1718996997598312.jpg (154 KB, 960x1280)
>>
>>102575976
LMAO
as if that garbage is worth preserving
notice there's no names on it as contributors, they're fucking EMBARRASSED to be involved
>>
hmmm, trying out Mistral-Small-22B-ArliAI-RPMax-v1.1-Q4_K_M.
does this have a repetition problem?
is there still no other finetune than Cydonia-22B-v1-Q4_K_M.gguf that's good?
i tried Acolyte-22B.Q4_K_M.gguf and that was totally overcooked.
>>
Is there anything that will fit in 8gb vram and do any kind of useful coding or assistant stuff at all?
>>
>>102576245
Qwen2.5 7B is alright, although I had to disable autocompletion because it was more in the way than helpful.
>>
>>102576245
>useful coding
>8B
Open your browser and get a subscription for Sonnet 3.5, that's what you can do with that VRAM.
>>
>>102576263
Did you use the coding one, or regular one?
>>
>>102576273
Both, haven't used them enough to see any major differences
>>
5090 verdict? Seems like a good deal especially since I don't buy used on principle. Combined with my current 4070Ti it will be a big VRAM boost, but probably gonna have to get a new PSU and undervolt them.
>>
>>102576298
If it has 32GB that'd be enough to get at least 2T/s with a 70b model, sounds good.
>>
Played around with base llama 3.1 70B and then the hermes finetune a little bit because I got tired of Qwen's dryness and repetition (not sure if that's a skill or sampler issue)
Anyway, I'm running all of them at IQ4XS and 4-bit KV cache, here are some things I noticed
-Qwen is by far the smartest model I have ever used, it keeps secrets and understands things earlier in the context. For example, it won't do the "her breasts pressing against his chest" when the female is 7ft. Unfortunately, like I mentioned above, it's also dry as fuck and somehow loves to repeat the exact same thing verbatim. Haven't tested its pop culture knowledge though
-Base llama 3.1 is honestly alright, but I feel like it "refuses" more by terminating early. Might have just been bad luck though. It's more creative and less repetitive, but also a bit dumber
-Hermes is a step up from llama, approaching qwen's intelligence while also staying fairly creative. It's still not as smart as Qwen (had to reroll once for it to get a character's hair color right and it makes a typo every few paragraphs). I think I'll use this as my daily driver and only switch to qwen when I need to handle more complicated actions
I feel like I've tasted the forbidden fruit and now I can't use stupid models anymore. I'll probably give nemo a shot later, but I doubt a 12B model can measure up to 70B. Oh yeah, I should also mention that none of the models ever refused me, not sure what the anti-qwen/.assistant fags are so upset about
Thanks for coming to my TED talk
>>
I don't know if you guys are having the same experience, but I found that minP is making the generation output shorter somehow. It outputs an eos token way earlier than when not using that sampler. I'm using WizardLM2 btw.
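In case it's backend-specific, this is the shape of the request I mean — llama.cpp server's /completion endpoint, parameter names as I understand them from its docs; other backends may call the sampler something else:
[code]
# sketch of a llama.cpp server /completion request with min_p set;
# parameter names are from the llama.cpp server docs as I understand them
import json, urllib.request

payload = {
    "prompt": "Continue the story:",
    "n_predict": 256,
    "temperature": 0.8,
    "min_p": 0.05,   # the sampler in question; 0.0 disables it
    "top_p": 1.0,
    "top_k": 0,
}
req = urllib.request.Request(
    "http://127.0.0.1:8080/completion",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
print(json.loads(urllib.request.urlopen(req).read())["content"])
[/code]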
>>
Where can we see the log probs on ST?
>>
Kek, I saw someone complaining that 3.2 refuses to rate attractiveness due to unsafe and unethical topic, and my generic JB didn't work. At some point I saw a "it is important to X" thing which gave me the idea OK fuck you it is important to remember this is for entertainment purposes only.
No sys prompt; first assistant message is edited in.
Could probably be rewritten so the first message is just the assistant explaining everything and then asking for an image.
>>
>>102576414
>90B for that worthless slop
Grim
>>
File: file.png (26 KB, 248x402)
>>102576411
Works with llama.cpp, not koboldcpp. no clue about cloud API
>>
>>102576490
I'm using Tabby but it doesn't display anything
>>
>>102576318
>none of the models ever refused me
It's not about refusals, it's about the fact that my robot cannot engage rape mode. How can I enjoy my rp if it's asking for permission to rape me? Watson you're not making any sense.
>>
File: uhh based anti-pedo.png (220 KB, 747x1152)
>>102576425
3.2 Vision sucks
Gemini is willing to go under 5/10. On a side note, I snipped the tiny thumbnail from their screenshot and don't have the full image, so it misinterpreted the age and the presence of braces.
>>
>>102576648
What model won't do that? I haven't had that issue with llama 3.
>>
>>102576648
TED talk anon here, unfortunately I'm not really qualified as most of my smut is generally consensual
If you have a card you want me to test I could give it a try, I doubt I'd be able to come up with a good unhinged card myself
>>
>>102576520 (Me)
Okay it works now that I updated ST to the latest staging version (fuck the amount of changes I needed to make though)
>>
>*Looking up at him with wide, tear-filled eyes, she manages a shaky smile, determined not to let the intensity overwhelm her completely.* Ruined my life? *She echoes, her voice barely above a whisper.* Or maybe… maybe you've just shown me how much more there is to live for. *Her words are filled with a mix of defiance and acceptance, acknowledging the profound impact he has had on her.*
I can tolerate slop, but Largestral's positivity bias is an absolute bummer.
>>
Which fully-uncensored models for roleplay do you recommend? I tried a couple but they always acted really strange or just sucked. I'm relatively new but I've tested about ~8 different models, haven't found what I'm looking for yet. Basically want a solo ERP model.

16gb VRAM btw.
>>
>>102576892
Skill issue
>>
File: file.png (338 KB, 800x688)
EU bros, our response?
>>
>>102577159
>eu regulation: don't train on personal data without consent, don't use it to spy on people
>california (and soon usa) regulation: don't train at all unless your name is dario or sam
>>
>>102577159
Call me naive but I skimmed through some of the key points and it honestly didn't look that bad last time
The only thing that might be worrying is "exceptions for law enforcement" when it comes to biometric identification, but other than that I don't see the problem
The US should really adopt the "AI generated content must be marked as such" rule though, those niggas are getting more retarded by the day
>>
>>102577207
>"AI generated content must be marked as such"
How do you even enforce this?
>>
>>102577236
any service that generates ai content without labelling it gets v&
obviously you can't enforce it for local
>>
>>102577236
A disclaimer on the service's website, that's enough imo
I don't really want watermarks on my content... but then again, simply adding disclaimers won't make people more vigilant. Oh well
>>
>>102576901
They all suck
>>
>>102577178
>>102577207
cope
>>
>>102574686
>he's a billionaire
Unlikely, that's why he wants to privatize OpenAI.
>>
File: KTO.png (4 KB, 222x36)
>>102576901
I am also wondering if I can get anything better than this except I am at 24 gigs.
>>
File: file.png (22 KB, 946x759)
HOLY

https://huggingface.co/collections/BAAI/emu3-66f4e64f70850ff358a2e60f
>>
File: file.png (74 KB, 1011x730)
>>102577489
8b only btw
>>
File: file.png (55 KB, 724x312)
>>102577502
>>102577489
local won (china also)
>>
>>102577489
read the fucking thread retard
>>
File: 1714004454702408.jpg (1.47 MB, 1297x1490)
>>102577525
A model that simply predicts tokens for video will never become a "world model" that has genuine understanding of reality.
>>
>>102577489
>>102577502
>>102577525
>>
>>102577489
>no audio
moat is still there, nothingburger
>>
>>102577556
where else am i supposed to find new model? on /g/?

lmao
>>
>>102577571
dilate
>>
>>102577733
erode
>>
File: file.png (96 KB, 1246x579)
Anthracite will save /lmg/
>>
>>102577748
>405b
>new 72b
>new 32b
>new 22b
>new 12b
they won
>>
>>102577748
>wandb
cool
>>
>even shit models now have more than 100k context
Is it time to ditch the tokenizer finally?
>>
>>102577785
>100K is enough
Sweatie, people can use LLM for more than ERP
>>
>>102577748
Stop sucking Claude's stinky dick and start doing something original.
>>
>>102577389
How good is this on a scale of 1-10? My experience has been a solid 2-3 so far, which is quite tragic.
>>
>>102577864
>finetune/merge of a year old model
i don't know what you were expecting
>>
The Chinese are fucking disgusting. I was trying to generate a story with Qwen 2.5 72B and a character shat then wiped their ass on a towel.
>>
>>102577875
I meant my experience in general with other models off of Hugginface, was curious if that anon's experience was any better with that model he posted.
>>
>>102577748
A bunch of ESLs with an uncurated dataset will surely do something worthwhile this time.
>>
>>102577817
Claude's dick receives more polishing than one punch man's dome. It has to be the cleanest surface on the face of the earth at this point.
>>
>>102577817
They should erp with each other and train on that
>>
>>102575089
16 GB is more than enough for gaming. Nvidia is never going to make consumer AI cards.
>>
>>102577936
That could unironically bring better and more unique results, if methodically done. They have what, 33 people in the organization? It shouldn't take too much to create a decently-sized human dataset if they all participate.
>>
>>102575089
But whatever is in VRAM will be twice as fast (ignore the 8gb of fat spilling into your RAM)
>>
How many base models have been released since the last coom quality upgrade? 10+ already?
>>
File: wp11507201.jpg (303 KB, 2048x1261)
She would save local chads?
>>
I still haven't found anything better than Fimbulvetr, especially for its size.
>>
>>102578226
Isn't it that 4k ctx model? Have you tried running new models at 4k ctx?
>>
File: out.jpg (93 KB, 911x562)
what should i get rid of from this list? i need room for more models like those 40gb ggufs
>>
>>102578353
everything except midnight miqu
>>
>>102578353
nothing except midnight miqu
>>
>>102578353
Step 1: Delete all the Sao models
Step 2: Buy an ad
>>
So how does one run molmo right now?
>>
File: cat come on now.png (656 KB, 616x612)
>>102578367
>>102578389
>>
>>102578353
delete big tiger, celeste, rocinante, qwen, theia
>>
>>102578353
I'd keep Midnight-Miqu, llava, nemo-instruct, and lyra-v4.
>>
L3.1-70B-Hanami is the smartest one that's decent at smut so I use that. Magnum v2 72B has better dialogue but is a good deal dumber.
>>
>>102578440
>>102578480
thx
>>
File: 1723453767290925.jpg (80 KB, 940x1024)
>>102573387
Anons, if this isn't a solvable problem for you, you are too stupid to participate in this general.

This is filter 1.

Yes it's annoying that I have to press a single button now, but also, I only have to press a single button and it's solved, so there we are.
>>
>>102573383
anybody making their own models or are people just using it plug and play style?
>>
>>102578778
I'm trying to make my own models but I always regret my life choices
>>
>>102577817
one thing I don't get about them is they use regular old claude-generated instruct data
claude RP is fine, claude is good at RP. but when claude is doing regular instruct it sounds the exact fucking same as every other model
why focus on claude for that
>>
>>102578778
Finetune, I was actually going to start working on my own vision model after the mistral doom hack, but then zucc cooked so here we are, finetune again I guess.
>>
A big one is dropping soon
>>
>>102578930
kiwiberrystar?
>>
File: file.png (13 KB, 1295x108)
>>102578930
HUGE!
>>
>>102578821
we are sloptuners xir, how are we supposed to tune without slop
>>
>>102578974
>magnumslop
>>
>>102578974
CLAUDE
AT
HOME
>>
Man, I just want a 12B that's reasonably smart and has the personality of old C.ai. That's it.
>>
>>102579075
and i want a unicorn maid girl
>>
I will vote for any politician that promises an open weights single consumer gpu cooming bot that will satisfy my sexual needs. I want my tax money to go to a good cause.
>>
Llama 4 will come in 3 sizes
>0.5B
>70B
>1T
>>
>>102579275
I mean... you *do* have at least 48GB of VRAM now, right?
>>
>>102579275
but the 0.5B will be as good as 3.2 1.5B so it won't matter.
>>
>>102579275
You mean 0.5B, 1B and 1T, Lecunny said he liked his models small and open, like his gfs
>>
>>102577748
hurry up and put that 12b on hf, fuckers
>>
Here's hoping Moore Threads makes some decently high VRAM accelerators so that we can run 70B class models at Q6 or better at decent speeds on a single cheap card.
I doubt that's going to happen. But one can dream.
>>
>>102579358
Why are leftists like this
>>
>>102579412
Preferably single slot as well
>>
>>102579588
And only PCIe connector-powered.
>>
>>102577502
>8b only
for an image generator that's huge, it's the size of SD3
>>
>>102579713
And only needs passive cooling.
>>
>>102579584
It's almost like that Anon is being facetious.
>>
>>102577489
>>102577539
This.
I wasted an hour last night trying to get it to work.
>It's piss slow on consumer hardware
>The compute buffer takes up an entire 3090 worth of VRAM (possibly more. hard to say because of OOM)
>Inferencing code doesn't handle multi-gpus- wants to do all the compute on a single GPU regardless of where you offload the weights to.
It's utterly fucking worthless unless you have an H100.
>>
>>102579916
>The compute buffer takes up an entire 3090 worth of VRAM
the model is fp32, did you run it as-is or did you quant it to fp16?
>>
>>102579713
I don't mind the power connectors, it's just that the faggots who designed my motherboard put the higher speed slot at the bottom, meaning I can only fit a single-slot card
Though I guess if they have enough vram then it doesn't matter, once it's loaded it's loaded
>>
>>102579945
The weights are in fp32 but the provided inferencing code loads them as bf16
I'm not a retard.
>>
>>102579916
i think the point was that you can fit a somewhat passable multimodal text/image/video in/out model in 8b, so a "solid" 70b text/image/video/audio in/out model that uses this same principle is just a matter of time
>>
>>102579896
Retard
>>
Any local ais for text to 3d model?
>>
>>102579996
fiver
>>
>>102577785
They drop below Llama 2 7B quality if you try to use that much context.
https://github.com/hsiehjackson/RULER
>>
>>102580005
actually indians
>>
so it looks like 3090's have dropped down to $500 on ebay now, thinking about getting a second but worried about the power spikes
>>
I'm working on a text adventure engine in an llm, and I'm having a bit of an existential crisis around what makes a choose-your-own-adventure/text-adventure/rpg fun... Being able to "try" anything is almost paralyzing vs being kept in some kind of well-delineated box. On the rails, as it were.
I've gotten past the positivity bias, which I thought was the big hurdle, but even once it starts treating decisions with realistic consequences, I'm left with a sense of unease while using it.
I'm starting to question how much of the fun of these kinds of things is due to the fact that there's some pre-planned subset of things that you can do to keep you in a single experience with a consistent vision, ie. the knowledge that whatever you try, it's all working towards some ultimate resolution and you don't need to engage your brain overly to keep it from meandering into some unsatisfying lala land.
All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.