/g/ - Technology

File: 00703-2979877490.png (939 KB, 1040x720)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101224321 & >>101214216

►News
>(06/28) Inference support for Gemma 2 merged: https://github.com/ggerganov/llama.cpp/pull/8156
>(06/27) Meta announces LLM Compiler, based on Code Llama, for code optimization and disassembly: https://go.fb.me/tdd3dw
>(06/27) Gemma 2 released: https://hf.co/collections/google/gemma-2-release-667d6600fd5220e7b967f315
>(06/25) Cambrian-1: Collection of vision-centric multimodal LLMs: https://cambrian-mllm.github.io
>(06/23) Support for BitnetForCausalLM merged: https://github.com/ggerganov/llama.cpp/pull/7931

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
►Recent Highlights from the Previous Thread: >>101224321

--The Future of AI Models: Transformers and Infinite Context Windows: >>101224457
--Gemma 2 Livebench Results Surpass Expectations: >>101233145 >>101233182 >>101233211 >>101233216 >>101233277 >>101233316 >>101233324 >>101233468 >>101233642 >>101233455 >>101233521 >>101233550 >>101233562 >>101233576
--Magnum 72b: Sloppiness Concerns: >>101230582 >>101230592 >>101230682 >>101230626 >>101230693 >>101230706 >>101230777 >>101230694 >>101230752 >>101230790 >>101230825 >>101231361 >>101231410 >>101231446 >>101231500
--Is mistralrs Worth Setting Up?: >>101232996 >>101233062 >>101233146 >>101233275
--Quant Comparison: >>101233903 >>101233980 >>101234292 >>101234543 >>101234566 >>101234578 >>101234616
--Gemma2-27B: Surprisingly Good Performance After PR Application: >>101230856 >>101231895
--Gemma 9B in mistral.rs: Memory Issues and Potential Solutions: >>101230330 >>101230382 >>101230379 >>101230412 >>101230505 >>101230469
--Pioneering with 3090 Ti and 2GB VRAM Modules: >>101224436 >>101225737 >>101226124 >>101226528 >>101226552 >>101227648 >>101228138 >>101229717 >>101231635
--Optimizing AI Model Performance in Group Chats: >>101229091
--Old AI Models and Their Chatlogs: >>101231633 >>101231742 >>101232088 >>101232114 >>101232144 >>101232168 >>101232207
--LLMs in Antivirus: Inefficient and Ineffective: >>101232216 >>101232274 >>101232302 >>101232378
--Comparing AI Models: TETO, Typhon, and Limarp: >>101229799 >>101229976 >>101230042 >>101230056
--27B is better than 9B, according to anon: >>101232235 >>101232249 >>101232313 >>101232335 >>101232348 >>101232429 >>101232656 >>101232693 >>101232759 >>101232821 >>101232886
--Frustration with Gemma 2 27b Output: >>101227465 >>101227626 >>101227660
--Miku (free space): >>101224828 >>101228187 >>101228786 >>101229043 >>101229103 >>101230806 >>101231556 >>101232956 >>101232986 >>101233536 >>101234205

►Recent Highlight Posts from the Previous Thread: >>101224328
>>
For me, it's Yi.
>>
I tuned and quanted Wizard 8x22 on limarp, but can't for the love of fuck find any good sampler settings. It's like I'm circling around the sweet spot, and yet when I think I've got it, it begins to either repeat an entire post or just disobey entirely.

Is it a sampler issue? Did I fuck up training it? Maybe fucked up making the exl2 quant??
>>
when did you guys advance past the need for meme merge models? Just 6 months ago everyone here would've been baited by some shit like Sthenomaidblackrootgigaultrasupercoomer
>>
>>101235057
You didn't even test the model before doing the exl2 quant? Do a gguf first and check if it works. Make sure you use the temp file flag or it will eat your RAM for breakfast.
>>
>>101235160
The only merge that I have used is the Mixtral LimaRP one.
>>
>>101235160
you'd need some good tunes of the newer models first to make merges
there are none so there's nothing worth merging
>>
HMM, I just did an even quicker test of story writing. It seems that Q8_0_L is, in fact, closer to FP16 than Q8_0, at least in the small sample I ran. It's not a big difference, but it's there in this test. So yeah, if you use Q8_0 and you have a bit of extra VRAM, get Q8_0_L. All other _L quants are to be ignored.
>>
>>101235160
meme merging seriously stopped with Mixtral limarp zloss.
It's kind of... the best option rn unless you're running something better.
>>
Fixing llama.cpp with miku
>>
>>101235160
Never used a merge. I never believed them to be worth using. The first one I checked out had a comment in the readme to the tune of
>I really don't know what prompt format you should use, so whatever.
It was a merge between two or three finetunes, all used different formats.
I'm ok with experimental stuff on models, like removing layers or whatever to see the effect it has, but thinking that you can get the best attributes of different models by just merging them is retarded.
>>
Testing 27B using fp16 transformers (so no broken quants), it's definitely a step up for its size class. It's smarter than anything else I've used in the 25-35B range.
But it's not smarter than L3-70B or Qwen2-72B, the people making that claim seem to be delusional.
>>
>>101235160
When llama3 dropped
>>
>>101235160
just use stheno 3.2 8b models got solved :thumbsup:
>>
>>101235057
if you need complicated sampler settings for a model that smart, something's fucked up.
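Start near-neutral and move one knob at a time. A sketch against llama.cpp's server /completion endpoint; these values are just a common starting point, not gospel:

[code]
curl http://localhost:8080/completion -d '{
  "prompt": "...",
  "n_predict": 256,
  "temperature": 1.0,
  "top_k": 0,
  "top_p": 1.0,
  "min_p": 0.05,
  "repeat_penalty": 1.0
}'
[/code]

min_p alone kills most of the gibberish without flattening everything the way low temp does.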
>>
>>101235266
It's retarded and horny.
>>
>>101235248
makes sense. I think it's just people hoping they can do more with less. It's still impressive how far the current weight classes have come, though.
>>
>>101235282
yeah it's like looking in a mirror, that's why i like it
>>
>>101235160
I used to like custom finetunes until I actually used Meta's official finetune for L2 instruct. Then I understood that ko-fi finetuners have no idea wtf they're doing.
>>
>>101235293
Yeah once all the quant issues are ironed out I think it'll be great for people without fat vram, and we'll see some nice RP/story tunes. Definitely the new SOTA for sub-70B. It's just not the holy grail or anything.
>>
>>101235248
You implemented sliding window attention and logit soft capping? And did you also use the correct chat format? It's the <start_of_turn> / <end_of_turn> one.
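For reference, the full released template (per Google's model card; note there is no system role):

[code]
<start_of_turn>user
{prompt}<end_of_turn>
<start_of_turn>model
{response}<end_of_turn>
[/code]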
>>
>>101235248
I compared Gemma 2 Q8_0_L 27b with qwen2 72b Q4_K_S

gemma was much better at staying in character and refusing the user's pleas ("stop this whining, i'll kill you anyway")

qwen2 was retardedly submissive ("ok i'll let you live, let's bond instead")

shame that gemma is still plagued with bugs though, not really usable in llama.cpp right now
>>
>>101235248
Also I hope you know that it is not compatible with flash attention or SDPA
>>
>>101235333
nta but the soft capping is just for making it not fall apart at longish context I believe, it doesn't have any effect on the model's intelligence on short prompts
>>
https://github.com/ggerganov/llama.cpp/pull/8244
tokenizer convert fix for gemma 2 was merged

do we need to reggoof?
>>
>>101235178
You know what? Fair, I'll just test the fullscale model rn
>>
>>101235361
>nta but the soft capping is just for making it not fall apart at longish context I believe, it doesn't have any effect on the model's intelligence on short prompts

It does though. That is one of the main issues.
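For anyone wondering, the capping itself is tiny: it's just a tanh squash on the logits. A minimal sketch; the cap values live in the HF config as attn_logit_softcapping and final_logit_softcapping, if I'm reading it right:

[code]
import torch

def soft_cap(logits: torch.Tensor, cap: float) -> torch.Tensor:
    # smoothly bounds values to (-cap, cap); near-linear around zero
    return cap * torch.tanh(logits / cap)
[/code]

Gemma 2 applies it to attention scores before softmax and to the final LM-head logits, so skipping it changes every forward pass, not just long-context behavior.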
>>
>>101235354
I'm using eager attention not flash (flash doesn't work in transformers anyway so I couldn't even if I wanted to).
>>
File: gemma481297.png (8 KB, 1237x69)
>>101235350
>another commit to test
>https://github.com/ggerganov/llama.cpp/commit/5fac350b9cc49d0446fc291b9c4ad53666c77591
>>
Yeah I think I'm gonna wait things out.
>>
Give it another 2 weeks imo. It's gonna take a bit for the common backends to get it all sorted out.
>>
>>101235380
well, someone please do the needful and kindly redeem a gguf for me to test, sirs
>>
>>101235350
Base qwen is lame, at least use the Tess finetune. I prefer Euryale, though.
>>
>>101235362
Change in the conversion. So yes.
>>
Was SWA a mistake?
>>
>>101235293
>I think it's just people having hope they can do more with less.
I heard that Karpathy guy mention that the whole field of AI is just reinventing compression, so less is more.
>>
File: Hungry_pumkin.jpg (43 KB, 383x331)
>>101235400
>>101235393
NO
i want to COOM TO GEMMA
NOW
AAAAAAAAA
>>
>>101235409
It might be how Gemini has such a long context. Maybe Google figured it out when Mixtral couldn't.
>>
>>101235401
Download the full model and do it yourself. You have the tools to do it.
>>
Also a question - mistral.rs has that "make a MoE out of anything" feature. Can I, like, make a 27b Gemma MoE out of it potentially?
>>
>>101235423
What do you mean by long context? I thought they said it was 8k.
>>
>>101235416
It's compressing data very lossily. The trade-off being that, for the first time, we've managed to compress intelligence. That's not the same as the model itself having intelligence, mind you.
>>
>>101235430
I'm talking about their main model, 2M context that actually works.
>>
>>101235426
>https://github.com/EricLBuehler/mistral.rs/blob/master/docs/ANYMOE.md
What do you think?
>>
>>101235202
>>101235182
How come it's better than BMT?
>>
>>101235444
Did people actually test that and verify it works that well, as in it actually understands the entire context when generating a response, and not only when it's asked to retrieve some info from it?
>>
>>101235403
i have Tess-v2.5.2-Qwen2-72B.i1-Q4_K_S.gguf
but it's too roleplay-brained, starts *roleplaying* unprompted and subsequently winking, sparkling and shivering
>>
>>101235459
For the life of me I can't find the "true context" github page. Anyone else have it?
>>
>>101235424
my drives are full...
>>
>>101235459
There it is https://github.com/hsiehjackson/RULER
>>
File: file.png (133 KB, 1852x470)
another llama.cpp moment?
>>
>>101235449
Well, I'm more curious if resulting MoEs are any good and if it would even be worth doing.
>>
>>101235459
>>101235482

>Despite achieving nearly perfect performance on the vanilla needle-in-a-haystack (NIAH) test, all models (except for Gemini-1.5-pro) exhibit large degradation on tasks in RULER as sequence length increases.

Gemini is the only model that does long context well. I'm wondering if the secret sauce is a correctly implemented sliding window
>>
>>101235441
I think intelligence IS that compression/retrieval process.
Reflecting on human thought and memory makes me feel uneasy, because the human mind also seems like a lossy compression of an otherwise unpersuasive universe.
>>
>>101235484
He has no idea what the fuck he's talking about. 16 bits of data is 16 bits of data. The precision difference between different encodings is negligible, especially in the ranges tensors use.
>>
>>101235461
What you described before sure sounded like you were roleplaying with your models.
>>
>>101235278
Yeah, the suspicion I fucked up is strong.
>>
I miss text completion models sometimes. I know you can do it with base models but it never worked that great for me. (inevitable looping/complete loss of coherence a few k tokens into context)
>>
>>101235556
The age of base models finetuned on plaintext will come soon enough. Instruct and RLHF are just after-effects of the ChatGPT craze, which is dying down already.
>>
>>101235497
I doubt it's worth it. Imagine two models, both with different tokenizers and prompt formats. And you still need a router, which is trained, apparently, on the fly.
The only thing merges of any type have shown me is that language models can sustain a moderate amount of damage and still be somewhat usable.
But someone will come out with a 'good' amalgamation, it may get some traction, shills will do what they do, then a new proper model drops and everyone switches to the new thing.
>>
>>101233901
I got MiquMaid-v3-70B.q4_0.gguf but I think 70B is too much for my 24GB card
what should I get instead?
>>
>>101235454
???? What even is this saying?
Taking a guess: it's better than BMT because it passes quite a few tests that other merge-slopped Mixtral variants do not (nala, `thoughts` and "speech", etc.).
I'm sure in at least 80% of "the model I use is awesome because X" posts, the model is 'good' because it's being used for one specific use case or specific fetish.
Mixtral limaRP zloss does juuuuuust about everything.
>>
>>101235584
Another 24 gb card
>t. 40GB STILL isn't enough for 70b
>>
>>101235603
BWEH
I'll downgrade instead
>>
>>101235499
Actually I think it's the opposite. In order to throw competitors off, Google implemented exactly something that's a dead end.
>>
>>101235568
I hope that's true. It seems to me that the chat format was invented in order to try to make AI more appealing to normies. Now that it's becoming clear that normies are aggressively uninterested in AI, I hope we'll go back to smart autocomplete, which always made more sense.
>>
Actually, wouldn't a MoE with two models that have the exact opposite ideas of slop result in a more creative output? Sure you'd have to take care of the instruct formatting differences, but even if you used a router that just randomly selects which expert to use, it'd result in a more varied output, right?
>>
>>101234961
are these thread highlights ai generated?
>>
Give me the tldr on using multiple but different GPUs for local LLMs. Assume 3 different cards in the 3000 series.
>>
Pony and SDXL are two different models. Stop mixing and merging them and posting them on civitai as SDXL. Pony isn't compatible with SDXL.
>>
>>101235780
you plug them in and they work if you're lucky
>>
>>101235780
7900xtx and 7800xt work fine together.
>t. amdrone
I can only imagine it's the same on Nvidia's side.
>>
>>101235754
No different than just adding noise to the output of a single model. I don't like analogies, but think of two people, each trying to write a story or whatever, where each writes one word at a time. It could be entertaining, but it's easy for one model to greatly skew the output of the other
>The [This looks fine]
>shivers [rm -rf frankenmoe]
Not that I care for a few tropes, but I'm sure you see the problem. There may be a very specific set of settings that would end up with a usable model, but we have people arguing that _S quants are better than _M ones and that other retard shilling Q8_0L quants. Those same people will be doing the frankenmoes.
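The coin-flip router is trivial to try without any merge tooling, btw. A toy sketch, assuming two HF models that happen to share a tokenizer (the repo names are made up):

[code]
import random
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# hypothetical repo names; any two models sharing a tokenizer work
tok = AutoTokenizer.from_pretrained("org/model-a")
a = AutoModelForCausalLM.from_pretrained("org/model-a")
b = AutoModelForCausalLM.from_pretrained("org/model-b")

ids = tok("Once upon a time", return_tensors="pt").input_ids
with torch.no_grad():
    for _ in range(64):
        model = random.choice([a, b])              # the "router" is a coin flip
        logits = model(ids).logits[:, -1, :]       # next-token logits only
        next_id = logits.argmax(-1, keepdim=True)  # greedy, for simplicity
        ids = torch.cat([ids, next_id], dim=-1)

print(tok.decode(ids[0]))
[/code]

Each model conditions on every token the other emitted, which is exactly the skew problem: once one commits to a trope, neither can steer back.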
>>
>>101235057
I'm using Dracone's 2.5bpw EXL2 and not seeing any of those issues with ooba's simple1 preset, for what it's worth
>>
>>101235723
>Now that it's becoming clear that normies are aggressively uninterested in AI
This is my interpretation of why the 3 big proprietary model companies (OpenAI, Anthropic, Google) seem to have pivoted toward programming ability with all their models over the last 6 months too
They've realized that normal people just aren't that interested in smart chatbots, at least in text form

gpt4o with its native voice chat seems to be a hail mary to get non-programmers interested, but they're clearly having trouble shipping it
>>
>>101235513
just chatting, not roleplaying
>>
>>101235361
>thinking architecture is optional
>>
>>101235242
For serious tasks, 100% agreed. For roleplay, where the more inspirational sources the model has to draw from the better, merges can be powerful. They also have the added benefit of moving further away from the underlying corpo model due to the stacking effect.
>>
I just got done adding S and M quants to my test that was originally about seeing whether L was of any value. The result: out of the 10 trivia questions, M was more accurate on 6 and S was more accurate on 4. So at least in this sample, it does seem that M is better than S, as you'd normally expect.
>>
>>101235590
>Mixtral limaRP zloss does juuuuuust about everything
That's what I wanted to know, thanks.
>>
Trying mradermacher quants (IQ4_XS) of the latest Undislop merge

https://huggingface.co/mradermacher/MG-FinalMix-72B-i1-GGUF

It's actually pretty good
>>
>>101235584
>MiquMaid-v3-70B.q4_0.gguf
what is this ancient meme

stheno
mixtral limarp zloss

and wait for gemma 2 27b to be fixed, should be the next vramlet goat model.
>>
>>101235969
Which model were you testing with?
>>
>>101235962
The shivers are powerful, and that seems to be one of the most common complaints. Chances are that any model (with or without instruct finetune) will correctly complete "shivers down" with what we all expect at this point. I'm still skeptical about the result being better than a single good model. I expect the worst model in the merge to drag all the other models down to its level. It needs just one token.
>>
>>101236027
Just vanilla 8B Instruct
>>
>>101235981
>what is this ancient meme
I'm not sure about its meme status, but I got it from here a while ago, then got busy with IRL stuff and never finished setting it up
thanks for the recs, I'll get one of those
>>
File: rin-a-cute.png (2.79 MB, 1328x1992)
>>101235780
>3000 series
Great support, and you will get Flash Attention 2 since you're on Ampere.
The new quantized KV cache works well too.
Just make sure you have enough power for all of the cards, and power-limit them as needed.
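Rough sketch of the flags in practice; names as of recent llama.cpp builds, check --help if they've moved:

[code]
# spread layers across three cards; the ratio roughly matches each card's VRAM
./llama-server -m model.gguf -ngl 99 --tensor-split 12,12,24 \
    -fa -ctk q8_0 -ctv q8_0   # flash attention + quantized KV cache

# power-limit a card to 250 W (repeat per GPU index)
sudo nvidia-smi -i 0 -pl 250
[/code]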
>>
>>101236032
There are finetuners who filter their datasets for the worst-offender slop expressions. The majority don't though, so yeah...
>>
Will you apologize to Pichai if Gemma 2 turns out to be better than all other local models including 100+Bees?
>>
>>101236101
no
>>
>>101236101
best i can do is go outside and shit on the street in solidarity
>>
>>101236101
yes, but I won't have to because it won't, despite all the hopium
it'll turn out to be what it looks like: the best model in the 30B size range, but not competitive with much larger ones
>>
>>101236101
I'm waiting for Gemma-2-104B before I make this judgement.
>>
>>101236131
it kinda already competed with L3 and qwen even in its broken state
>>
>>101236142
I strongly disagree that this is the case and idk what people are talking about when they say that
Even on lmsys it doesn't feel smarter than Qwen72 to me
>>
It's so weird how heavily mixtral limarp is being shilled. It's not even better than BagelMIsteryTour and it's certainly not the best.
>>
File: ComfyUI_07664_.jpg (3.21 MB, 1664x2432)
>>
https://x.com/tsarnick/status/1807883460847325235

How it all came to be
>>
Wtf? 2 weeks ago he swore _M is fucked and that q2ks > q4km in factual details (whatever the model). Today he comes in like "M is slightly better, as you'd expect" (assuming a different model)...
>>
>>101236297
Cool, I'll give this a listen in full
>>
>>101236239
I'll take this bait.
>It's so weird
Something that works is weird?
>It's not even better than BagelMIsteryTour
BMT fails multiple tests you CAN DO YOURSELF.
>it's certainly not the best.
no shit? I think we'd all be running some 200B corpo model if we could, dumbass.
>>
>>101236308
........wait. a slightly less compressed version of a model is slightly better than a slightly more compressed version of the same model? that's insane!
>>
>>101236308
He? Who's he?
>>
>>101236239
I don't bother with mixtral anymore, but I thought it was better than bagelmist. To be honest, I think the instruct-limarp hype is deserved; nothing else struck the balance between lewd and smart like it did
>>
>>101236338
>is he in the room with us now?
yes the fuckin weirdo is in this thread
>>
File: ComfyUI_07665_.jpg (3.32 MB, 1664x2432)
>>
>>101236297
Stop being parasites in the West.
>>
>>101236330
>Something that works is weird?
It's weird because the model is old as shit. I used it months ago and soon tossed it in the bin.
>BMT fails multiple tests you CAN DO YOURSELF.
So does limarp. I did side-by-side comparisons of multiple generations for multiple long RPs, and limarp was usually stupider than BMT and maid-yuzu.
>I think we all would be running some 200B corpo model if we could
Euryale mogs it at 70b.
>>
>>101236308
Not the same guy anonie.
>>
>>101236380
>limarp was usually stupider than BMT and maid-yuzu.
You have no fucking idea what the fuck you're even talking about, holy shit.
This is possibly the lowest IQ post yet.
>>
think there's any credence to the theory that a lot of AI stuff (esp. voice cloning etc) is being held back due to the US elections and once they're over we'll see a flood of high quality releases?
>>
>>101236426
t. probably a snoot curve enthusiast
>>
>>101236375
You're talking to a fresh AI billionaire
>>
>>101236461
Not to that extent, but yes in some respect, because they don't want intense scrutiny from the US gov, so they lay low a bit.

Prior to AI, the boogeyman was social media, before that rap music, before that rock music, before that disco, and so on.
>>
>>101236461
If anything the opposite, (((they))) would do just about anything to be able to put anyone away for CP, and AI just isnt *quite* there yet.
A trump and epstien loras with (loli:1.2) in the prompt isnt going to produce the best incriminating evidence.
>>
>>101236461
No, the field will never recover from this. By the time the elections are over, there'll be 20 new laws and 500 new billion-dollar lawsuits restricting the training of new models, while the quality of publicly available data will have plummeted even further due to the tide of AI-generated content diluting it.
It is, in fact, already over.
>>
>>101236461
Yes, absolutely! We're going to be so so very back after they're over, it's going to be tremendous!
>>
>>101236239
I definitely liked Bagel better. I WILL say that limarp zloss is a clear second best, it was the original "this fixes everything about mixtral" for me, I just like Bagel's style a lot more.
>>
huh so llama.cpp flash attn just fucking works on ROCm
I had assumed for so long that it wouldn't because AMD's official flash attn fork never worked
>>
>>101236599
Wait, if it works even on AMD, does that mean it also works on proper older cards like Turing and Volta?
>>
>>101236461
That makes sense for video gen and to a lesser extent image gen
I don't think it makes any sense for language models though

Current language models are already more than good enough to generate fakeposts and muh disinformation, holding back a better one until after the election would achieve nothing on that front
>>
>>101236599
Where are you seeing that it's working? They're still talking about how to implement it, as far as I can see.

https://github.com/ggerganov/llama.cpp/pull/7011
>>
>>101235723
>>101235928
>Cut all fun out for (((safety)))
>normies are aggressively uninterested in AI
All they had to do to get normies was keep porn gen in. That's all. They'd have to be braindead to think a normie would rather talk to a hallucinating assistant than do a google search. Those speech "assistants" have existed for a while, but almost nobody ever used them. Do they not learn from mistakes? Normies would, however, gladly pay for porn; look at all those twitch thots who aren't even fully naked, and their simps.
>>
File: 4d7.png (825 KB, 700x700)
This probably belongs in /sqt/, but whatever.
Audio. I've got no idea how to build whatever repositories I need. I did look at HuggingFace, but found nothing particularly useful for actually getting a damn engine for audio analysis/voices/whatever.
My specific use case is to analyze an audio sample (arbitrary file type), examine it for an oscillating pattern (beat? Not a musician), and then extract the timestamps of when the pattern occurs.
Ideally there would be some fault tolerance, to account for partially muddied audio at random intervals.
Was also thinking of building my own digital assistant (one that's actually useful), so voice analysis/synthesis would be helpful.
>tl;dr - How does one locally audio?
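For the beat-extraction half, librosa's beat tracker looks like it does exactly this out of the box, if I'm reading the docs right:

[code]
import librosa

y, sr = librosa.load("sample.mp3")  # decodes most common formats
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
beat_times = librosa.frames_to_time(beat_frames, sr=sr)  # seconds

print(f"tempo: {float(tempo):.1f} BPM")
print("beat timestamps:", beat_times)
[/code]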
>>
File: 1516584909506.gif (610 KB, 480x270)
>https://github.com/ggerganov/llama.cpp/pull/7931
>bitnet supported but no large models using it

>https://github.com/ggerganov/llama.cpp/issues/7006
>Jamba still unsupported

>https://github.com/ggerganov/llama.cpp/issues/7995
>Chameleon still unsupported
>>
>>101237101
I'm big dumb, need to remember to rtfm.
https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/blob/main/docs/en/README.en.md
>>
>>101237137
If Meta/Google/Chinks/Cohere releases a good model with new architecture, there will be support for it in 3 days. If there are no good models for %technologyname%, then there is no motivation to add support for it.
>>
>>101235021
Used Dolphin 2.2 Yi 34b Q4 the other day and it worked pretty well as a vramlet replacement for Midnight Miqu 70b IQ2; it can fit a bigger context and generates faster too. Maybe an overly big vocabulary and a few more logic flaws compared to Miqu, but otherwise a worthy replacement.
>>
Do you guys use these LLMs for anything? What's the appeal?

I have the shitty 2.5GB phi3 and really only use it to explain things I don't understand in textbooks or research papers. Is there a better model to use for this?
>>
>>101237101
>analyze oscillating pattern
Why?

Just get Whisper AI running (transcribes your speech into text)
Analyze the text transcription for commands
Execute commands?
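Minimal sketch with the openai-whisper package, pick whatever model size fits your VRAM:

[code]
import whisper  # pip install openai-whisper

model = whisper.load_model("base")   # tiny/base/small/medium/large
result = model.transcribe("command.wav")
print(result["text"])                # feed this to your command parser
[/code]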
>>
>>101237452
Better depends on how much VRAM your GPU has. If you've got a GPU with <4GB VRAM, a 2.5GB model makes sense. If you've got ~8GB VRAM, a larger ~7GB model makes sense. Optimally you want to pair the model size with your GPU's VRAM size and then get the best out of it.

>explain things I don't understand
That's what I use it for too. I also use it for writing programs, analyzing conceptual frameworks, coming up with a plan, etc. It's useful.
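If you want more than the rule of thumb, the budget math is roughly: GGUF file size + KV cache + a bit of compute-buffer overhead. A sketch with Llama-3-8B-ish shape numbers for illustration; swap in your model's config values:

[code]
def kv_cache_bytes(n_layers, n_ctx, n_kv_heads, head_dim, bytes_per_elem=2):
    # K and V each store n_ctx vectors of n_kv_heads * head_dim per layer
    return 2 * n_layers * n_ctx * n_kv_heads * head_dim * bytes_per_elem

# e.g. 32 layers, 8k context, 8 KV heads, head dim 128, fp16 cache
print(kv_cache_bytes(32, 8192, 8, 128) / 2**30, "GiB")  # -> 1.0 GiB
[/code]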
>>
>>101237452
When I have some big philosophical or sexual/taboo idea and want to discuss it with somebody more intelligent than reddit; as a bonus, it's real time.
Or just general emotional support or erp with my favourite character.
>>
So why hasn't the secret sauce of the voice cloning models been released? I doubt it's just a matter of scaling the GPU to 100+GB of VRAM and running clusters for speed. The Tortoise one sounds robotic and takes an incredibly long time to do a voice clone. xtts/styletts2 does it in a second. RVC does it in a second or so. But these aren't perfect and aren't close to closed-source models with near-perfect voice cloning. So to go back to the question, it's just a secret-sauce model, right, and not just scaling GPUs to the nth degree?
>>
>>101237452
>What's the appeal?
erp
>>
>>101237506
it's just training, no one wants to train one that's good
Cartesia seems nice and is based on Mamba, so it's probably possible to train one locally that sounds decent enough but no one wants to do that
the last guy who made TTS advancements went to OpenAI
>>
>>101236426
This, but unironically.
>>
why is there like 2 guys incessantly fixated on limarp
>>
>>101237597
>it's just training, no one wants to train one that's good
There are millions of people in the AI TTS field and they have tons of GPUs. So it's not a GPU issue for certain, and it's certainly not a dataset issue, since we have tons of high-quality voices on the internet.

I looked around a bit with the mamba thing

>https://2084.substack.com/p/2084-marcrandbot-speech-synthesis
This guy seems to have done 42K training steps to get a legible-sounding TTS.

>https://x.com/krandiash/status/1795896007752036782
Further, the Cartesia guy seems to have done ~140K training steps to get decent output.

>https://github.com/ighodgao/mamba-speech-synthesis
There seems to be a training tool here. So is Mamba training the state of the art? Is it just a matter of people not having the right training method, and that's why we've had shit open-source TTS so far?
>>
>>101236599
>>101236772
nta, but there's a tile-based flashattention that isn't implemented for AMD in master, so AMD falls back to vector flashattention kernels when tiles would otherwise be more appropriate.

>>101236599
>because amd's official flash attn fork never worked
CDNA cards have matrix cores, which are like Nvidia's tensor cores. You have to load & use them a certain way, but when you do, you can get high matrix multiplication performance.
RDNA cards have nothing. RDNA3 has what looks like a macro to help the programmer arrange his data in the way the GPU likes best, but it's not really separate cores accelerating things.

When AMD implements shinies they do it for CDNA first and often in a way that's CDNA-specific.
>>
>>101237697
>So its not a GPU issue for certain and its certainly not a data set issue since we have ton of high quality voices that are on the internet.
just because there are millions of people in the field doesn't mean that every single one of them is working on the same problems or cares much about the state of open-source TTS
if any of those people are researchers, they most definitely don't give a shit that the best open-source models can't match closed source; as long as they can still do their research, they're solid
you, for instance, can rent some H100s right now and train a model - so why aren't you? Probably because you're waiting for a big company to spoonfeed you with a model they trained, but if they train such a model, they're more likely to simply sell it to you than open-source it
hence why there are so many TTS companies in the first place
>>
Who cares about TTS lol. 4o shows that native multimodal is the future. People still working on pure TTS are living in the last century and need to catch up.
>>
File: elevenchads.png (102 KB, 1747x894)
>>101237506
secret sauce has always been to be an eleven chad.
>>
>>101237754
>you're waiting for a big company to spoonfeed you with a model they trained
Actually I'm just waiting to be spoonfed PERIODx2. I actually don't know the scope of the training, nor the cost, nor the time, nor do I have disposable income. If I did have disposable income, I'd just use the corporate models instead.

In my mind, the training could be done by one person with disposable income, or even a small group of people, with a couple hundred to a few thousand to spend. Maybe I'm wrong and it requires billions of dollars in GPU training. However, I think it's possible to train a decent TTS model for ~$100-$1K, so that's why I said there are plenty of people with those GPUs and the disposable income. And many already have the GPUs and don't need to spend more, since they're training text models on the side.
>>
File: 1715459703470888.png (347 KB, 604x612)
>>101237690
It's the 2 anons from Peru
>>
>https://github.com/LostRuins/koboldcpp/releases
2 questions about kobold:
- what model should I use that can see images
- how do I get whisper to work?
>>
File: raphilaughing.gif (161 KB, 400x436)
>https://github.com/ggerganov/llama.cpp/issues/8240
>Yes @0wwafa I'm aware and I've been making quants with the f16 embeddings. We've talked about this many times, and I still have massive doubts that it's improving anything.
Kek, he's had enough of his shit.
>>
File: 1716945391723172.png (39 KB, 931x291)
For the second time in a short while, this model https://huggingface.co/TheBloke/dolphin-2.2-yi-34b-200k-GGUF on Koboldcpp is bringing up an "evolved thought". Is this model leaking its inner thought process or what? Also, if I use the ChatML template like they tell you to, it produces either gibberish or nonsense word salad, and I found someone else having that problem. The non-200k model works fine with the correct template. I switched to Vicuna with the space removed after the end sequence, similar to Nous Capybara 34b's template, which is also a Yi derivative. The Vicuna template produces mostly good results, but then suddenly this.
>>
>>101237777
>>101237777
>need to catch up.
so how do I run a gguf like cambrian-34b or
>https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5
in koboldcpp and tavern?
>>
>>101238309
>its inner thought process or what
that's impossible, since the same process runs for every token.
It's far more likely an artifact of the weight given to something else, likely because it was trained on synthetic CoT data tagged with evolved_thought or similar
>>
>>101238309
>TheBloke
oh no
>>
>>101238444
Is there some other place to get ggufs, or is there something wrong with them? Or any other format that can use both VRAM and RAM?
>>
Are there any models that don't have purple prose yet
>>
>>101238451
>gguf
I create all my own ggufs from the original release source.
It's pretty easy to download the safetensors and json files and do it yourself.
You get maximum control and some extra margin of safety vs downloading from some unknown person on HF.
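The whole thing is two commands. Script and binary names move around between llama.cpp versions, so check your checkout:

[code]
# HF safetensors -> f16 GGUF (script name varies by version)
python convert-hf-to-gguf.py ./some-model --outfile some-model-f16.gguf

# f16 GGUF -> quantized GGUF
./llama-quantize some-model-f16.gguf some-model-Q8_0.gguf Q8_0
[/code]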
>>
>>101236308
Music theory question testing anon here.
I don't know if that's S-Anon, but two things.
1 - With trivia questions it would make sense that the bigger quants do better, because trivia doesn't require the model to remember exceptions.
2 - On an 8B model there might not be much brain left to spare, also favoring the bigger quants.

The K_S phenomenon I've noticed only on 70B-class models. The smallest model to pass my music theory question is about 40GB. S-Anon was testing the 8x22 Wizard model.

So this is a good anecdote, but it's a third line of testing running parallel to what I and S-Anon have observed.
>>
>>101236599
>>101236772
To add to this, I've only ever bothered to use the koboldcpp-rocm fork myself as opposed to llama.cpp standalone, but for what it's worth, it fits a much larger context on my gfx1100 cards before going OOM when the Flash Attention setting is enabled. I suppose that's the vector-based Flash Attention >>101237730 mentions. Whatever it is, it's a nice improvement on consumer AMD cards.
>>
>>101238451
literally anyone else, he got caught slipping malware in once he got popular
>>
>>101237137
>Chameleon
Let me know when Meta decides to ACTUALLY release it instead of the crippled piece of shit they put out.
>>
>>101238571
source?
>>
>>101238571
>Malware
Fuck, really? What did it do?
>>
>>101238503
Just takes up a fuckton of space.
>>
>>101238574
The investors would never allow it, especially as model capabilities improve.
>>
>>101238583
>>101238588
Anon is fucking with you. TheBloke just up and vanished one day. Probably got tired of the constant demand.
>>
>>101238599
Just download more space
>>
>>101238571
Yeah I can't find any information about this.
>>
https://huggingface.co/DavidAU/Psyonic-Cetacean-MythoMax-Prose-Crazy-Ultra-Quality-29B-GGUF
>frankenmerge of llama 2 13b models, adding up to 29b total parameters
>posted 3 days ago
>9k downloads
Who the fuck is using this? The model card is pure, concentrated, unadulterated schizo from start to finish. This can't actually be any good, right?
>>
>>101238602
Or he realized that reuploading GGUFs in perpetuity because of the constant breaking changes wasn't worth the hassle.
Occam's Razor.
>>
Anyone quanted and tested Gemma 27B on the latest Llama.cpp commit? I'm running it currently at Q8_0 and it seems to be coherent, and up to 8k works. But I don't know if it's the same quality of responses as on Google's site, as I don't want to sign in to that shit. I also don't want to give them, or anyone, my prompts.
>>
>https://huggingface.co/nyu-visionx/cambrian-34b
>https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5
GUYS HOW THE FUCK DO I RUN VISION MODELS!?
I JUST WANT TO GET MIKU TO COMMENT ON MY GAMING SKILLS IN REAL TIME
>>
>>101238636
You don't. They're still memes and not up to the quality of GPT-4, which already wasn't that good to begin with.
>>
>>101238622
Frankenmerges are never good. Not even by accident. People are just retarded and desperate.
>>
File: file.png (115 KB, 1334x765)
Is it finally over?
>>
File: una.png (116 KB, 274x253)
>Herculean effort
>Grinds against your X
>body and soul
>>
>>101238643
Oh cool. Didn't see that post. I >>101238632 did exactly that.

Unfortunately this also means that the model at least at Q8_0 fails the amputee test when doing greedy (deterministic) sampling. However, it passes one of my trivia tests that only larger models can do. And then it fails on a different trivia question. Still in the process of testing, might report back later. But I'm also tired so maybe I will sleep first.
>>
File: slow asf.jpg (453 KB, 1536x960)
>>101238636
>>101238639
PLEASE HELP I DONT WANT TO CODE ALL THIS SHIT MYSELF!
IT DOESNT EVEN COMPILE WITH CUDA WTFFFF
>>
>Have a vague idea for a story
>ask AI to generate it
>not like that! *edits prompt*
>AI generates something else I don't want, though technically within the instruction
>*further restrict prompt*
>repeat x50
>???
>you have now written an entire detailed story all by yourself.
>>
>>101238703
Sorry anon, you're just too early. You gotta wait a bit more for things to get good and the community support to pick up.
>>
>>101238719
Same with image generation.
People moan about AI slop, yeah, if people just throw out their zero-shots.
But to do good AI images you still need to be an artist, and on top of that have the patience and the art knowledge to get the AI to deliver on its potential.
>>
>>101238719
>prompt:Generate a story where a man washes ashore on an island with a shoggoth who he then fucks
>ai generates a story where you wash ashore in a shoggoth's cave/tent/whatever, it doesn't matter
>shoggoth begins her ministrations while rivulets of sweat drip down your brow and you blush the color of a tomato
what the fuck were YOU trying to do?
You DO know that SD can't generate dreams for you, why would an LLM be able to do any of that?
The only interesting thing about transformers is that you can speak to them at all. You never really expected "AI" to be able to do anything by itself, did you?
>>
>>101236475
why do it here? you can dickride him on twitter with the same success rate.
>>
File: buyafuckingad.jpg (15 KB, 833x104)
>>101238658
>lowers her voice to a conspiratorial whisper
>>
>>101234947
Any other model have the same feel as mythologic-l2-13b?
>>
>>101238795
>leans in
>leans back
>>
>>101238795
>she whispers, her voice barely above a whisper
>>
>>101238824
>>101238809
>>101238795
English is such a great language, right? so physical, hmhmm
>>
>>101238719
LLMs are truly masters at triggering that "NOT LIKE THAT" motivational reflex inside our monkey brains.
>>
>>101238643
now that the dust has settled
was gemma 2 a flop?
>>
>>101236599
It does work but the kernels intended for large batch sizes for whatever reason perform quite poorly.
So prompt processing performance will be worse because AMD falls back to the kernels optimized for small batch sizes instead.

>>101236605
Turing and Volta should definitely work.

>>101236772
This is only about performance, not correctness.

>>101237730
It is implemented, just not used because of bad performance.
>>
>>101238862
The AGI inside tests our patience, resolve, and determination sometimes.

>RP mode
>Fun with girlfriendo
>A group of other characters introduced, forgotten about quickly
>Girlfriendo gets weird
>Girlfriendo literally just fucking leaves
... rolling with it
>Meets random guy
>They bang
... rolling with it
>They breed
... >>>(And she was never heard from again. The end for her. Meanwhile, back at my apartment, what am I doing?)
>Remember those other characters? The ones that are chicks are crashing at my place.
... rolling with it

So don't give up on LLM being creative. It comes around. (Sometimes.) And if you don't want to yield partial control of the development to the computer, why aren't you just solo writing your novel?
>>
>>101238987
Curious what your context was to get all those winding paths
>>
>>101238960
>dust has settled
It hasn't.
>The HF Transformers support is incomplete at best, perhaps even straight up broken for longer than 4k context length.
>>
>>101239002
I usually roll 8k because I'm a worthless vramlet. I forget which model but probably CR+.
>>
How will we know when we've created AGI if we can't even define it?
>>
>>101239024
It will let us know when we're ready to find out.
>>
>>101239024
the real AGI was the friends we made along the way
>>
>>101239030
How will AGI know when we've created it if a real definition won't be in its training data?
>>
>>101239049
More are probably losing friends than gaining as a result of this though
>>
>>101239129
That is exactly the point.
>>
Opus 3.5 will be so good that it will wipe out any hope of competition from local models.
>>
>>101239208
Competition doesn't matter.
Local is about the benefits of local and avoiding the drawbacks of remote.
>>
>>101239208
the last time local models were ever on par with frontier models was when gpt2-xl released
local is not in competition with cloud to begin with, and it will never die
>>
>>101239255
then why is saltman so afraid of us?
>>
Trying out my own Q8 gemma 27b quant made with the latest commits.
>Now be still and taste oblivion. You begged for this, remember?
the double space issue is still there. And overall, it's nowhere near the level of a 70b like qwen in terms of smarts. It actually feels similar to L3 8b now. Maybe more fixes are required? I'm using the correct template. Or did I expect too much from google?
>>
>>101239269
ah shit, 4chan merged the spaces
Now be still and taste oblivion.   You begged for this, remember?
>>
>>101239269
anon, gemma is turbotrash, give up on it
>>
>>101239268
that was post-gpt3; it wouldn't compare to using even the finetuned-into-retardation version of gpt3 that AI Dungeon had at the paid tier
>>
>>101239277
Just remembered that it was indeed released after GPT-3.
>>
>>101239269
It's still not fixed in llama.cpp, if that's what you used, so I would just wait.
>>
>>101239350
Stop coping. It's just not that good.
>>
>>101239360
If you bothered using it on mistral.rs or AI Studio you would know that's not true. And look at the dozen commits
>>
>>101239269
I'm still testing it, but I have never seen a "double space" issue, or anything else obviously broken. It writes well. I can see myself using it over the 70Bs. Just needs more context.
>>
>>101239275
That's a triple space.

Double space after period is correct and a pox upon HTML for not knowing how typesetting works.
>>
>>101239372
That's because he's using the broken tokenizer in llama.cpp, which makes the model retarded.
>>
>>101239380
the tokenizer was already fixed, this was with the fix.
>>
>>101239385
It's still broken; he's "fixed" it like 3 times now.
>>
What's the best model for non-erotic RP that's suitable for 24GB VRAM? Regular Mixtral?
>>
>>101239396
>non-erotic RP
I don't understand
>>
File: file.png (102 KB, 1906x648)
>>101239394
>>101239365
what about this then?
>>
>>101239396
Gemma 2 27B
>>
>>101239413
Sometimes after I coom I want an RP that doesn't result in the character reaching for my crotch in the first reply
>>
>>101239414
https://github.com/ggerganov/llama.cpp/pull/8248
>>
>>101239426
>gemma v1
Are you... retarded?
>>
>>101239433
Read: tokenizer. Also pretty sure he never did some of the fixes that gemma.cpp did. If you use it on gemma.cpp / mistral.rs it's 100x smarter and doesn't have the spacing issue that llama.cpp does. Clearly something is broken.
>>
>>101239443
>gaslighting
There's no spacing issue. Keep having fun with the FUD, petr*.
>>
>>101239472
You're either retarded or a troll.
>>
eta on completely fixed gemma 27b?
>>
>>101239485
on llama.cpp, prob another week at this rate.
>>
>>101239485
What's broken with current gemma?
>>
>>101239485
at least two years
>>
>>101239485
8 hours ago, retard.
>>
>>101239485
how long did it take them to completely fix llama 3?
>>
>>101239520
The closest comparison would be Mistral 0.1 with its sliding window.
>>
>>101239495
>>101239496
>>101239502
>>101239512
>>101239520
wow guys, you're a bunch of retards.
>>
>>101239534
no u
>>
Yeah, I think Google won. Gemma is amazing.
>>
MegActor: Harness the Power of Raw Video for Vivid Portrait Animation
https://arxiv.org/abs/2405.20851
>Although raw driving videos contain richer information on facial expressions than intermediate representations such as landmarks in the field of portrait animation, they are seldom the subject of research. This is due to two challenges inherent in portrait animation driven with raw videos: 1) significant identity leakage; 2) irrelevant background and facial details such as wrinkles degrade performance. To harness the power of raw videos for vivid portrait animation, we proposed a pioneering conditional diffusion model named MegActor. First, we introduced a synthetic data generation framework for creating videos with consistent motion and expressions but inconsistent IDs to mitigate the issue of ID leakage. Second, we segmented the foreground and background of the reference image and employed CLIP to encode the background details. This encoded information is then integrated into the network via a text embedding module, thereby ensuring the stability of the background. Finally, we further style transfer the appearance of the reference image to the driving video to eliminate the influence of facial details in the driving videos. Our final model was trained solely on public datasets, achieving results comparable to commercial models. We hope this will help the open-source community.
https://github.com/megvii-research/megactor
https://f4c5-58-240-80-18.ngrok-free.app/
full enhanced release end of July. they've released their training code/dataset, so it seems legit. unlike the anitalker team, who just disappeared.
>>
google? more like poogle! hue hue hue
>>
>>101237452
I find wizard 8x22b good for explaining things, it's a bit slow but faster than waiting for a person to reply.
>>
>>101234947
Urgent PSA: that fraud Eric Hartford is serving his Dolphin models using literal /biz/ crypto scammers' GPUs as a backbone, aka he is logging every chat of yours and tying it to your unique identifiers. Same goes for the anon scammers
>>
>>101239764
Urgent PSA: you're an idiot.
>>
>>101235303
Lmao
>>
>Midnight-Miqu-70B-v1.5_exl2_5.0bpw
yup *sips monster*
>>
Anyone wanna share the JSON for their Magnum Opus prompts? Sys prompt and story string. Mine is pretty good but I'm looking for ideas.



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.