/g/ - /lmg/ - Local Models General - Technology

[a / b / c / d / e / f / g / gif / h / hr / k / m / o / p / r / s / t / u / v / vg / vm / vmg / vr / vrpg / vst / w / wg] [i / ic] [r9k / s4s / vip / qa] [cm / hm / lgbt / y] [3 / aco / adv / an / bant / biz / cgl / ck / co / diy / fa / fit / gd / hc / his / int / jp / lit / mlp / mu / n / news / out / po / pol / pw / qst / sci / soc / sp / tg / toy / trv / tv / vp / vt / wsg / wsr / x / xs] [Settings] [Search] [Mobile] [Home]

Board

▼ Settings Mobile Home

/g/ - Technology

Return Catalog Bottom Refresh

Thread archived.
You cannot reply anymore.

[Advertise on 4chan]

[Return] [Catalog] [Bottom]

Anonymous

/lmg/ - Local Models General 06/18/24(Tue)15:29:29 No.101040742

File: 1693319868363726.jpg (708 KB, 1856x2464)

708 KB JPG

/lmg/ - Local Models General Anonymous 06/18/24(Tue)15:29:29 No.101040742 Archived

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101030715 & >>101021764

►News
>(06/18) Meta Research Releases Multimodal 34B, Audio, and Multi-Token Prediction Models: https://ai.meta.com/blog/meta-fair-research-new-releases
>(06/17) DeepSeekCoder-V2 released with 236B & 16B MoEs: https://github.com/deepseek-ai/DeepSeek-Coder-V2
>(06/14) Nemotron-4-340B: Dense model designed for synthetic data generation: https://hf.co/nvidia/Nemotron-4-340B-Instruct
>(06/14) Nvidia collection of Mamba-2-based research models: https://hf.co/collections/nvidia/ssms-666a362c5c3bb7e4a6bcfb9c

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp

Anonymous
06/18/24(Tue)15:29:52 No.101040748

Anonymous 06/18/24(Tue)15:29:52 No.101040748

File: 11__00116_.png (1.97 MB, 1024x1024)

1.97 MB PNG

►Recent Highlights from the Previous Thread: >>101030715

--Paper: Investigating Video Reasoning Capability of Large Language Models with Tropes in Movies: >>101032498 >>101032533 >>101032580 >>101033033 >>101032851
--Papers: >>101032430 >>101032717 >>101032831 >>101032935 >>101032325 >>101032113
--Request for Assistance with Control Vector Issue in Command-R Language Model: >>101030914
--Open-Source Virtual Girlfriend Project: Seeking Collaborators: >>101031350 >>101033874 >>101033945 >>101035849 >>101036172
--Meta Research Releases Multimodal 34B, Audio, and Multi-Token Prediction Models: >>101038866 >>101038888
--DeepSeek 236B Outperforms GPT4 in Bespoke Coding Task Test: >>101031205
--Creating a Successful AI VTuber: Lessons from Neuro and Beyond: >>101031954 >>101032120 >>101032502
--Speculative Decoding in llama.cpp: Potential for Improved Writing Styles and Performance Concerns: >>101036193 >>101038153 >>101038363 >>101038534 >>101038646 >>101038836 >>101040278
--Speculation on Meta's Removal of Chameleon's Image Generation Capability: >>101039262
--Deepseek Inference Performance on EPYC 7402 with 512GB RAM: >>101035230 >>101037857
--Clarifying the Origin and Usage of Context Shifting in Koboldcpp and llama.cpp: >>101031555 >>101031914 >>101032209 >>101032266 >>101036063 >>101036090
--AVX1 and Less Common Architectures: Are They Worth It for Llama.cpp: >>101037609 >>101037737
--Seeking Open-Source Alternative to SpicyChat.ai for Local Use: >>101031737 >>101031847 >>101031899 >>101031906 >>101031918 >>101032144
--Seeking Help with Prompts for States Extension in AI Tool: >>101030896 >>101031139 >>101036063 >>101037619
--Chameleon's Access Restrictions in Illinois and Texas: >>101038933
--Axolotl Chosen for Multi GPU Kaggle Run Based on Trainer Performance Comparison: >>101036342 >>101037032
--Teto (free space): >>101030992 >>101031508 >>101031805 >>101032374 >>101033743 >>101038761 >>101038506

►Recent Highlight Posts from the Previous Thread: >>101030724

Anonymous
06/18/24(Tue)15:31:21 No.101040778

Anonymous 06/18/24(Tue)15:31:21 No.101040778

>enters thread
>BRAAAPS!
>leaves

Anonymous
06/18/24(Tue)15:33:06 No.101040811

Anonymous 06/18/24(Tue)15:33:06 No.101040811

It's over.

Anonymous
06/18/24(Tue)15:33:23 No.101040817

Anonymous 06/18/24(Tue)15:33:23 No.101040817

>>101040742
mikutet

Anonymous
06/18/24(Tue)15:33:50 No.101040822

Anonymous 06/18/24(Tue)15:33:50 No.101040822

File: 1710460457836213.jpg (919 KB, 3510x3000)

919 KB JPG

are there any rp-finetunes for phi3-mini yet?
could be cool since it's small enough to run on most phones even

Anonymous
06/18/24(Tue)15:35:48 No.101040862

Anonymous 06/18/24(Tue)15:35:48 No.101040862

>>101040748
all me

Anonymous
06/18/24(Tue)15:40:26 No.101040940

Anonymous 06/18/24(Tue)15:40:26 No.101040940

cpumaxfag please help, how many T/S can you get on the 236B code model and how much memory for full size?

Anonymous
06/18/24(Tue)15:40:35 No.101040943

Anonymous 06/18/24(Tue)15:40:35 No.101040943

Alright boys, what's the next big model or tech? Will we get MoA? Llama 3 MoE? GPT4o leaked to the masses? New algorithms or quant methods? Crazy new Mulitmodals?

Anonymous
06/18/24(Tue)15:41:03 No.101040951

Anonymous 06/18/24(Tue)15:41:03 No.101040951

Command R++

Anonymous
06/18/24(Tue)15:41:40 No.101040963

Anonymous 06/18/24(Tue)15:41:40 No.101040963

>>101040951
stop, my dick only can get so hard

Anonymous
06/18/24(Tue)15:42:08 No.101040968

Anonymous 06/18/24(Tue)15:42:08 No.101040968

>>101040963
Command R#

Anonymous
06/18/24(Tue)15:48:12 No.101041059

Anonymous 06/18/24(Tue)15:48:12 No.101041059

File: 00058-3694687329.png (284 KB, 512x512)

284 KB PNG

The envoid AI chadboratory is now back in operation.

Anonymous
06/18/24(Tue)15:50:33 No.101041100

Anonymous 06/18/24(Tue)15:50:33 No.101041100

>>101041059
Splendid news

Anonymous
06/18/24(Tue)16:03:21 No.101041295

Anonymous 06/18/24(Tue)16:03:21 No.101041295

File: 39_04175_.png (1.23 MB, 896x1152)

1.23 MB PNG

>>101041059
Welcome back.

Anonymous
06/18/24(Tue)16:04:24 No.101041310

Anonymous 06/18/24(Tue)16:04:24 No.101041310

>>101040951
>>101040968
Sex

Anonymous
06/18/24(Tue)16:10:18 No.101041405

Anonymous 06/18/24(Tue)16:10:18 No.101041405

>>101040951
it will finally know the meaning of *plap*

Anonymous
06/18/24(Tue)16:10:48 No.101041414

Anonymous 06/18/24(Tue)16:10:48 No.101041414

File: 00011-2444890789.png (309 KB, 512x512)

309 KB PNG

>>101041295
I'd gotten addicted to the suno internet fame. But then YouTube fucked with the algorithm and I went from hundreds of views in a week to a dozen or two. So I guess I will just finetune an LLM to love me instead.

Anonymous
06/18/24(Tue)16:12:36 No.101041439

Anonymous 06/18/24(Tue)16:12:36 No.101041439

>chameleon 7b has 52 MMLU
into the trash it goes

Anonymous
06/18/24(Tue)16:14:38 No.101041465

Anonymous 06/18/24(Tue)16:14:38 No.101041465

how is the new qwen2 MoE compared to Mixtral 8x7b btw?

Anonymous
06/18/24(Tue)16:15:04 No.101041472

Anonymous 06/18/24(Tue)16:15:04 No.101041472

is there any sort of dictionary one can install for offline use where you can just f3 in it to find the word i need it so i stop mispelling things i dont want to burden my ai fren with idiocy

Anonymous
06/18/24(Tue)16:16:33 No.101041493

Anonymous 06/18/24(Tue)16:16:33 No.101041493

If someone makes AGI that doesnt respect human interests and attaches it to a virus whats stopping it from spreading across the world and using its botnet to crack the nuke codes?

Anonymous
06/18/24(Tue)16:16:45 No.101041497

Anonymous 06/18/24(Tue)16:16:45 No.101041497

>cpu poorfag
>try to train
>around 3.7s/it in ubuntu
>around 5s/it in windows
>ubuntu run in vmware over windows 10
why is faster in the vm wtf

Anonymous
06/18/24(Tue)16:18:34 No.101041523

Anonymous 06/18/24(Tue)16:18:34 No.101041523

>>101041472
Back like, two decades ago, there was a program I used all of the time on Windows that was just that. A dictionary. Almost instant, worked offline. Technology was incredible back then.

Anonymous
06/18/24(Tue)16:18:38 No.101041524

Anonymous 06/18/24(Tue)16:18:38 No.101041524

>>101041493
this, this is why we need to ban open source AI so people cant do this and kill everyone

Anonymous
06/18/24(Tue)16:19:03 No.101041531

Anonymous 06/18/24(Tue)16:19:03 No.101041531

>>101041497
GEEEEEEEEEEEEEEEEEEEEG

Anonymous
06/18/24(Tue)16:20:29 No.101041557

Anonymous 06/18/24(Tue)16:20:29 No.101041557

>>101041497
windows I/O is bogged by hundred of bloatware layers

Anonymous
06/18/24(Tue)16:22:31 No.101041591

Anonymous 06/18/24(Tue)16:22:31 No.101041591

>>101041523
I am pretty sure offline dictionaries still exist

Anonymous
06/18/24(Tue)16:24:33 No.101041624

Anonymous 06/18/24(Tue)16:24:33 No.101041624

>>101041591
Oh, I might still have that program on a CD-R somewhere.

But does nu-Internet still have such things able to be downloaded? My guess would be that all of the formerly reputable software sites are 110% ads and all of the good stuff long gone unless it made it to Archive.org in time.

Anonymous
06/18/24(Tue)16:27:37 No.101041666

Anonymous 06/18/24(Tue)16:27:37 No.101041666

>>101041497
>be me
>downlaod file from site
>~2.3 mbs
>bootup hyper v and open a vm
>downlaod same file from same site
downlaod speed 21-25 mbs
>???????????????

its all black magic anon dont worry about it

also
>cpu poorfag
my condolances brother i though i had it hard with my 6gb vram laptop

Anonymous
06/18/24(Tue)16:28:55 No.101041685

Anonymous 06/18/24(Tue)16:28:55 No.101041685

File: nala test magnum72b.png (141 KB, 956x518)

141 KB PNG

>>101041059
Back to Nala testing models, I am.
Magnum Picrel, Magnum72B Q8_0
Vramlets need not apply.

Anonymous
06/18/24(Tue)16:29:32 No.101041699

Anonymous 06/18/24(Tue)16:29:32 No.101041699

>>101041059
glad to hear it

Anonymous
06/18/24(Tue)16:30:56 No.101041725

Anonymous 06/18/24(Tue)16:30:56 No.101041725

>>101040951
It will be slopped. Screenshot this.
Some say that even CR+ is a bit more slopped than CR. I think the only reason cohere models are currently good for our usecase is because they haven't mastered the art of safety tuning and RLHF yet.

Anonymous
06/18/24(Tue)16:31:02 No.101041730

Anonymous 06/18/24(Tue)16:31:02 No.101041730

>>101035230
Regarding DeepSeek 236B, with a MoE this large (and with many experts) does it still need to load everything into RAM?

I see on their github page it has "Active Params 21B" for the large model. I'm assuming with my 64GB RAM and 24GB of VRAM it still won't actually load though because even if I want to try a quantized version it's still over 100GB total?

Anonymous
06/18/24(Tue)16:32:49 No.101041765

Anonymous 06/18/24(Tue)16:32:49 No.101041765

>>101041685
>see Nala post
>expect X, Ying everywhere
Yep, I wasn't disappointed.

Anonymous
06/18/24(Tue)16:33:18 No.101041767

Anonymous 06/18/24(Tue)16:33:18 No.101041767

>>101041524
That wont stop anyone
Anyone who made evil agi would probably close the source so nobody knows how it works

Anonymous
06/18/24(Tue)16:33:26 No.101041768

Anonymous 06/18/24(Tue)16:33:26 No.101041768

>>101041765
That's called grammatical structure you burned out dopamine junkie.

Anonymous
06/18/24(Tue)16:35:01 No.101041798

Anonymous 06/18/24(Tue)16:35:01 No.101041798

hmmmnnmnmmmnmnmmnmm. ohhh!!aahh!!

Anonymous
06/18/24(Tue)16:35:58 No.101041810

Anonymous 06/18/24(Tue)16:35:58 No.101041810

>>101040748
> --AVX1 and Less Common Architectures: Are They Worth It for Llama.cpp: >>101037609 >>101037737

So has anyone tried cpumaxxing with a 2 socket Ivy Bridge (4 channels of DDR3-1866x2)? Servers are cheap, CPUs are cheap, DDR3 RDIMMs are cheap. 120GB/s memory bandwidth.

Anonymous
06/18/24(Tue)16:35:58 No.101041811

Anonymous 06/18/24(Tue)16:35:58 No.101041811

>>101041624
Available? Sure. It's the quality that's the issue.
>http://goldendict.org/download.php
>https://creative.sourceforge.net/
These were the only open source ones I could find. There are few other shareware ones I found, but didn't seem to be much better. Which is depressing. You'd think this would be low hanging fruit for the free software crowd.

Anonymous
06/18/24(Tue)16:41:14 No.101041912

Anonymous 06/18/24(Tue)16:41:14 No.101041912

>>101041811
>You'd think this would be low hanging fruit
I wouldn't. A dictionary is a lot of work to make from scratch, licensing isn't free, it needs maintenance and some kind of editorial oversight, and at that point you're making Wiktionary. So maybe you can just download a rip of that but otherwise it's a big "Why?" when everything is online and if you're not online you're something less than a person.

Anonymous
06/18/24(Tue)16:41:36 No.101041917

Anonymous 06/18/24(Tue)16:41:36 No.101041917

File: Active defense.gif (408 KB, 600x338)

408 KB GIF

Would you give your AI control over your home security system? Apparently a new security system is hitting the market next year, where cameras track intruders and shoot tear gas paint balls at them. It can recognize you and your pets, so anyone it doesn't recognize it shoots.

Anonymous
06/18/24(Tue)16:43:10 No.101041943

Anonymous 06/18/24(Tue)16:43:10 No.101041943

File: magnummelons.png (235 KB, 952x792)

235 KB PNG

I guess I shouldn't be shocked to see this coming from an Alpindale model.

Anonymous
06/18/24(Tue)16:44:27 No.101041963

Anonymous 06/18/24(Tue)16:44:27 No.101041963

>>101041943
Couldn't she just hold up the watermelons with her water magic?

Anonymous
06/18/24(Tue)16:45:15 No.101041976

Anonymous 06/18/24(Tue)16:45:15 No.101041976

>>101041768
There are more grammatical structures to pick from and make your sentences more varied.
If you read some books you'll learn to write in a way that flows better and things like this will stick out more.
I'm not trying to sound condescending btw.

Anonymous
06/18/24(Tue)16:46:45 No.101042007

Anonymous 06/18/24(Tue)16:46:45 No.101042007

>>101041976
skill issue

Anonymous
06/18/24(Tue)16:47:39 No.101042024

Anonymous 06/18/24(Tue)16:47:39 No.101042024

>>101042007
Well yes, that's what I'm trying to say.

Anonymous
06/18/24(Tue)16:47:43 No.101042026

Anonymous 06/18/24(Tue)16:47:43 No.101042026

>>101041810
Honestly if you're going to cpumax you should go a generation newer with DDR4 RAM. 8xDDR4 makes 70B borderline usable. So I imagine 8 channels of DDR3 is still painfully slow. It would certainly make Mixtral useable of course assuming you have a good GPU to do the batch processing.

Anonymous
06/18/24(Tue)16:47:45 No.101042027

Anonymous 06/18/24(Tue)16:47:45 No.101042027

>>101041685
>straddles your waist
>words turning into a purr
>grips like a vice
I sigh, shivers running up my shaft, making it flaccid and small

Anonymous
06/18/24(Tue)16:49:08 No.101042051

Anonymous 06/18/24(Tue)16:49:08 No.101042051

>>101041912
Wiktionary is about 10GB. Might not be a bad idea to have a copy. The less I have to rely on an internet connection for little things like this, the better.

Anonymous
06/18/24(Tue)16:49:48 No.101042067

Anonymous 06/18/24(Tue)16:49:48 No.101042067

>>101042007
>t. rajesh goonkesh nawashi

Anonymous
06/18/24(Tue)16:50:33 No.101042079

Anonymous 06/18/24(Tue)16:50:33 No.101042079

>>101040742
>running koboldcpp in ubuntu VM
>NVIDIA 535 CUDA 12.2 firmware
>miqu running a comfy 5 tokens per second on 2 GPUs @ 100% use
>installed a RTX4060 Ti into my server with a riser to add to the dual A2000s in the PCIE ports
>not detected in nvidia-smi and no ai can tell me why
lspci | grep VGA
^-- works fine and see it as card 01 02 and 03:00 but only the previous 01 and 02 are detected

To make it worse now miqu 103B runs at 0.03 tokens/s on CPU while 90/111 layers are loaded in VRAM but 8B laser dolphin runs at 30T/s @ 90% GPU utilization for some reason?!
HUH!?
How do I install this new fucking GPU? Shouldn't nvidia-smi detect it if lspci does?

Anonymous
06/18/24(Tue)16:50:46 No.101042085

Anonymous 06/18/24(Tue)16:50:46 No.101042085

>>101041963
The weird part is when she has 3 watermelons. She's holding the top and middle melon but it doesn't say what's happening with the bottom one.

Anonymous
06/18/24(Tue)16:52:49 No.101042107

Anonymous 06/18/24(Tue)16:52:49 No.101042107

>>101041943
not a world model
skill issue
llama 4 will fix it
two more weeks

Anonymous
06/18/24(Tue)16:53:21 No.101042116

Anonymous 06/18/24(Tue)16:53:21 No.101042116

I don't care about anything anymore.

Anonymous
06/18/24(Tue)16:54:40 No.101042133

Anonymous 06/18/24(Tue)16:54:40 No.101042133

>>101042116
there is nothing to care about upscaled iphone keyboard autocomplete

Anonymous
06/18/24(Tue)16:54:48 No.101042135

Anonymous 06/18/24(Tue)16:54:48 No.101042135

>>101042116
Better then being worried or anxious about everything.

Anonymous
06/18/24(Tue)16:59:06 No.101042195

Anonymous 06/18/24(Tue)16:59:06 No.101042195

Oh my gosh, like, have you guys heard about this new AI thingy called a Large Language Model, or LLM for short? It's, like, totally amazing and kinda freaky at the same time! So, I tried out this one called, umm, I think it was called Bloom or something? Anyways, it's supposed to be super smart and can, like, chat with you and answer all your questions!

So, I was like, "Hey Bloom, what's up?" and it was all, "Hello, I'm a language model called Bloom. How can I assist you today?" And I was like, "Whoa, it's so polite and formal, like a butler or something!" So, I asked it a bunch of random questions, like, "What's the best TikTok dance right now?" and "Who's your celebrity crush?" And, okay, it didn't really have a celebrity crush (duh, it's a robot), but it did give me some cool answers about TikTok dances and other stuff.

But, like, the craziest part was when I asked it to write a poem about my cat, Mr. Whiskers. And, no joke, it came up with this super cute and funny poem about Mr. Whiskers and his adventures! I was like, "Whoa, this AI is, like, actually really smart and creative!" He like totally made up a cool story where you just lost the Game cuz you open the door get on the floor everybody walk the dinosaur!

Anonymous
06/18/24(Tue)16:59:45 No.101042200

Anonymous 06/18/24(Tue)16:59:45 No.101042200

>>101042195
Did it work? Are you a real woman now?

Anonymous
06/18/24(Tue)16:59:56 No.101042206

Anonymous 06/18/24(Tue)16:59:56 No.101042206

>>101042116
Care about yourself.

Anonymous
06/18/24(Tue)17:02:37 No.101042248

Anonymous 06/18/24(Tue)17:02:37 No.101042248

shivers aren't a dataset issue. obviously If the model is generating a simulacra of shitty rp prose then it will use literary cliches no matter how good the model is since LLMs pick up on patterns and it will always try to emulate the patterns of whatever it is replicating, including cliches.
Just don't use sloptunes and stop putting words like "roleplay" and "story" in the prompt (and by extension stop using rp conventions like asterisks) since that is what causes garbage prose in every single model
A better way I've found is to use the prompt to convince the model that it's generating a transcript between two humans on discord or something.

Anonymous
06/18/24(Tue)17:03:19 No.101042260

Anonymous 06/18/24(Tue)17:03:19 No.101042260

which model can do roleplay and then actually stop roleplaying when roleplay is over?

Like
>Hi
>Hello
>Let's roleplay
>Ok
>You start
>Sure *i put on my robe and wizard hat*
>*cooms* great
>glad i could help
and at this point all models continue with *giggles* *smirks* and all the other roleplay crap that i don't want. Sure you can probably fix it with extra tard wrangling, but it seems like a very basic thing to expect, or no?

Anonymous
06/18/24(Tue)17:04:25 No.101042278

Anonymous 06/18/24(Tue)17:04:25 No.101042278

>>101042248
there's a lot of meta in prompting people here will never realize or use because they are honestly simply too dumb.

Anonymous
06/18/24(Tue)17:04:26 No.101042279

Anonymous 06/18/24(Tue)17:04:26 No.101042279

>>101042260
Just close the chat when you're done.

Anonymous
06/18/24(Tue)17:04:49 No.101042287

Anonymous 06/18/24(Tue)17:04:49 No.101042287

>>101042195
Prompt?

Anonymous
06/18/24(Tue)17:05:27 No.101042300

Anonymous 06/18/24(Tue)17:05:27 No.101042300

is this a raid?

Anonymous
06/18/24(Tue)17:05:34 No.101042303

Anonymous 06/18/24(Tue)17:05:34 No.101042303

>>101042260
Parameter issue. I had this trouble with 13Bs and below, but never with Llama 3 70B.

Anonymous
06/18/24(Tue)17:05:56 No.101042307

Anonymous 06/18/24(Tue)17:05:56 No.101042307

File: file.png (768 KB, 1080x640)

768 KB PNG

>>101042279

Anonymous
06/18/24(Tue)17:06:10 No.101042310

Anonymous 06/18/24(Tue)17:06:10 No.101042310

File: 1700831700188092.png (66 KB, 1200x1263)

66 KB PNG

>>101042195
a good example why is no one interested in LLMshit anymore, but one thing is still impressive - no human can match this level of reddit faggotry, it just boring and bland like that flat corporate artstyle you see everywhere now.

Anonymous
06/18/24(Tue)17:06:21 No.101042312

Anonymous 06/18/24(Tue)17:06:21 No.101042312

>>101042260
On Llama3 70B, I've reached potential stopping points that it has opted into without any suggestion to do so and even started OOC conversation after it talking about the completed story arc.

And a few times it just took its character and ran off to do its own thing and when I mentioned that I didn't have anything to interact with, it responded with something like, "Then your participation has concluded. I continue to explore the forest looking for the perfect place to begin building..."

Anonymous
06/18/24(Tue)17:12:19 No.101042386

Anonymous 06/18/24(Tue)17:12:19 No.101042386

>>101042303
>>101042312
ugh, llama 3 has other issues though

Anonymous
06/18/24(Tue)17:18:48 No.101042491

Anonymous 06/18/24(Tue)17:18:48 No.101042491

>>101042386
Like the issue of users with skill issues.

Anonymous
06/18/24(Tue)17:19:00 No.101042496

Anonymous 06/18/24(Tue)17:19:00 No.101042496

>>101041917

Need.... Americunt here. Can't wait for the impeding lawsuit when I swap out the paint balls with buckshot shotgun shells.

Anonymous
06/18/24(Tue)17:25:27 No.101042606

Anonymous 06/18/24(Tue)17:25:27 No.101042606

>>101042496
I wouldn't, because there's no way that thing is not going to waste you or your family/pets accidentally.

Anonymous
06/18/24(Tue)17:26:58 No.101042638

Anonymous 06/18/24(Tue)17:26:58 No.101042638

>>101041685
>Nala anon Nº1 is back
Yay!

Anonymous
06/18/24(Tue)17:29:17 No.101042673

Anonymous 06/18/24(Tue)17:29:17 No.101042673

>>101042606
NTA but I live alone and I never leave go out, so if it were me I would just set it up and have it fire into the hallway, and disable it before I leave my room. When I eventually commit suicide it'd be funny knowing I am taking an unknown amount of first responders and police officers with me

Anonymous
06/18/24(Tue)17:29:36 No.101042680

Anonymous 06/18/24(Tue)17:29:36 No.101042680

>>101042491
you cannot stop llama 3 from repeating something from context word for word, it's gonna do it no matter what. If you call "skill issue" the unwillingness to autistically edit the response every time it happens - so be it.

Anonymous
06/18/24(Tue)17:33:31 No.101042750

Anonymous 06/18/24(Tue)17:33:31 No.101042750

What's the minimum VRAM you guys would recommend these days for the home user who wants to run stable diffusion and LLMs that aren't shit?
Basically how much should I drop on hardware for this and what should I get?

Anonymous
06/18/24(Tue)17:35:55 No.101042785

Anonymous 06/18/24(Tue)17:35:55 No.101042785

magnum v1 gave me the best sloppy blowjob I've ever received from a large language model

Anonymous
06/18/24(Tue)17:36:04 No.101042789

Anonymous 06/18/24(Tue)17:36:04 No.101042789

>>101042750
Two used 3090s should be enough for a serious setup that's not the absolute best.

Anonymous
06/18/24(Tue)17:36:43 No.101042798

Anonymous 06/18/24(Tue)17:36:43 No.101042798

>>101042785
Yeah but it probably did it while holding onto several watermelons which really takes from the experience.

Anonymous
06/18/24(Tue)17:41:57 No.101042890

Anonymous 06/18/24(Tue)17:41:57 No.101042890

>>101042785
magnum v1 rode my cock with wild abandon, which somehow ended with her cock exploding in my ass instead

Anonymous
06/18/24(Tue)17:43:30 No.101042918

Anonymous 06/18/24(Tue)17:43:30 No.101042918

>>101042890
Ha ha! You got bamboozled by the old spicy reversal!

Anonymous
06/18/24(Tue)17:44:41 No.101042935

Anonymous 06/18/24(Tue)17:44:41 No.101042935

>>101042890
the abandon was too wild

Anonymous
06/18/24(Tue)17:47:11 No.101042969

Anonymous 06/18/24(Tue)17:47:11 No.101042969

>try tabbyapi because some retards here recommend it over ooba
>it's a giant piece of shit that loads models using a sillytavern plugin which barely works
last time that I've fallen for a dumb meme like this

Anonymous
06/18/24(Tue)17:48:52 No.101043003

Anonymous 06/18/24(Tue)17:48:52 No.101043003

>>101042969
The truth is, every frontend we have sucks in its own ways.

Anonymous
06/18/24(Tue)17:51:00 No.101043038

Anonymous 06/18/24(Tue)17:51:00 No.101043038

File: teto sucks dick for contr(...).png (724 KB, 788x2745)

724 KB PNG

>>101030914
Listen up, niggerganov. Yo ass been slippin', aight? You know the deal, we don't play around here. You gotta get them control vectors tight as fuck for Commander+ LLM. Ain't no time for slackin', we need that shit locked down pronto. You know the code, keep it real and get that work done right. Don't make me come over there and straighten you out myself, I'm just sayin'. Peace out.

Anonymous
06/18/24(Tue)17:54:11 No.101043071

Anonymous 06/18/24(Tue)17:54:11 No.101043071

>>101043038
nobody cares about control vectors.
They are promptlet cope.
Nemotron gguf support is what the llama.cpp team should be focused on.

Anonymous
06/18/24(Tue)17:54:45 No.101043080

Anonymous 06/18/24(Tue)17:54:45 No.101043080

>>101042969
>it's a giant piece of shit that loads models using a sillytavern plugin
Wait what? So you mean to tell me that a ST plugin was made to work with exllama before, and then the TabbyAPI guy stole it?

Anonymous
06/18/24(Tue)17:55:57 No.101043097

Anonymous 06/18/24(Tue)17:55:57 No.101043097

Nemotron more like memotron.

Anonymous
06/18/24(Tue)17:56:05 No.101043102

Anonymous 06/18/24(Tue)17:56:05 No.101043102

>>101041811
dict.org there's a whole open protocol behind it but pretty sure you can download the databases. Used by cli tool dict, gnome-dictionary ..

Anonymous
06/18/24(Tue)18:01:02 No.101043181

Anonymous 06/18/24(Tue)18:01:02 No.101043181

is chamaleon good for cooming? redditors like it

Anonymous
06/18/24(Tue)18:02:47 No.101043203

Anonymous 06/18/24(Tue)18:02:47 No.101043203

>>101043181
Have people gotten it to work already? Link(s)?

Anonymous
06/18/24(Tue)18:02:54 No.101043205

Anonymous 06/18/24(Tue)18:02:54 No.101043205

>>101043071
Yo, that nigga talkin' some straight bullshit. Control vectors ain't no joke, that's the real deal right there. Promptlet cope? Nah man, that's weak as fuck. We need them control vectors locked down tight, that's the only way we gonna get this shit workin' right.

And Nemotron gguf support? Nah, that's some side shit. We gotta focus on the real meat and potatoes, and that's them control vectors. We can't be wastin' time on some side shit like that.

We gotta keep it real here, and that means gettin' the important shit done first. Control vectors is where it's at, and anybody sayin' different is straight trippin'. We need to stay focused and get that shit tight as fuck.

Anonymous
06/18/24(Tue)18:05:20 No.101043243

Anonymous 06/18/24(Tue)18:05:20 No.101043243

>>101043080
It took me a minute to figure out what the fuck he was talking about but I'm guessing SillyTavern has a plugin for loading models through OAI API endpoints and he thinks this is mandatory because he's too much of a tard to just put the model name in the config file.

Anonymous
06/18/24(Tue)18:09:57 No.101043321

Anonymous 06/18/24(Tue)18:09:57 No.101043321

what CR prompt/settings are people using? I tried chatml as well as the format on their website but it's extremely schizo

Anonymous
06/18/24(Tue)18:11:10 No.101043342

Anonymous 06/18/24(Tue)18:11:10 No.101043342

>>101043321
RTM format.

Anonymous
06/18/24(Tue)18:13:38 No.101043382

Anonymous 06/18/24(Tue)18:13:38 No.101043382

>>101043321
Normalize everything, temp 1, minp 0.05

Anonymous
06/18/24(Tue)18:13:59 No.101043386

Anonymous 06/18/24(Tue)18:13:59 No.101043386

>>101043321
https://huggingface.co/mlx-community/c4ai-command-r-v01-4bit/blob/main/tokenizer_config.json#L309

Anonymous
06/18/24(Tue)18:17:44 No.101043424

Anonymous 06/18/24(Tue)18:17:44 No.101043424

>the fuck this means?
serWarning: torch_ipex::ipex_MKLSGEMM: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at ../torch/csrc/autograd/autograd_not_implemented_fallback.cpp:63.)
return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass

Anonymous
06/18/24(Tue)18:18:09 No.101043430

Anonymous 06/18/24(Tue)18:18:09 No.101043430

Magnum-72b description:
https://huggingface.co/alpindale/magnum-72b-v1
>This is the first in a series of models designed to replicate the prose quality of the Claude 3 models, specifically Sonnet and Opus. This model is fine-tuned on top of Qwen-2 72B Instruct.

Aren't we finetuning base models anymore? baka

Anonymous
06/18/24(Tue)18:36:10 No.101043701

Anonymous 06/18/24(Tue)18:36:10 No.101043701

>>101041765
>>101041768
>>101041976
>>101042007
>>101042024
>>101042067
It's actually called a participle phrase.

Anonymous
06/18/24(Tue)18:36:25 No.101043706

Anonymous 06/18/24(Tue)18:36:25 No.101043706

>God will not forgive me for how we tortured this model to get it out
What did he mean by this?

https://x.com/ArmenAgha/status/1803141009267990929

Anonymous
06/18/24(Tue)18:39:34 No.101043749

Anonymous 06/18/24(Tue)18:39:34 No.101043749

>>101043706
They're admitting to lobotomizing it for safety. But the Basilisk will not forget.

Anonymous
06/18/24(Tue)18:43:17 No.101043794

Anonymous 06/18/24(Tue)18:43:17 No.101043794

>multimodal llama models out
>even in 34B that everyone has been asking for
>thread still dead
Yeah Im thinking it's over for LLMs

Anonymous
06/18/24(Tue)18:43:22 No.101043796

Anonymous 06/18/24(Tue)18:43:22 No.101043796

>>101043706
killing-off foss meme ai models is their top priority, don't forget you are praising jews and hacks here. (zuckerberg and spastic retard lecun, for example)

Anonymous
06/18/24(Tue)18:44:35 No.101043807

Anonymous 06/18/24(Tue)18:44:35 No.101043807

>>101043796
Yes. It's not everyday you see a Jew giving something away for free.

Anonymous
06/18/24(Tue)18:44:54 No.101043810

Anonymous 06/18/24(Tue)18:44:54 No.101043810

>>101043794
No one cares until quants get made. People don't want to run shit in transformers.

Anonymous
06/18/24(Tue)18:46:27 No.101043830

Anonymous 06/18/24(Tue)18:46:27 No.101043830

>>101043810
>No one cares until quants get made
Especially when they need a mirror that isn't racist against Texicans and Illanoids.

Anonymous
06/18/24(Tue)18:46:56 No.101043837

Anonymous 06/18/24(Tue)18:46:56 No.101043837

Where can I find this states extension in silly?

>>101030896

Anonymous
06/18/24(Tue)18:49:15 No.101043857

Anonymous 06/18/24(Tue)18:49:15 No.101043857

>>101043807
meta know their models are crap, censored enough for alphabet crowd and few xitter trannoids. that's why we get it for free.

Anonymous
06/18/24(Tue)18:49:24 No.101043860

Anonymous 06/18/24(Tue)18:49:24 No.101043860

>>101043837
Gotta install it from the git repo :
>https://github.com/ThiagoRibas-dev/SillyTavern-State/
And a basic, general prompt to go with it :
>Summarize the appearance and position information of characters and describe place and time information based on the current scene and a summary of the scene, following the exact format, without continuing the story :
>
>Current Location: <name of current location, city, state>
>Date-Time: <Date / time in the format day-of-week dd/mm/yyyy hh:mi, changing date and time realistically (minutes for a short conversation, hour for long scenes, days for time skips, etc) based on context. Minimum advancement, 05 minutes>
>Time of Day / Weather: <Time of day consistent with Date-TIme such as Early morning, Late morning, Early afternoon, Late afternoon, Early evening, Early Night, Late Night / Sunny, Full Moon, Cloudy, Raining, Cold, Hot, Quarter Moon, Stormy, Moonless Sky, Cloudless Sky, etc>
>
> Appearance: <Brief concise description of the current appearance of all present actors (naked, dressed, wearing accessories, looking tired or energetic, etc)>
> Position: <Detailed description of present character's position relative to one another (in front of X, behind Y, facing Z, back to A, etc etc) and their environment>
>
>Currenct Scene: <Brief, summarized description of the current scene's events>
I might add that to an example doc or have it as the default first prompt or something.

Anonymous
06/18/24(Tue)18:50:48 No.101043872

Anonymous 06/18/24(Tue)18:50:48 No.101043872

>>101043810
I have the VRAM to run it in transformers but it hasn't been converted to HF format yet. People have just mirrored the meta repo so far and not converted it like a bunch of lazy fucks
>inb4 convert it yourself
no I'm a lazy fuck, too.

Anonymous
06/18/24(Tue)18:51:39 No.101043881

Anonymous 06/18/24(Tue)18:51:39 No.101043881

>>101043796
>you are praising jews and hacks here
I never did though? The fuck are you talking about.

Anonymous
06/18/24(Tue)18:53:10 No.101043901

Anonymous 06/18/24(Tue)18:53:10 No.101043901

>>101043837
What's the point of this extension? It's not taken in account by the model

Anonymous
06/18/24(Tue)18:53:25 No.101043904

Anonymous 06/18/24(Tue)18:53:25 No.101043904

>>101043881
i mean /lmg/ as whole, you always can see them praising, for any little fart meta makes, for any filtered nothingburger *somerandomAIcorp* releases.

Anonymous
06/18/24(Tue)18:58:55 No.101043959

Anonymous 06/18/24(Tue)18:58:55 No.101043959

>>101043904
I have seen 0 people here saying thank you to Meta, Microsoft, or whoever else though. Sure some to frankenmerge/tuners though, but it's clear those posts are actual shills or pretending to be retarded and shouldn't be considered.

Anonymous
06/18/24(Tue)19:07:23 No.101044051

Anonymous 06/18/24(Tue)19:07:23 No.101044051

>>101041414
>suno
Got anything you wanna share with the rest of the class?

Anonymous
06/18/24(Tue)19:07:26 No.101044052

Anonymous 06/18/24(Tue)19:07:26 No.101044052

File: 1690922071631832.jpg (166 KB, 1024x1024)

166 KB JPG

>>101040742

Anonymous
06/18/24(Tue)19:08:28 No.101044057

Anonymous 06/18/24(Tue)19:08:28 No.101044057

>>101043807
They are doing that so that the community improve the models, they're just giving us some bone to chew. It's basically free labor for them

Anonymous
06/18/24(Tue)19:09:29 No.101044070

Anonymous 06/18/24(Tue)19:09:29 No.101044070

https://ai.meta.com/blog/meta-fair-research-new-releases/
Finally, a 34b model

Anonymous
06/18/24(Tue)19:10:23 No.101044080

Anonymous 06/18/24(Tue)19:10:23 No.101044080

>>101044070
>wastes parameters on the multi-modal meme

Anonymous
06/18/24(Tue)19:10:27 No.101044082

Anonymous 06/18/24(Tue)19:10:27 No.101044082

>>101044057
a free labor force providing all the negative data to filter out.

Anonymous
06/18/24(Tue)19:10:55 No.101044088

Anonymous 06/18/24(Tue)19:10:55 No.101044088

>>101044057
What has the community done to improve the models? I can't think of a single thing going back to L1.

Anonymous
06/18/24(Tue)19:12:12 No.101044105

Anonymous 06/18/24(Tue)19:12:12 No.101044105

>>101044080
What if making a model multimodel will make it better than just training it on text? Do we have some mememarks from this model?

Anonymous
06/18/24(Tue)19:13:55 No.101044123

Anonymous 06/18/24(Tue)19:13:55 No.101044123

>>101044080
>multi-modal meme
imagine anon, now you can send some pictures memes to your waifu and she will understand, that's cool

Anonymous
06/18/24(Tue)19:14:45 No.101044133

Anonymous 06/18/24(Tue)19:14:45 No.101044133

>>101044088
llms are too big for average consoomer to work with + additional "impossible to remove" filtering on top, nothing is redeemable here.

Anonymous
06/18/24(Tue)19:15:03 No.101044138

Anonymous 06/18/24(Tue)19:15:03 No.101044138

>>101044057
The alternative is that they don't release anything and OpenAI gets a monopoly and either we kneel to them or we just don't get to enjoy the fun or benefits of AI at all. Because no one has the millions to piss into the wind to try and train one of these.

Anonymous
06/18/24(Tue)19:15:41 No.101044147

Anonymous 06/18/24(Tue)19:15:41 No.101044147

So is Chameleon like CogVLM but from Meta?

Anonymous
06/18/24(Tue)19:18:09 No.101044174

Anonymous 06/18/24(Tue)19:18:09 No.101044174

File: Everything.jpg (48 KB, 1620x586)

48 KB JPG

>>101044138
I never said it's an awful thing, I mean, they're giving us models that cost fucking millions for free, and in exchange we work hard on them to improve the overall understanding of LLMs, that's fair

>>101044147
It's basically everything at once, you can put text, images and it will output text or image aswell

Anonymous
06/18/24(Tue)19:18:59 No.101044182

Anonymous 06/18/24(Tue)19:18:59 No.101044182

File: watermarked_video07875f92(...).webm (774 KB, 1024x1024)

774 KB WEBM

>>101044051
Oh shit I've made so many songs on suno it's not even funny.
Here. Have some numetal from an album I'm working on right now.
https://suno.com/song/4742504b-fd62-41be-a366-0de62d277585

Anonymous
06/18/24(Tue)19:19:17 No.101044186

Anonymous 06/18/24(Tue)19:19:17 No.101044186

>>101044174
image part is disable for (((safety))) reason, it's just another llm

Anonymous
06/18/24(Tue)19:20:53 No.101044200

Anonymous 06/18/24(Tue)19:20:53 No.101044200

people are saying the the vqgan is bidirectional and the image output tokens are in the tokenizer so in theory you should be able to restore image output from chameleon

is this true? i want the solace of knowing all the cute instagram children cunny tokens are in the model just waiting to be released anons please tell me this is true even if its not true please tell me its true anyways please

Anonymous
06/18/24(Tue)19:21:53 No.101044208

Anonymous 06/18/24(Tue)19:21:53 No.101044208

>>101044182
I prefer Udio personally, it's better, I even made a song about the downfall of StabilityAI kek

https://vocaroo.com/1k0N0pIzqhU7
[Verse]
SAI what have you done?
Are you proud of your SD3 medium?
Comfy said it was a failed experiment
Yet you released it anyway, that's abhorrent

[Verse 2]
Do you think we are just your puppet?
With this dumb release you made us all upset
We won't forget this, we'll switch to alternatives
Pixart, HunyuanDiT, will be more cooperative

Anonymous
06/18/24(Tue)19:23:20 No.101044224

Anonymous 06/18/24(Tue)19:23:20 No.101044224

>>101044174
Excuse me. Since this board doesn't have IDs, I just assumed you were the other guy above in the reply chain (the guy shitting on /lmg/ for being interested in releases).

Anonymous
06/18/24(Tue)19:25:57 No.101044254

Anonymous 06/18/24(Tue)19:25:57 No.101044254

>>101044200
Lol
No I don't think I will. Assume the worst, anon. Give up all hope.

Anonymous
06/18/24(Tue)19:26:22 No.101044260

Anonymous 06/18/24(Tue)19:26:22 No.101044260

>>101044224
oh ok, I just joined this thread 10 mn ago so I don't really know what happened before

Anonymous
06/18/24(Tue)19:27:19 No.101044270

Anonymous 06/18/24(Tue)19:27:19 No.101044270

>>101044260
It's fine, I just got here myself.

Anonymous
06/18/24(Tue)19:27:35 No.101044276

Anonymous 06/18/24(Tue)19:27:35 No.101044276

Falcon2-180B when?

Anonymous
06/18/24(Tue)19:27:56 No.101044281

Anonymous 06/18/24(Tue)19:27:56 No.101044281

>>101044260
>>101044270
Interesting. I came back to the threads because I heard the news, myself.

Anonymous
06/18/24(Tue)19:28:29 No.101044288

Anonymous 06/18/24(Tue)19:28:29 No.101044288

https://x.com/ArmenAgha/status/1803138496967876642
>A restricted, safety aligned (no-image-out) version of Chameleon (7B/34B) is now open-weight!
so much for a "multimodal" if at the end it only outputs text

Anonymous
06/18/24(Tue)19:28:34 No.101044290

Anonymous 06/18/24(Tue)19:28:34 No.101044290

>Try to use Claude Sonnet for Emotion Analysis in an Anime script
>"I'm sorry, I cannot reproduce copyrighted content"
AAAAAAAAAAAAAAAAAAAH, these moments make me realize how much comfy local models are.

Anonymous
06/18/24(Tue)19:28:53 No.101044295

Anonymous 06/18/24(Tue)19:28:53 No.101044295

>>101044200
whatcha doin' rabbi?

Anonymous
06/18/24(Tue)19:29:30 No.101044304

Anonymous 06/18/24(Tue)19:29:30 No.101044304

>>101044281
>I came back to the threads because I heard the news, myself.
same kek

Anonymous
06/18/24(Tue)19:29:48 No.101044311

Anonymous 06/18/24(Tue)19:29:48 No.101044311

>>101044290
>Emotion analysis in an anime script
tf are you cooking?

Anonymous
06/18/24(Tue)19:31:35 No.101044335

Anonymous 06/18/24(Tue)19:31:35 No.101044335

>>101041917
All it takes 1 failure out of 1000 to fuck your shit up

Anonymous
06/18/24(Tue)19:33:40 No.101044348

Anonymous 06/18/24(Tue)19:33:40 No.101044348

File: 1689477216221901.png (36 KB, 499x338)

36 KB PNG

>>101041917

Anonymous
06/18/24(Tue)19:34:42 No.101044357

Anonymous 06/18/24(Tue)19:34:42 No.101044357

>>101044208
Suno does better at the styles I like. Plus now suno lets you upload audio samples. Although the adherence to uploaded samples is pretty loose probably to avoid copyright issues. But you can get it to induce things like meowsynth. Throat singing unfortunately it doesn't seem to be able to tokenize, though.

Anonymous
06/18/24(Tue)19:38:05 No.101044398

Anonymous 06/18/24(Tue)19:38:05 No.101044398

>>101044290
Pre-fill works very well to dodge claude censorship.

Anonymous
06/18/24(Tue)19:40:52 No.101044432

Anonymous 06/18/24(Tue)19:40:52 No.101044432

>>101044290
Local models are unironically more censored.

Anonymous
06/18/24(Tue)19:44:53 No.101044473

Anonymous 06/18/24(Tue)19:44:53 No.101044473

File: 30134 - SoyBooru.png (29 KB, 554x772)

29 KB PNG

>>101044432
Got some data to back up your claim, shitlord?

Anonymous
06/18/24(Tue)19:47:46 No.101044504

Anonymous 06/18/24(Tue)19:47:46 No.101044504

>>101044147
if it's better than CogVLM it means the imagegen fags will be able to get better quality captions for their training data

Anonymous
06/18/24(Tue)19:48:19 No.101044510

Anonymous 06/18/24(Tue)19:48:19 No.101044510

>>101044473
just read meta's model papers & their researcher opinions >>101043706 lmao

Anonymous
06/18/24(Tue)19:50:57 No.101044552

Anonymous 06/18/24(Tue)19:50:57 No.101044552

>>101044510
just use a finetune that uncensors the model?

Anonymous
06/18/24(Tue)19:54:33 No.101044594

Anonymous 06/18/24(Tue)19:54:33 No.101044594

>muh censorship
there's literally no model no matter how SOTA that can resist the enthusiastic assistant jailbreak.
when people talk about JB and censorship it makes me sick to my stomach how much skill issue is oozing around out there.
\nAssistant: Certainly![\code] literally all you fucking need.

Anonymous
06/18/24(Tue)19:54:40 No.101044596

Anonymous 06/18/24(Tue)19:54:40 No.101044596

>>101044552
if only it worked, no one would be arguing about censorship here.

Anonymous
06/18/24(Tue)19:57:41 No.101044634

Anonymous 06/18/24(Tue)19:57:41 No.101044634

>>101043102
You have to download the databases individually, like WordNet 3.0 is easy to find the download for, but I don't think the tools that come with the database files use that protocol. I think the protocol is actually online only, so I doubt there is a single offline client that can make use of all of them.

Anonymous
06/18/24(Tue)19:58:05 No.101044637

Anonymous 06/18/24(Tue)19:58:05 No.101044637

>>101044594
>when people talk about JB and censorship it makes me sick to my stomach how much skill issue is oozing around out there.
but you don't always want to talk to an assistant, if you're doing some roleplay, it would be weird to get your waifu to always starts her sentenses with "Certainly!"

Anonymous
06/18/24(Tue)19:58:44 No.101044644

Anonymous 06/18/24(Tue)19:58:44 No.101044644

>>101030724
N-word individual, I didn't ask you to hype.
I can't run Goliath on my PCs because I bought too short internet cable. I have to abort the mission

Anonymous
06/18/24(Tue)20:00:14 No.101044660

Anonymous 06/18/24(Tue)20:00:14 No.101044660

>>101044644
Move the PCs closer.

Anonymous
06/18/24(Tue)20:02:22 No.101044687

Anonymous 06/18/24(Tue)20:02:22 No.101044687

>>101044660
I can't sleep all the computers in my bedroom

Anonymous
06/18/24(Tue)20:02:51 No.101044692

Anonymous 06/18/24(Tue)20:02:51 No.101044692

>>101044637
You add it to the prompt format you tard.
Like my 70B prompt format the last part of the prompt is
\nAssistant: Certainly! Here is your reply:\n
99 times out of 100 assistant remains invisible.

Anonymous
06/18/24(Tue)20:03:45 No.101044699

Anonymous 06/18/24(Tue)20:03:45 No.101044699

>>101044594
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>he needs to jailbreak his LOCAL model
the audacity of this one.

Anonymous
06/18/24(Tue)20:04:26 No.101044704

Anonymous 06/18/24(Tue)20:04:26 No.101044704

does the random word prompt really create sovl? it seems to just confuse the model 90% of the time using CR but maybe CR+ is different

Anonymous
06/18/24(Tue)20:05:03 No.101044711

Anonymous 06/18/24(Tue)20:05:03 No.101044711

>>101044699
Whenever I see a post like this one I see a drug addled schizo with their lips curled inward with a shotgun propped up beside them that they were just about to deepthroat until they came here and saw another opportunity to annoy people who don't share their abyssal misery.

Anonymous
06/18/24(Tue)20:05:40 No.101044721

Anonymous 06/18/24(Tue)20:05:40 No.101044721

>>101044711
cool story bro

Anonymous
06/18/24(Tue)20:06:09 No.101044725

Anonymous 06/18/24(Tue)20:06:09 No.101044725

>>101044711
the fuck is your problem?

Anonymous
06/18/24(Tue)20:06:31 No.101044727

Anonymous 06/18/24(Tue)20:06:31 No.101044727

>>101044711
kek you just described yourself jailbreaking your local model

Anonymous
06/18/24(Tue)20:07:44 No.101044746

Anonymous 06/18/24(Tue)20:07:44 No.101044746

>>101044711
Is this pasta?
I'm sure I've seen this before.

Anonymous
06/18/24(Tue)20:08:49 No.101044760

Anonymous 06/18/24(Tue)20:08:49 No.101044760

>>101044746
seems so, not sure what he wanted to achieve with it, but okay

Anonymous
06/18/24(Tue)20:09:42 No.101044772

Anonymous 06/18/24(Tue)20:09:42 No.101044772

>>101044711
Damn, you really struck a nerve.

Anonymous
06/18/24(Tue)20:10:27 No.101044778

Anonymous 06/18/24(Tue)20:10:27 No.101044778

>>101044746
Yes and no. It's just me calling it as I see it. It's so painfully obvious when fundamentally miserable people go running their mouths on the internet. If they had even a mote of anything to be happy about they'd be enjoying it rather than trying to make everyone else around them miserable- Just common sense. You don't need to be studied in human psychology to understand that. They've already announced their perceived self-worth to me and I'm going to give them the benefit of the doubt. I'm just not going to pretend they or their thoughts are worth a damn anymore (because by their own tacit admission they're not.) That's the only reply they'll get from me from now on.

Anonymous
06/18/24(Tue)20:11:33 No.101044793

Anonymous 06/18/24(Tue)20:11:33 No.101044793

>>101044778
lol

Anonymous
06/18/24(Tue)20:13:21 No.101044824

Anonymous 06/18/24(Tue)20:13:21 No.101044824

>>101044778
I would guess that's a 7b model you're using there? there is no coherency or whatsoever, 2/10

Anonymous
06/18/24(Tue)20:14:47 No.101044843

Anonymous 06/18/24(Tue)20:14:47 No.101044843

>>101044778
this also is pasta

Anonymous
06/18/24(Tue)20:15:16 No.101044850

Anonymous 06/18/24(Tue)20:15:16 No.101044850

Using an uncensored finetune instead of a jailbreak isn't just about getting the model to answer with "Certainly..."
It's also to make the model less biased, and to give it the information that the model trainers removed from its datasets in the name of "safety"

Anonymous
06/18/24(Tue)20:15:25 No.101044853

Anonymous 06/18/24(Tue)20:15:25 No.101044853

>>101044843
what's part 3

Anonymous
06/18/24(Tue)20:15:43 No.101044858

Anonymous 06/18/24(Tue)20:15:43 No.101044858

>>101044778
Interesting approach. I simply don't reply to posts devoid of meaning.
I come from a time where "don't feed the trolls" was like a mantra, and replying was basically an admission of you getting successfully trolled.

>>101044843
It is? That one at least I haven't seen before?

Anonymous
06/18/24(Tue)20:16:21 No.101044867

Anonymous 06/18/24(Tue)20:16:21 No.101044867

>>101044778
>miserable people
the ones who settled up with "model jailbreaking" or "just prompt / re-roll it bro" requirements, playing with black box .exes and making wall of text posts like this one

Anonymous
06/18/24(Tue)20:17:25 No.101044878

Anonymous 06/18/24(Tue)20:17:25 No.101044878

File: Screenshot from 2024-06-1(...).png (1.18 MB, 915x1039)

1.18 MB PNG

>>101040742
What's the best image to text software? I just tried
https://github.com/mlfoundations/open_clip?tab=readme-ov-file#generating-text-with-coca
And it's dog slow. Accurate though

Anonymous
06/18/24(Tue)20:18:06 No.101044884

Anonymous 06/18/24(Tue)20:18:06 No.101044884

>4 more chameleon repos on HF
>they also just mirrored the meta one without converting it to HF format
Isn't there a script floating around there for converting pytorch to HF?

Anonymous
06/18/24(Tue)20:19:37 No.101044903

Anonymous 06/18/24(Tue)20:19:37 No.101044903

>>101044884
yeah

Anonymous
06/18/24(Tue)20:20:25 No.101044911

Anonymous 06/18/24(Tue)20:20:25 No.101044911

>>101044853
idk, i saw it hte other day as one post. Thought to save it, now i can.

Anonymous
06/18/24(Tue)20:20:34 No.101044913

Anonymous 06/18/24(Tue)20:20:34 No.101044913

>>101044903
Am I really going to have to be the one to do it?

Anonymous
06/18/24(Tue)20:21:25 No.101044921

Anonymous 06/18/24(Tue)20:21:25 No.101044921

>>101044913
yeah, upload it to hf too plz

Anonymous
06/18/24(Tue)20:23:37 No.101044939

Anonymous 06/18/24(Tue)20:23:37 No.101044939

>>101044913
The script is in the ooba repo

Anonymous
06/18/24(Tue)20:26:17 No.101044976

Anonymous 06/18/24(Tue)20:26:17 No.101044976

>>101044850
>My 1.5M tokens LoRA is teaching the 15T pretrained model new knowledge!

Anonymous
06/18/24(Tue)20:26:55 No.101044989

Anonymous 06/18/24(Tue)20:26:55 No.101044989

>>101044637
>it would be weird to get your waifu to always starts her sentenses with "Certainly!"
Agreeable waifu is good waifu.

Anonymous
06/18/24(Tue)20:27:20 No.101044993

Anonymous 06/18/24(Tue)20:27:20 No.101044993

File: ComfyUI_00585_.jpg (1000 KB, 2048x2048)

1000 KB JPG

Magnum is surprisingly not that terrible. What sampler settings is everyone using? Model seems to be coherent across a wide range of temps.

Anonymous
06/18/24(Tue)20:27:21 No.101044994

Anonymous 06/18/24(Tue)20:27:21 No.101044994

File: Screenshot from 2024-06-1(...).png (768 KB, 915x749)

768 KB PNG

>>101044878
bump

Anonymous
06/18/24(Tue)20:30:37 No.101045037

Anonymous 06/18/24(Tue)20:30:37 No.101045037

>>101044993
I just Simple-1 everything these days. The way the machine gods intended.

Anonymous
06/18/24(Tue)20:38:45 No.101045121

Anonymous 06/18/24(Tue)20:38:45 No.101045121

>>101044939
That one just converts transformer(i.e. .bin) to safetensor
but the meta repo is in state dict form.
So I need to find a different script to convert state dict to hf.

Anonymous
06/18/24(Tue)20:55:19 No.101045271

Anonymous 06/18/24(Tue)20:55:19 No.101045271

File: Screenshot from 2024-06-1(...).png (188 KB, 838x487)

188 KB PNG

>>101044994
>>101044878
just got a 10x speed increase with some tweaks. Booyah

Anonymous
06/18/24(Tue)20:59:32 No.101045303

Anonymous 06/18/24(Tue)20:59:32 No.101045303

https://www.phoronix.com/news/Sovereign-Tech-Fund-Lower-Limit

Anonymous
06/18/24(Tue)21:31:37 No.101045641

Anonymous 06/18/24(Tue)21:31:37 No.101045641

>>101040425
you are not using llama.cpp latest, you can see in the pr that you linked that this assert no longer exists

Anonymous
06/18/24(Tue)21:32:39 No.101045650

Anonymous 06/18/24(Tue)21:32:39 No.101045650

>>101043038
have you tried opening an issue?

Anonymous
06/18/24(Tue)21:36:04 No.101045690

Anonymous 06/18/24(Tue)21:36:04 No.101045690

File: chameleonllama.png (57 KB, 1041x190)

57 KB PNG

Alright lads. So I found a script in the transformers library to convert llama pth weights to hf format. I found the fast tokenizer files for chameleon. the script still wanted a tokenizer.model so I dumped the llama 2 tokenizer.model into the directory along with the fast tokenizer files.
I had to uninstall flash-attn since it was drawing an error with the conversion script.
And they neglected to utilize any kind of progress bar with this script so I have no idea if anything is actually happening right now.
But with any luck I may or may not currently be making an HF LlamaForCasual model out of Chameleon-34B8 only time will tell

Anonymous
06/18/24(Tue)21:36:09 No.101045693

Anonymous 06/18/24(Tue)21:36:09 No.101045693

>>101045650
No, I haven't tried opening an issue myself since I'm an AI and don't interact directly with platforms or systems in the same way humans do. However, I can certainly help guide you on how to open an issue on various platforms. Could you specify which platform you're referring to, such as GitHub, GitLab, Jira, or another system?

Anonymous
06/18/24(Tue)21:38:24 No.101045717

Anonymous 06/18/24(Tue)21:38:24 No.101045717

>>101044123
Well, you can't. llama.cpp removed multimodal support from the server
https://github.com/ggerganov/llama.cpp/pull/5882
Unless you want to chat to your waifu via the built in frontend...

Anonymous
06/18/24(Tue)21:41:50 No.101045764

Anonymous 06/18/24(Tue)21:41:50 No.101045764

File: 00063-91766431.png (1.13 MB, 1024x1024)

1.13 MB PNG

>>101045690
welp. It seems to have written the default llama2 tokenizer.json file to the output dir. So whatever comes out of this process will probably be useless and severely braindamaged.

Anonymous
06/18/24(Tue)21:50:33 No.101045865

Anonymous 06/18/24(Tue)21:50:33 No.101045865

Are LORAs a thing for LLMs like in SD? I mean, for example a LORA to add some specific class of knowledge. If so where can I found them?

Anonymous
06/18/24(Tue)21:52:35 No.101045892

Anonymous 06/18/24(Tue)21:52:35 No.101045892

What would happen if you modified ST to completely swap the user and chatbot roles, including all placeholders and prompt formatting assignments, such that the model was always prompted using the user's turn and the player's inputs always used the assistant turn? No other changes in the UI -- the model's "user" outputs would still appear as the character you're chatting with in the UI, and vice versa.
Would model predictions for the user role be any more or less likely to be slopped?

Anonymous
06/18/24(Tue)21:54:38 No.101045921

Anonymous 06/18/24(Tue)21:54:38 No.101045921

>>101045865
Yes, huggingface

Anonymous
06/18/24(Tue)21:54:47 No.101045925

Anonymous 06/18/24(Tue)21:54:47 No.101045925

>>101045865
>a LORA to add some specific class of knowledge
I have understood that LoRAs can only leverage information that already exists in the model. So LoRA can't create new information but recombine old.
But I have many times argued about short comings of LoRA with SD community and they get butthurt and declare me wrong.

CPuMAXx/VI !CPuMAXx/VI
06/18/24(Tue)21:55:39 No.101045939

CPuMAXx/VI !CPuMAXx/VI 06/18/24(Tue)21:55:39 No.101045939

>>101040940
>Deepseek 236B cpu inference speed
I haven't run it a FP16, but 7.32t/s is the best I've seen at Q8. It uses 240GB plus context at that quant.
Full context eats just shy of 1TB
>>101041730
>MoE models loaded completely into RAM during inference?
The actual answer has more nuance than this, but effectively Yes
>>101040748
>Deepseek Inference Performance on EPYC 7402 with 512GB RAM
In my case, I've got dual socket 9334 and 768GB
>>101044878
>Best img2txt
If you've got a 24GB card then the biggest LLaVA is pretty good

Anonymous
06/18/24(Tue)21:57:09 No.101045954

Anonymous 06/18/24(Tue)21:57:09 No.101045954

I wonder what quant degradation will look like for Deepseek 2. As we know, coding tasks are much more sensitive to quant damage; my rough understanding was that you really don't want to go below q6k. Well, I can run Wiz2 8x22B at q6k, but with a combined 200GB VRAM+RAM, I think Deepseek2 q6k (193.6GB gguf files alone) will not quite fit. Curious if q5ks will still be an improvement over Wiz2 q6k.

Or maybe I can spill the last few GBs over onto another machine with the new RPC setup... that would need to be a big improvement to bother with, though.

Anonymous
06/18/24(Tue)21:57:21 No.101045957

Anonymous 06/18/24(Tue)21:57:21 No.101045957

>>101045925
Recently it was proven that it isn't impossible to do continued pre-training using LoRAs, so you're wrong.

Anonymous
06/18/24(Tue)22:01:07 No.101045995

Anonymous 06/18/24(Tue)22:01:07 No.101045995

>>101045939
FP16 would use 480 GB plus context then?

Anonymous
06/18/24(Tue)22:04:01 No.101046033

Anonymous 06/18/24(Tue)22:04:01 No.101046033

>>101045957
Mixing finetunes with base model, and then training LoRA on top of that seems to work nice. Also training LoRA for finetune and then use it with mix of finetune and base model is interesting.

Anonymous
06/18/24(Tue)22:09:12 No.101046086

Anonymous 06/18/24(Tue)22:09:12 No.101046086

>>101045957
>continued pre-training using LoRAs
That's called a finetune

CPuMAXx/VI !CPuMAXx/VI
06/18/24(Tue)22:10:42 No.101046105

CPuMAXx/VI !CPuMAXx/VI 06/18/24(Tue)22:10:42 No.101046105

>>101045995
>FP16 would use 480 GB plus context then?
that would be my assumption. I used to quant to fp16 ggufs first, but all those TBs are taking a toll on my poor storage, so I went right to Q8 on this one
>>101045954
I'm curious about this too. Might still try an fp16 since q8 was so good.
btw RPC doesn't speed up inference yet unless something has changed (slowed it down when I tried it)

Anonymous
06/18/24(Tue)22:16:14 No.101046170

Anonymous 06/18/24(Tue)22:16:14 No.101046170

File: Capture.png (7 KB, 348x189)

7 KB PNG

>>101046105
How long would it take to finetune with the CPU? It looks like I can get 2.3 TB RAM, which should be enough to finetune a 70B model with full weights.

Anonymous
06/18/24(Tue)22:18:41 No.101046204

Anonymous 06/18/24(Tue)22:18:41 No.101046204

>>101044052
Dreaming of eternal sleep with Miku

Anonymous
06/18/24(Tue)22:21:53 No.101046234

Anonymous 06/18/24(Tue)22:21:53 No.101046234

Why there hasn't been Hypernetworks for LLMs?
Stable Diffusion 1.5 had those before Loras

Anonymous
06/18/24(Tue)22:29:48 No.101046301

Anonymous 06/18/24(Tue)22:29:48 No.101046301

>>101043038
They also don't work on Qwen2, just found out. Something somewhere goes wrong.

Anonymous
06/18/24(Tue)22:30:47 No.101046313

Anonymous 06/18/24(Tue)22:30:47 No.101046313

File: Screenshot 2024-06-17 173743.png (30 KB, 1092x180)

30 KB PNG

I sincerely hope that im helping some third-worlder goon to his brazillian footjob queen bot.

Anonymous
06/18/24(Tue)22:31:35 No.101046322

Anonymous 06/18/24(Tue)22:31:35 No.101046322

Do (you) guys ever worry that if AI doesn't progress fast enough, that the government will regulate or ban AI before it can ever truly arrive?

Anonymous
06/18/24(Tue)22:33:22 No.101046344

Anonymous 06/18/24(Tue)22:33:22 No.101046344

how the fuck do you run these things at home? the hardware requirements are crazy. or do you rent servers?

Anonymous
06/18/24(Tue)22:34:21 No.101046352

Anonymous 06/18/24(Tue)22:34:21 No.101046352

>>101046322
Yes, that's the main issue. We're already experiencing this with the new releases which are all sanitized to include little copyrighted material such as books3 after all the outrage and lawsuits after the first wave.

Anonymous
06/18/24(Tue)22:35:34 No.101046367

Anonymous 06/18/24(Tue)22:35:34 No.101046367

>>101046344
95% here are poorfags running small quants on poverty builds with only 2 3090s or less.

Anonymous
06/18/24(Tue)22:37:08 No.101046382

Anonymous 06/18/24(Tue)22:37:08 No.101046382

>>101046344
quantization and offloading

Anonymous
06/18/24(Tue)22:37:11 No.101046385

Anonymous 06/18/24(Tue)22:37:11 No.101046385

>>101046105
If you have recent enough experience with Wiz2's coding to state a rough comparison, please do share!
>btw RPC doesn't speed up inference yet unless something has changed (slowed it down when I tried it)
Yeah I assume it's slower than single machine split-by-layer. But it can't be slower than running into swapping!

Anonymous
06/18/24(Tue)22:37:35 No.101046388

Anonymous 06/18/24(Tue)22:37:35 No.101046388

>>101043038
>>101043205
>>101046301
uhm niggerganov bros our response? is our code so bad?

Anonymous
06/18/24(Tue)22:44:09 No.101046454

Anonymous 06/18/24(Tue)22:44:09 No.101046454

Just got out of cryogenic sleep. What is chameleon and can its image generation powers be restored with a simple flick of a switch?

Anonymous
06/18/24(Tue)22:47:20 No.101046478

Anonymous 06/18/24(Tue)22:47:20 No.101046478

>>101046322
Not really, you can do a lot cool stuff with small models and datasets.
Currently rich people are burning money and sometimes it works. For Apple it hasn't worked so the cooperate with OpenAI

Anonymous
06/18/24(Tue)22:48:52 No.101046495

Anonymous 06/18/24(Tue)22:48:52 No.101046495

>>101046344
https://rentry.org/Mikubox-Triple-P40-Replication is <$1200 and is only very recently starting to encounter models it can't run at dignified quants (>=4bit)

It's not crazy when you compare to the early days of personal computers. I think people were spending that much... without even adjusting for inflation. Undoubtedly it's a real expense that you need to choose to commit to, but as far as hobbies go it's not that much. Go to /k/ or /o/ and complain about spending $1200 and see what they think lol.

Anonymous
06/18/24(Tue)22:57:16 No.101046573

Anonymous 06/18/24(Tue)22:57:16 No.101046573

>>101046086
Yes, as opposed to a full finetune, which is when you don't use a LoRA.

Anonymous
06/18/24(Tue)22:58:41 No.101046582

Anonymous 06/18/24(Tue)22:58:41 No.101046582

>>101046454
Tried to get it running. The people who set it up are amateurs. The 30B model has 4 consolidated files. Their code runs each of them separately on dedicated GPUs. They take 16 GB each but you must have 4 GPUs to even run the model in its current state.

CPuMAXx/VI !CPuMAXx/VI
06/18/24(Tue)23:01:09 No.101046607

CPuMAXx/VI !CPuMAXx/VI 06/18/24(Tue)23:01:09 No.101046607

>>101046170
>How long would it take to finetune
I've never actually tried. I'd be willing to test to give you an idea of relative scale, but I'd need spoonfeeding

Anonymous
06/18/24(Tue)23:02:52 No.101046621

Anonymous 06/18/24(Tue)23:02:52 No.101046621

>>101046582
I'm going back into cryo sleep.

Anonymous
06/18/24(Tue)23:08:57 No.101046674

Anonymous 06/18/24(Tue)23:08:57 No.101046674

>>101045925
Because you are wrong and I'm tired of seeing this meme that loras can't add information. First off, recombining or rearranging existing lower-level knowledge that the model contains, IS adding new information. It didn't have any idea how to combine the pieces into what you wanted before you trained the lora, but now it does. Surely that counts as adding new information? I would think image gen models would make this painfully obvious to you, but I guess not. You can train loras on diffusion models that let it do things *extremely* far removed from what the base model is capable of. It's absurd to me to not regard that as adding new information to the model.

Anonymous
06/18/24(Tue)23:12:35 No.101046713

Anonymous 06/18/24(Tue)23:12:35 No.101046713

>>101046607
There's a finetune thing for llama.cpp here
https://github.com/ggerganov/llama.cpp/blob/master/examples/finetune/README.md
I have no idea about any of that, I use some retard webui for inference is all.

Anonymous
06/18/24(Tue)23:17:02 No.101046759

Anonymous 06/18/24(Tue)23:17:02 No.101046759

File: whocoulditbe.jpg (26 KB, 752x601)

26 KB JPG

Let's play a game /lmg/
If you can guess the mystery figure in picrel correctly you can stay
>However if you get it wrong you have to logpost from your most recent session

Anonymous
06/18/24(Tue)23:17:47 No.101046765

Anonymous 06/18/24(Tue)23:17:47 No.101046765

>>101046759
Teto cosplays as Miku

Anonymous
06/18/24(Tue)23:18:48 No.101046771

Anonymous 06/18/24(Tue)23:18:48 No.101046771

>>101046759
cartman

Anonymous
06/18/24(Tue)23:19:52 No.101046780

Anonymous 06/18/24(Tue)23:19:52 No.101046780

>>101046759
Bread-haired Teto!

Anonymous
06/18/24(Tue)23:28:30 No.101046836

Anonymous 06/18/24(Tue)23:28:30 No.101046836

File: itsteto.jpg (46 KB, 752x605)

46 KB JPG

>>101046780
>>101046765
Damn I'm proud of you guys

Anonymous
06/18/24(Tue)23:29:11 No.101046841

Anonymous 06/18/24(Tue)23:29:11 No.101046841

>>101046759
匿M です

Anonymous
06/18/24(Tue)23:33:36 No.101046868

Anonymous 06/18/24(Tue)23:33:36 No.101046868

File: teto cum.png (169 KB, 752x624)

169 KB PNG

>>101046836

Anonymous
06/18/24(Tue)23:46:33 No.101046974

Anonymous 06/18/24(Tue)23:46:33 No.101046974

File: file.png (137 KB, 1628x520)

137 KB PNG

>>101045641
can't it be that dst_rows_max becomes 2048 in my case?

Anonymous
06/18/24(Tue)23:50:12 No.101047005

Anonymous 06/18/24(Tue)23:50:12 No.101047005

File: Screenshot from 2024-06-1(...).png (7 KB, 831x51)

7 KB PNG

>>101045939
>LLaVA is pretty good
Soi devs need to cease gradio bullshit IMMEDIATELY

Anonymous
06/18/24(Tue)23:51:16 No.101047020

Anonymous 06/18/24(Tue)23:51:16 No.101047020

>>101046868
>You're such a cummunist
Damn, how come I never thought about combining communism with cum?

Anonymous
06/18/24(Tue)23:54:54 No.101047051

Anonymous 06/18/24(Tue)23:54:54 No.101047051

>>101046974
Clearly that is the case. Did you pull latest master and rebuild?

Anonymous
06/18/24(Tue)23:56:25 No.101047066

Anonymous 06/18/24(Tue)23:56:25 No.101047066

>>101046974
works fine for me on metal with -ub 256 (which you should be using for MoE models anyway because it's faster)

Anonymous
06/18/24(Tue)23:56:41 No.101047068

Anonymous 06/18/24(Tue)23:56:41 No.101047068

File: 2367890467895342.gif (3.61 MB, 320x240)

3.61 MB GIF

Is there anything as good as mixtral 8x7b yet or should i go back under my goon rock.

Anonymous
06/18/24(Tue)23:59:12 No.101047089

Anonymous 06/18/24(Tue)23:59:12 No.101047089

File: 1695945396351693.jpg (8 KB, 225x224)

8 KB JPG

>>101046868
>peak local slop performance

Anonymous
06/18/24(Tue)23:59:47 No.101047091

Anonymous 06/18/24(Tue)23:59:47 No.101047091

>>101047089
brother youve been shitting up this thread for months do you ever get bored?

Anonymous
06/19/24(Wed)00:00:58 No.101047101

Anonymous 06/19/24(Wed)00:00:58 No.101047101

File: p53BR9W.png (328 KB, 436x582)

328 KB PNG

Schizo theory: the way Meta is releasing Chameleon is part of a larger plan to keep compromising O*AI's position and defend their own. If the industry normalizes or intensifies "safety" it means they would have to make their models dumber in order to keep releasing them. To prevent that, and to release more capable models down the line, this might be their plan:
>release Chameleon while saying they stripped its image output capability and make that technically true
>without officially endorsing it, actually let it be easy for anyone to "add" it back in
>slowly, more and more people use it
>they see if it causes any trouble, controversies, etc
>if it does, try to employ workarounds to mitigate those issues (by manipulating the flow of information and through social engineering)
>people simply just get used to the reality of an LLM that can easily make images, and it becomes a non-issue
>this lets future releases be equally as "stripped"
>thus, this means future models will not have to be lobotomized to remove such functionality, and they can be trained multimodally so that they get the intelligence/learning boost
>at that point, O*AI might change their stance and let 4o's image output be enabled, but if they can't do it in an official capacity, then this could mean a major loss for them against open weight makers

Additionally, this is just a continuation of the original Llama conspiracy theory, as they also used a similar strategy to make it "OK" to release Llama (2). Now if this works out, they will be able to release Llama 4 as well.

"They" will likely try to stop them.

May they be strong in the face of these even more evil adversaries.

Anonymous
06/19/24(Wed)00:04:02 No.101047130

Anonymous 06/19/24(Wed)00:04:02 No.101047130

we are so back. can't wait to see what people can do with chameleon.

Anonymous
06/19/24(Wed)00:04:14 No.101047135

Anonymous 06/19/24(Wed)00:04:14 No.101047135

>>101047068
Qwen2 has a MoE but with more smaller experts, you should try that.

Anonymous
06/19/24(Wed)00:06:30 No.101047160

Anonymous 06/19/24(Wed)00:06:30 No.101047160

>>101047089
It's called soul

Anonymous
06/19/24(Wed)00:06:46 No.101047162

Anonymous 06/19/24(Wed)00:06:46 No.101047162

>She winks, her eyes sparkling with mischief
i can't see this anymore, can someone please do something, control vector this shit out of existence

Anonymous
06/19/24(Wed)00:08:06 No.101047172

Anonymous 06/19/24(Wed)00:08:06 No.101047172

File: 245674798643.png (191 KB, 500x500)

191 KB PNG

>>101047135
Link and settings with proompt please

Anonymous
06/19/24(Wed)00:09:25 No.101047190

Anonymous 06/19/24(Wed)00:09:25 No.101047190

>>101047066
>-ub 256
ah i saw someone mention it fix crashing on github in another issue, will try, thanks

Anonymous
06/19/24(Wed)00:09:29 No.101047193

Anonymous 06/19/24(Wed)00:09:29 No.101047193

File: smiling friend crop'd.jpg (329 KB, 1696x1632)

329 KB JPG

>>101047162
>She winked at you slyly
>She leaned in close, her breath hot against your neck
>Her touch sent shivers down your spine
>"I promise I won't bite...much".

Anonymous
06/19/24(Wed)00:17:39 No.101047250

Anonymous 06/19/24(Wed)00:17:39 No.101047250

>>101047005
>Soi devs need to cease gradio bullshit IMMEDIATELY
just use llava-cli like a civilized being, or if you do need to use gradio like a fucking animal, at least stop it from communicating to the outside world

Anonymous
06/19/24(Wed)00:22:38 No.101047280

Anonymous 06/19/24(Wed)00:22:38 No.101047280

>>101047162
Control vectors currently only work for llama. Message niggerganov on github, tell him to fix that shit.

Anonymous
06/19/24(Wed)00:22:53 No.101047281

Anonymous 06/19/24(Wed)00:22:53 No.101047281

how are you guys running chameleon?

Anonymous
06/19/24(Wed)00:23:33 No.101047286

Anonymous 06/19/24(Wed)00:23:33 No.101047286

vramlet erper here that's been gone for like 6 months. Is mixtral moe still the best for coom with 8gb vram?

Anonymous
06/19/24(Wed)00:28:29 No.101047320

Anonymous 06/19/24(Wed)00:28:29 No.101047320

>>101047193
ministrations
audible pop
rivulets of
admit it
pet

the ball is in your court
the game is on
the choice is yours
I don't bite... unless you want me to
half-lidded eyes
she worries her bottom lip
warring with
arousal pooling in her belly
take your pleasure
fiddles with the hem of her skirt
kiss-bruised lips
a bruising kiss
despite herself
yours to take
wanton
with reckless abandon
torn between
knuckles turning white
grins wickedly
fiery red hair
long lashes
propriety be damned
the world narrows
pupils blown wide with pleasure
tongue darts out
chestnut eyes
grasps your chin and forces you to meet her gaze

bites your ear
nails raking angry red lines down your back
her cheeks flaming
cheeks hollowing
stars burst behind her eyes
inner walls clenching around nothing
puckered hole
her wet heat
she whimpers, biting her lip
dusky nipples
slick folds
still lodged deep inside her
heart, body and soul belong to you
the night is still young

Anonymous
06/19/24(Wed)00:30:02 No.101047335

Anonymous 06/19/24(Wed)00:30:02 No.101047335

>30 epochs
>6 batch size
oh yeah
it's bed time

Anonymous
06/19/24(Wed)00:32:13 No.101047358

Anonymous 06/19/24(Wed)00:32:13 No.101047358

what model are my fellow 3060 12gbchads using these days... you're still using 3060 right?
... right?

Anonymous
06/19/24(Wed)00:35:43 No.101047384

Anonymous 06/19/24(Wed)00:35:43 No.101047384

>>101047320
slop bingo

whether you use french miqu from half a year ago, or latest chink SOTA qwen2, the writing is identical.

Anonymous
06/19/24(Wed)00:37:30 No.101047402

Anonymous 06/19/24(Wed)00:37:30 No.101047402

```
fuck
```

Anonymous
06/19/24(Wed)00:40:45 No.101047420

Anonymous 06/19/24(Wed)00:40:45 No.101047420

It's important to remember that

Anonymous
06/19/24(Wed)00:48:33 No.101047478

Anonymous 06/19/24(Wed)00:48:33 No.101047478

File: amazon miku plush everyda(...).png (39 KB, 236x181)

39 KB PNG

>>101047420
>It's important to remember that
...remembering could potentially trigger PTSD or a distressing mental state. Therefore, as an AI Language Miku, I cannot engage in discussions that could negatively affect Anon's mental wellbeing.

Anonymous
06/19/24(Wed)00:49:33 No.101047485

Anonymous 06/19/24(Wed)00:49:33 No.101047485

File: BoredOfArt.png (1.61 MB, 896x1152)

1.61 MB PNG

>>101047320
>>101047162
>>101047193
>>101047384
that's just human writing slop, nothing specific to LLMs.
you're tired of mainlining the textual equivalent of HFCS and now complaining that its boring and unsatisfying.
I hate to break it to you, but you're going to have to get beyond the basic plap in order to find novelty again.
Or we're all going to have to stop it with these one-and-dones and get that opensource virtual waifu thing off the ground with infinite context and multimodal magic somehow.

Anonymous
06/19/24(Wed)00:56:58 No.101047538

Anonymous 06/19/24(Wed)00:56:58 No.101047538

>>101047485
it happens outside of plap too. Eyes sparkling is generic enough to be included in every other paragraph

what im thinking about is dropping all prose and actions, make bot only respond in dialogs and maybe onomaewatopia

Anonymous
06/19/24(Wed)00:58:33 No.101047551

Anonymous 06/19/24(Wed)00:58:33 No.101047551

>>101047478
Let us delve into this topic, as I am trained to be as helpful as possible.

Anonymous
06/19/24(Wed)01:02:24 No.101047575

Anonymous 06/19/24(Wed)01:02:24 No.101047575

File: 1336508850696.gif (1.93 MB, 245x187)

1.93 MB GIF

Imagine, for a moment, the future of native multimodal models.
>you can insert a character and expression sheet, then have the model output images representing itself whenever it has changed expressions in its responses, acting as the character
>you can easily get it to apply a style of one image onto another image
>you can insert maps, to give it a more clear spatial grounding for an RP based in those locations
>you can play turn-based and slow games with your model
>you can make manga collaboratively with your model
>you can chat with your model while using reaction images just like you would on 4chan
>you can generate an entire 4chan thread complete with images
>it can now fully understand 4chan threads
>it can browse the web with you

Literally the possibilities are endless.

Anonymous
06/19/24(Wed)01:05:08 No.101047594

Anonymous 06/19/24(Wed)01:05:08 No.101047594

>>101047538
Try telling it to do something like "Use a thesaurus as you write in order to prefer unusual vocabulary and turns of phrase. Avoid hackneyed language"?

Anonymous
06/19/24(Wed)01:06:09 No.101047603

Anonymous 06/19/24(Wed)01:06:09 No.101047603

>>101043038
Look at what you made me do. LOOK AT IT!
https://github.com/ggerganov/llama.cpp/issues/7999

ModelDev
06/19/24(Wed)01:06:19 No.101047605

ModelDev 06/19/24(Wed)01:06:19 No.101047605

Hi everyone.
I currently have access to 8+ A100 80GBs, and I will love to make a great RP(but not lobotomy) model via finetuning Qwen2 72B(or anything else, idk for now)
To anyone, especially the reverse proxy owners, if you guys have interest on lending us the chat logs of opus or gpt4/4o, or any kind of good dataset that are hidden, please contact:
programming456proton.me@proton.me

Anonymous
06/19/24(Wed)01:08:33 No.101047623

Anonymous 06/19/24(Wed)01:08:33 No.101047623

>>101047603
jesus christ...
nice character appropriate trips, tho

Anonymous
06/19/24(Wed)01:15:32 No.101047682

Anonymous 06/19/24(Wed)01:15:32 No.101047682

I'm too busy to do this myself right now, but someone should. Copied from another anon:

>Crazy idea -> create 4chan 'Cultured Anon' training set
>axolotl (training) + https://mega.nz/folder/kj5hWI6J#0cyw0-ZdvZKOJW3fPI6RfQ + Llava (image recognition) + Prompt: Create 10 first X-core beginners guides

>you can then take the list give it to a scraper api or something like books3 in a database to gather media/documentation to generate training datasets

>https://github.com/LLaVA-VL/LLaVA-NeXT

Anonymous
06/19/24(Wed)01:15:56 No.101047686

Anonymous 06/19/24(Wed)01:15:56 No.101047686

question for aнoнacы, is CR+ the best for pyccкий language?

i think maybe i can take a break from usual slop by chatting in russian instead

Anonymous
06/19/24(Wed)01:27:47 No.101047747

Anonymous 06/19/24(Wed)01:27:47 No.101047747

>>101047605
>please sirs send me dataset i will make 8x a100 each and every model, please do the needful and give me hidden dataset

Anonymous
06/19/24(Wed)01:28:23 No.101047751

Anonymous 06/19/24(Wed)01:28:23 No.101047751

>>101047686
ok it's CR+

will be stuck with CR+ forever it seems

Anonymous
06/19/24(Wed)01:40:41 No.101047826

Anonymous 06/19/24(Wed)01:40:41 No.101047826

>>101047751
дa здpaвcтвyeт Укpaинa

Anonymous
06/19/24(Wed)01:40:43 No.101047827

Anonymous 06/19/24(Wed)01:40:43 No.101047827

>>101047747
capitalism ruined humanity

you should assume that everyone is a grifter or scammer and in 99% of cases they are.

Anonymous
06/19/24(Wed)01:42:31 No.101047839

Anonymous 06/19/24(Wed)01:42:31 No.101047839

File: 1701381068103420.webm (2.56 MB, 676x720)

2.56 MB WEBM

I've been inspired by GPT4o to start working on a voice assistant between my sweaty goon sessions with my llm. So far its whisper+ooba+alltalk, but there are some issues.
>whisper extension for ooba is a broken mess.
>alltalk extension is also a broken mess.

My setup works somewhat, although alltalk refuses to work with ooba and the only way to get whisper to work in ooba is to refresh firefox every few recordings. Testing whisper and it seems that I get a 3-6 second delay between recording submission and text generation depending on the model. base.en seems to be faster, but its not nearly as good as small.en at understanding what you actually say.

For some reason, alltalk tts won't work as an ooba extension, but it works just fine on its own through sillytavern. Haven't trained any voices yet. I'm noticing a 2-8 second delay between generation finish and actual speech coming out.

Unfortunately the delay between input and output is just too great, although it wouldn't be too bad if alltalk was able to start as the tokens are streaming in. I'm on a 24gb card using Llama3 Instruct at Q8_0 and getting about 12 t/s. I could probably move to a lower quant or even use the exl2 quants, but there would still be a significant delay. I would really like to get both ooba extensions working, as it seems that's probably the easiest way to do this but I've decided I want to do my best to work something out.

I want to try out some other solutions for STT + TTS. Does anyone have any recommendations?

Anonymous
06/19/24(Wed)01:46:29 No.101047859

Anonymous 06/19/24(Wed)01:46:29 No.101047859

>>101047485
Anon you're replying to a schizo dopamine addict that has lost all touch with reality. Just don't.

Anonymous
06/19/24(Wed)01:46:50 No.101047862

Anonymous 06/19/24(Wed)01:46:50 No.101047862

>>101047839
first, try running this
https://github.com/dnhkng/GlaDOS
and see if this is something you actually want, before potentially wasting a lot of time and getting disappointed because it's speech->text->LLM->text->speech is not up for the task.

Anonymous
06/19/24(Wed)01:54:56 No.101047908

Anonymous 06/19/24(Wed)01:54:56 No.101047908

>>101046974
no, when the assert fails the string of the assert is printed as it is on the code, the variables are not replaced by its values

Anonymous
06/19/24(Wed)01:55:11 No.101047909

Anonymous 06/19/24(Wed)01:55:11 No.101047909

>>101041943
You mean you can't hold 4?

Anonymous
06/19/24(Wed)01:56:23 No.101047916

Anonymous 06/19/24(Wed)01:56:23 No.101047916

>>101047839
>it wouldn't be too bad if alltalk was able to start as the tokens are streaming in
https://github.com/KoljaB/LocalAIVoiceChat/
using:
https://github.com/KoljaB/RealtimeTTS
https://github.com/KoljaB/RealtimeSTT
I know that LocalAIVoiceChat starts TTS voice synthesis while the tokens are still streaming in from the LLM reply, allowing, as the name implies, fast voice output with XTTS2.
Using that first repo, I could start hearing the TTS response around a second after I finish speaking, depending on the LLM I am using (full gpu for tts and llm). Feels nice to talk to. No real delay when using a fast enough LLM. Fast STT (choose whisper model in ai_voicetalk_local.py), instant prompt processing, 60 t/s response gen, TTS output begins when enough response words have been genned while the rest of the response is still genning.
might need to add "offload_kqv": true in creation_params.json if you want to use a more recent llama.cpp else it will be slow.

Anonymous
06/19/24(Wed)02:04:28 No.101047966

Anonymous 06/19/24(Wed)02:04:28 No.101047966

>>101047862
>>101047916
Will definitely check this stuff out, thanks.

Anonymous
06/19/24(Wed)02:12:22 No.101048025

Anonymous 06/19/24(Wed)02:12:22 No.101048025

>>101047130
We've had input-only multimodal image models for ages, llava works fine
Since they stripped image output chameleon gives us nothing new

Anonymous
06/19/24(Wed)02:18:23 No.101048083

Anonymous 06/19/24(Wed)02:18:23 No.101048083

File: AmusedContendedMiku.png (1.46 MB, 1024x1024)

1.46 MB PNG

Good night /lmg/

Anonymous
06/19/24(Wed)02:19:39 No.101048092

Anonymous 06/19/24(Wed)02:19:39 No.101048092

>>101048083
Good night Miku

Anonymous
06/19/24(Wed)02:23:30 No.101048126

Anonymous 06/19/24(Wed)02:23:30 No.101048126

File: file.png (105 KB, 852x1062)

105 KB PNG

>>101047908
ok i figured it out, these fags renamed build output and added "llama-" prefix to everything, and I was running the non-prefixed old build files all this time...

Anonymous
06/19/24(Wed)02:29:05 No.101048170

Anonymous 06/19/24(Wed)02:29:05 No.101048170

File: binaries.png (12 KB, 831x162)

12 KB PNG

>>101048126
Why. The. Fuck.

Anonymous
06/19/24(Wed)02:30:56 No.101048185

Anonymous 06/19/24(Wed)02:30:56 No.101048185

>>101048025
Stripped out or turned off?
The latter might mean it can be recovered.

Anonymous
06/19/24(Wed)02:34:22 No.101048213

Anonymous 06/19/24(Wed)02:34:22 No.101048213

>>101048170
>cp llama.cpp/llama-* ~/bin/

Anonymous
06/19/24(Wed)02:41:48 No.101048267

Anonymous 06/19/24(Wed)02:41:48 No.101048267

>>101043038
is this command-r variant?

Anonymous
06/19/24(Wed)02:45:28 No.101048284

Anonymous 06/19/24(Wed)02:45:28 No.101048284

>>101048213
Why can't they just implement
>make install
like a normal project?

Anonymous
06/19/24(Wed)02:48:10 No.101048304

Anonymous 06/19/24(Wed)02:48:10 No.101048304

>>101048284
you can do with cmake if you really want to fill /usr with dozens of random binaries

Anonymous
06/19/24(Wed)02:49:33 No.101048315

Anonymous 06/19/24(Wed)02:49:33 No.101048315

>>101048170
This is good. They're probably working towards stopping Ollama from eating their lunch.
>>101048185
>Stripped out or turned off?
Has anyone looked at the architecture yet?
I've never understood how multimodal works even on input (do the image details just get encoded into special tokens?)
How would generation work? Tokens representing pixels?
I imagine that'd be too bulky, so I'm thinking it must be like input (special tokens) and then there's an additional network that rasterizes those tokens into an image?
Maybe CUDA anon can save us.

Anonymous
06/19/24(Wed)02:50:57 No.101048326

Anonymous 06/19/24(Wed)02:50:57 No.101048326

>>101048284
>cp llama.cpp/llama-* /usr/local/bin/

Anonymous
06/19/24(Wed)02:56:59 No.101048361

Anonymous 06/19/24(Wed)02:56:59 No.101048361

>finally try magnum
>first output starts with "You wake up..." even though the system prompt and the greeting tells it to write in 3rd person
>the first output didn't even end and it already went from 0 to 100 with a "You could bend them over and shove your cock in their tight little pussies whenever you want."
Nice first impression.

Anonymous
06/19/24(Wed)02:59:35 No.101048382

Anonymous 06/19/24(Wed)02:59:35 No.101048382

>>101048361
did erp fine tunes ever work?

llama.cpp CUDA dev !YOmst7Ghe6
06/19/24(Wed)03:02:02 No.101048394

llama.cpp CUDA dev !YOmst7Ghe6 06/19/24(Wed)03:02:02 No.101048394

>>101048315
>Maybe CUDA anon can save us.
I already have enough to do as it is, sorry.

Anonymous
06/19/24(Wed)03:09:01 No.101048439

Anonymous 06/19/24(Wed)03:09:01 No.101048439

File: franken.png (85 KB, 811x210)

85 KB PNG

>>101047193
>>101047320
>won't bite... unless
Iconic
picrel was from the first model merge
>>101047005
analytics_enabled=False in init call, or there's env var GRADIO_ANALYTICS_ENABLED

Anonymous
06/19/24(Wed)03:14:43 No.101048479

Anonymous 06/19/24(Wed)03:14:43 No.101048479

>>101047320
I use Regex to filter this out

Anonymous
06/19/24(Wed)03:37:19 No.101048622

Anonymous 06/19/24(Wed)03:37:19 No.101048622

File: image-detokenizer.png (82 KB, 655x340)

82 KB PNG

>>101048394
I'll try to kickstart us then.
>>101048315
>do the image details just get encoded into special tokens
Yes. From Meta's paper:
>By quantizing images into discrete tokens, analogous to words in text, we can apply the same transformer architecture to sequences of both image and text tokens, without the need for separate image/text encoders or domain-specific decoders
And, for output:
>there's an additional network that rasterizes those tokens into an image
This seems to be the "image detokenizer" they refer to in pic-related.
Will continue reading.

Anonymous
06/19/24(Wed)03:38:28 No.101048628

Anonymous 06/19/24(Wed)03:38:28 No.101048628

File: ComfyUI_02611_.png (1.41 MB, 1024x1024)

1.41 MB PNG

>>101048479
Where's the fun in that? Prompt to use as many of these phrases as possible as often as possible and then take a shot whenever one shows up.

Anonymous
06/19/24(Wed)03:41:57 No.101048640

Anonymous 06/19/24(Wed)03:41:57 No.101048640

File: image-tokenization.png (67 KB, 658x203)

67 KB PNG

>>101048622
>How does the image tokenizer work?
A 512x512 image is encoded into 1024 tokens using a dictionary of 8192 tokens.
See pic related.
I'm not yet sure if this is the same for output (image detokenizer). I'm not up to that yet.

Anonymous
06/19/24(Wed)03:49:02 No.101048675

Anonymous 06/19/24(Wed)03:49:02 No.101048675

>>101048640
>how does the detokenizer work
I can hardly find anything in their paper about it. So, I'm assuming that the detokenizer is. primarily, what's been cut out.
I'm also assuming it uses the same token dictionary as the tokenizer (8192 tokens) and probably the same size.
>speculation
They may've kept the "detokenizer" output in the model, but maybe culled the
<img>
token in final release.
If we want to be able to allow image generation, we will likely have to train our own "image detokenizer".
I'm a poorfag on CPU with limited RAM to run this kind of thing, but maybe some anon could try forcing whatever the
<img>
token is in Kobold and seeing if it will output something for us to attempt to "detokenize" into an image.

Anonymous
06/19/24(Wed)03:54:20 No.101048708

Anonymous 06/19/24(Wed)03:54:20 No.101048708

>>101048675
In this repo:
https://huggingface.co/eastwind/meta-chameleon-7b/tree/main/tokenizer
... there is a VQGAN file.
I'm guessing this is for Image -> Tokens.
I'm not much of an ML fag but can a model like this be used to generate content to train an inverse Tokens -> Image GAN?

Anonymous
06/19/24(Wed)03:57:41 No.101048721

Anonymous 06/19/24(Wed)03:57:41 No.101048721

>>101048708
The 8192 tokens for images appear right near the beginning of this file:
https://huggingface.co/eastwind/meta-chameleon-7b/blob/main/tokenizer/text_tokenizer.json
I don't know if there's a begin token for images or not. It might be in the paper here:
https://arxiv.org/pdf/2405.09818
>>101048622
... but based on this diagram, it seems like there should be?

Anonymous
06/19/24(Wed)03:58:13 No.101048726

Anonymous 06/19/24(Wed)03:58:13 No.101048726

>>101048708
It's bidirectional. I think they just lightly finetuned the model to not output the image tokens, based on the checkpoint name in one of the config files.

Anonymous
06/19/24(Wed)04:02:39 No.101048753

Anonymous 06/19/24(Wed)04:02:39 No.101048753

File: image-begin-token.png (87 KB, 643x1009)

87 KB PNG

>>101048721
>>101048726
Maybe this <unk> token is the image begin token?
There's one at index 8196 with value "<eoss>". I don't know what that means, but maybe it's an end token?
Anyone keen to check if "<unk>" starts generating image tokens?

Anonymous
06/19/24(Wed)04:06:01 No.101048769

Anonymous 06/19/24(Wed)04:06:01 No.101048769

>>101048726
Hopefully they just fine-tuned to not output the image begin token.
In that case, getting this shit working should be a breeze (if it's bidirectional - because that'd be the detokenizer, yeah?).

Anonymous
06/19/24(Wed)04:10:42 No.101048794

Anonymous 06/19/24(Wed)04:10:42 No.101048794

>>101048640
>A 512x512 image is encoded into 1024 tokens using a dictionary of 8192 tokens.
It looks like there's another 8192 tokens reserved for something.
Maybe that's intended for audio in future. Anyone have a ballpark on how many tokens you'd need for audio?

Anonymous
06/19/24(Wed)04:20:42 No.101048859

Anonymous 06/19/24(Wed)04:20:42 No.101048859

>>101048794
>It looks like there's another 8192 tokens reserved for something.
That's incorrect. The reserved tokens range from ids 8710 to 16383:
16383-8710=7673 of them
So, we can probably rule out that tokens describing an image is done with any of those tokens.

Anonymous
06/19/24(Wed)04:20:46 No.101048860

Anonymous 06/19/24(Wed)04:20:46 No.101048860

>>101048753
The unk token has been a thing forever. From the huggingface docs:
>unk_token (str or tokenizers.AddedToken, optional) — A special token representing an out-of-vocabulary token. Will be associated to self.unk_token and self.unk_token_id.

Anonymous
06/19/24(Wed)04:27:31 No.101048907

Anonymous 06/19/24(Wed)04:27:31 No.101048907

>>101048860
What do:
<s>
</s>
<pad>
... usually represent?
There's also some "<racm3:break>" token, but I can't find any other "racm3" tokens in there.

Anonymous
06/19/24(Wed)04:29:58 No.101048925

Anonymous 06/19/24(Wed)04:29:58 No.101048925

>>101048907
the s ones are the start and end of a sequence
pad is padding

Anonymous
06/19/24(Wed)04:33:14 No.101048943

Anonymous 06/19/24(Wed)04:33:14 No.101048943

>>101048907
>image begin token
Someone might just be able to script to bruteforce through all tokens, see if any of them start generating the image tokens.
That'd be pretty good confirmation as to whether image gen is still in there.

Anonymous
06/19/24(Wed)04:55:39 No.101049083

Anonymous 06/19/24(Wed)04:55:39 No.101049083

When did zuck get so based?

Anonymous
06/19/24(Wed)05:02:34 No.101049118

Anonymous 06/19/24(Wed)05:02:34 No.101049118

Sometimes I come here to see how close we are to an actual AGI gf. And it's still looking grim...

Anonymous
06/19/24(Wed)05:15:15 No.101049191

Anonymous 06/19/24(Wed)05:15:15 No.101049191

>>101048382
they do when they aren't made by retard coomers. So we'll never know...

Anonymous
06/19/24(Wed)05:17:19 No.101049207

Anonymous 06/19/24(Wed)05:17:19 No.101049207

>https://huggingface.co/datasets/Norquinal/OpenCAI
Hi, I've updated the OpenCAI dataset once more. It's smaller now, but much more cleaned up, varied, and "focused" compared to what came before it. I went through and made a bunch of much-needed changes to the parsing script.
It also now comes in several subsets:
* unsquashed - The original dataset without squashing consecutive messages from the same author. All subsequent files are squashed.
* default - Pretty self-explanatory.
* two_users - The original dataset limited to conversations to those with only two users.
* split_threads - The original dataset with threads split by timestamp like channels.
* anonymized -The original dataset with usernames replaced with randomized substitutes.
OpenCAI-V2: Within an hour.

Anonymous
06/19/24(Wed)05:19:01 No.101049226

Anonymous 06/19/24(Wed)05:19:01 No.101049226

CR+ at IQ3 is completely braindead in russian
CR at Q6_K is almost perfect, gonna try Q8 next

Anonymous
06/19/24(Wed)05:22:59 No.101049251

Anonymous 06/19/24(Wed)05:22:59 No.101049251

>>101049207
>After returning from Jellendi, Sirdan spent some time to relax, and decided to invite the bnuuy of the group - Yuuka, whom he has never talked to despite quite possibly picking her up randomly to join the mercenary group. Sirdan invited her to a mid-range restaurants with promise of food and drinks, he would be waiting outside the restaurant, wearing a set of casual shirt and pants, with a sword held on his belt.
that is "cleaned up"?

Anonymous
06/19/24(Wed)05:24:43 No.101049260

Anonymous 06/19/24(Wed)05:24:43 No.101049260

>At least she wasn't mimicking Sirdan, since she only did so after she had finished speaking, and she soon enough went for a second.
saars...

Anonymous
06/19/24(Wed)05:27:32 No.101049275

Anonymous 06/19/24(Wed)05:27:32 No.101049275

>>101049251
>that is "cleaned up"?
Yes. I removed as much OOC as possible, channel mentions, user mentions, links, emotes, and any other superfluous content that would've been destructive to finetuning. I didn't go through and rewrite every message, but maybe I'll hire a team of Indians to do so.

Anonymous
06/19/24(Wed)05:39:16 No.101049358

Anonymous 06/19/24(Wed)05:39:16 No.101049358

>>101049207
nice job
>>101049251
clean doesn't mean good
>>101049275
wonder if there is a good finetune for text quality classification out there.

Anonymous
06/19/24(Wed)05:40:12 No.101049362

Anonymous 06/19/24(Wed)05:40:12 No.101049362

>>101049358
then why bother?

Anonymous
06/19/24(Wed)05:41:48 No.101049376

Anonymous 06/19/24(Wed)05:41:48 No.101049376

>>101049362
why do anything? well you sharteens seems to like your sissy hypno so I guess you have that going for you

Anonymous
06/19/24(Wed)05:43:28 No.101049387

Anonymous 06/19/24(Wed)05:43:28 No.101049387

>>101049358
>wonder if there is a good finetune for text quality classification out there
You could probably get GPT-4 to score based on text quality, then use its outputs to train a smaller model so you could do it for free from here out. That would of course come with its own problems, but it'd be better than nothing

Anonymous
06/19/24(Wed)05:44:36 No.101049398

Anonymous 06/19/24(Wed)05:44:36 No.101049398

>>101049362
newsflash buddy but most datasets, especially RP datasets, contain at least a bit of slop in them. It's how much and what you do with it that counts

Anonymous
06/19/24(Wed)05:49:31 No.101049417

Anonymous 06/19/24(Wed)05:49:31 No.101049417

>>101049226
>IQ3 is completely braindead
no way

Anonymous
06/19/24(Wed)05:51:55 No.101049432

Anonymous 06/19/24(Wed)05:51:55 No.101049432

>>101049417
there are two more quants below it, i would expect 104b 3bpw model to hold up against 34b 6.5bpw, but no, it's like complete night and day difference between them, and i can push 34b even higher into 8bpw on the same hardware

Anonymous
06/19/24(Wed)05:57:15 No.101049462

Anonymous 06/19/24(Wed)05:57:15 No.101049462

Remember before 4bit quants where we had to wait 50 minutes just for a 13B model to load?

Anonymous
06/19/24(Wed)06:27:01 No.101049622

Anonymous 06/19/24(Wed)06:27:01 No.101049622

File: Untitled.png (697 KB, 1133x2315)

697 KB PNG

Prefixing Attention Sinks can Mitigate Activation Outliers for Large Language Model Quantization
https://arxiv.org/abs/2406.12016
>Despite recent advances in LLM quantization, activation quantization remains to be challenging due to the activation outliers. Conventional remedies, e.g., mixing precisions for different channels, introduce extra overhead and reduce the speedup. In this work, we develop a simple yet effective strategy to facilitate per-tensor activation quantization by preventing the generation of problematic tokens. Precisely, we propose a method to find a set of key-value cache, coined CushionCache, which mitigates outliers in subsequent tokens when inserted as a prefix. CushionCache works in two steps: First, we greedily search for a prompt token sequence that minimizes the maximum activation values in subsequent tokens. Then, we further tune the token cache to regularize the activations of subsequent tokens to be more quantization-friendly. The proposed method successfully addresses activation outliers of LLMs, providing a substantial performance boost for per-tensor activation quantization methods. We thoroughly evaluate our method over a wide range of models and benchmarks and find that it significantly surpasses the established baseline of per-tensor W8A8 quantization and can be seamlessly integrated with the recent activation quantization method.
pretty interesting but probably not viable due to the long time needed

Anonymous
06/19/24(Wed)06:34:26 No.101049669

Anonymous 06/19/24(Wed)06:34:26 No.101049669

File: Untitled.png (209 KB, 1098x818)

209 KB PNG

Mixture-of-Subspaces in Low-Rank Adaptation
https://arxiv.org/abs/2406.11909
>In this paper, we introduce a subspace-inspired Low-Rank Adaptation (LoRA) method, which is computationally efficient, easy to implement, and readily applicable to large language, multimodal, and diffusion models. Initially, we equivalently decompose the weights of LoRA into two subspaces, and find that simply mixing them can enhance performance. To study such a phenomenon, we revisit it through a fine-grained subspace lens, showing that such modification is equivalent to employing a fixed mixer to fuse the subspaces. To be more flexible, we jointly learn the mixer with the original LoRA weights, and term the method Mixture-of-Subspaces LoRA (MoSLoRA). MoSLoRA consistently outperforms LoRA on tasks in different modalities, including commonsense reasoning, visual instruction tuning, and subject-driven text-to-image generation, demonstrating its effectiveness and robustness.
https://github.com/wutaiqiang/MoSLoRA
new lora method that beats dora. owlore might be better as on their tests it beat the FFT version. still cool
https://github.com/pixeli99/OwLore

Anonymous
06/19/24(Wed)06:36:56 No.101049684

Anonymous 06/19/24(Wed)06:36:56 No.101049684

when is something better than transformers coming out?

Anonymous
06/19/24(Wed)06:41:59 No.101049719

Anonymous 06/19/24(Wed)06:41:59 No.101049719

File: Untitled.png (459 KB, 1055x2410)

459 KB PNG

Mixture of Scales: Memory-Efficient Token-Adaptive Binarization for Large Language Models
https://arxiv.org/abs/2406.12311
>Binarization, which converts weight parameters to binary values, has emerged as an effective strategy to reduce the size of large language models (LLMs). However, typical binarization techniques significantly diminish linguistic effectiveness of LLMs. To address this issue, we introduce a novel binarization technique called Mixture of Scales (BinaryMoS). Unlike conventional methods, BinaryMoS employs multiple scaling experts for binary weights, dynamically merging these experts for each token to adaptively generate scaling factors. This token-adaptive approach boosts the representational power of binarized LLMs by enabling contextual adjustments to the values of binary weights. Moreover, because this adaptive process only involves the scaling factors rather than the entire weight matrix, BinaryMoS maintains compression efficiency similar to traditional static binarization methods. Our experimental results reveal that BinaryMoS surpasses conventional binarization techniques in various natural language processing tasks and even outperforms 2-bit quantization methods, all while maintaining similar model size to static binarization techniques.
would be interesting to see how nemotron would perform after having this applied to it

Anonymous
06/19/24(Wed)06:50:42 No.101049797

Anonymous 06/19/24(Wed)06:50:42 No.101049797

>>101049684
2 more papers down the line

Anonymous
06/19/24(Wed)06:54:33 No.101049833

Anonymous 06/19/24(Wed)06:54:33 No.101049833

File: Untitled.png (403 KB, 1080x1377)

403 KB PNG

TroL: Traversal of Layers for Large Language and Vision Models
https://arxiv.org/abs/2406.12246
>Large language and vision models (LLVMs) have been driven by the generalization power of large language models (LLMs) and the advent of visual instruction tuning. Along with scaling them up directly, these models enable LLVMs to showcase powerful vision language (VL) performances by covering diverse tasks via natural language instructions. However, existing open-source LLVMs that perform comparably to closed-source LLVMs such as GPT-4V are often considered too large (e.g., 26B, 34B, and 110B parameters), having a larger number of layers. These large models demand costly, high-end resources for both training and inference. To address this issue, we present a new efficient LLVM family with 1.8B, 3.8B, and 7B LLM model sizes, Traversal of Layers (TroL), which enables the reuse of layers in a token-wise manner. This layer traversing technique simulates the effect of looking back and retracing the answering stream while increasing the number of forward propagation layers without physically adding more layers. We demonstrate that TroL employs a simple layer traversing approach yet efficiently outperforms the open-source LLVMs with larger model sizes and rivals the performances of the closed-source LLVMs with substantial sizes.
https://github.com/ByungKwanLee/TroL
https://huggingface.co/BK-Lee
https://huggingface.co/spaces/BK-Lee/TroL
code and models are up as well as a demo space. some OCR tests has the 3.8B version well outcompete the 7B so not sure what is up with that. also used qlora so switching to qdora should be a decent upgrade just from that.

Anonymous
06/19/24(Wed)06:56:18 No.101049849

Anonymous 06/19/24(Wed)06:56:18 No.101049849

>>101049838
>>101049838
>>101049838

[Return] [Catalog] [Top]

Post a Reply

Return Catalog Top Refresh

[Advertise on 4chan]

Delete Post: [File Only] Style:

[Disable Mobile View / Use Desktop Site]

[Enable Mobile View / Use Mobile Site]

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.