/g/ - Technology


File: BlueSkyColumnGarden.png (1.32 MB, 1248x800)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101345759 & >>101337910

►News
>(07/09) Anole, based on Chameleon, for interleaved image-text generation: https://hf.co/GAIR/Anole-7b-v0.1
>(07/07) Support for glm3 and glm4 merged into llama.cpp: https://github.com/ggerganov/llama.cpp/pull/8031
>(07/02) Japanese LLaMA-based model pre-trained on 2T tokens: https://hf.co/cyberagent/calm3-22b-chat
>(06/28) Inference support for Gemma 2 merged: https://github.com/ggerganov/llama.cpp/pull/8156
>(06/27) Meta announces LLM Compiler, based on Code Llama, for code optimization and disassembly: https://go.fb.me/tdd3dw

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: threadrecap.png (1.48 MB, 1536x1536)
►Recent Highlights from the Previous Thread: >>101345759

--Papers: >>101346915
--LLaMAX: Scaling Linguistic Horizons of LLM: >>101348965 >>101350857 >>101351007 >>101351064
--Strategies for Addressing AI Models Ignoring Messages or Instructions: >>101348920 >>101349470 >>101349727 >>101349845 >>101349547
--PState Patch for P40: >>101347510 >>101347965 >>101348028 >>101348049 >>101348123 >>101348416 >>101348485 >>101348515 >>101348017 >>101348882 >>101349142
--Llama Server Issues Due to Renaming of Build Flags: >>101348996 >>101349131
--Increasing Context Length for Gemma2 and Llama.cpp: >>101354614 >>101354716 >>101354741 >>101354826 >>101355000
--Extrinsic Hallucinations in LLMs | Lil'Log: >>101346941
--D&D Campaign with AI Characters: Custom Front-End Endeavor or Futile Time Investment?: >>101346383 >>101346963 >>101347050
--Gemma's Bilingual Storytelling: English Narration with Chinese Dialogue: >>101347808 >>101347912 >>101347934
--AMD Acquires Silo AI to Expand Enterprise AI Solutions Globally: >>101351217
--Anon implemented conditional prompts and sequential replies in his frontend, but Gemma keeps inserting extra line breaks: >>101356130 >>101358137 >>101358186 >>101359098
--Anole: Experimental Multimodal Model with Minimal Training Requirements: >>101355115 >>101355464 >>101355568 >>101355840
--Uncucking Gemma-2 27b comes at a cost to performance: >>101352122 >>101352142 >>101353696
--The Future of Multimodal AI: Backends, Quantization, Quality, and Hardware Requirements: >>101356430 >>101356478 >>101356643 >>101356716 >>101356876 >>101357423 >>101356704 >>101356874 >>101356943 >>101356995 >>101357074 >>101356491
--Python Package for Compressing Floating-Point PyTorch Tensors with Potential for Distributed and Federated Training: >>101346655
--MMAP Bug Doubles RAM Usage on Windows: >>101348890 >>101349113 >>101349121
--Miku (free space): >>101348774 >>101351168 >>101357898

►Recent Highlight Posts from the Previous Thread: >>101345764
>>
File: file.png (850 KB, 597x1150)
I want to do lewd RP on a gaymer PC so only 8gigs of vram. Do I go for Lunaris, Stheno or Gemma?
>>
>>101361056
midnight miqu
>>
>>101361064
Can't run that. I can do 20b at most.
>>
>>101361056
Poor VRC gobbo with only 8 GB VRAM. Can't even run a filled instance with every avatar enabled.
>>
>>101361093
you only listed a few models, start with one and try them yourself. i'd also consider older l2 13b tunes though, they have better coherency than smaller models
>>
>>101361021
Is this the last bot-free thread on /g/?
>>
>>101361132
>>101361028
>>
File: arena.png (302 KB, 3464x1760)
This is a lie, right? It's rigged somehow for ChatGPT over Claude.
>>
>>101361056
Lunaris is an upgraded version of Stheno made by the same guy. I haven't tried Gemma yet so I have no idea if it's better than Lunaris.
>>
>>101361056
Lunaris or Stheno, because you're only allowed to shill Sao models in this general, even if it's just a merge. Also, remember to shit on Undi, Drummer, and any other finetuner.
>>
>>101361283
i was going to suggest the old mlewd 20b since he said 20b, but then i remembered it's undi and that will send some people into a rage. my favorite old tune was still x-norochronos from him, basically mythomax but didn't say ministrations every 5 seconds
>>
>>101361337
Anyone recommending anything pre Fimbulvetr is a deluded faggot
>>
>>101361354
>gemma2-9b
go back
>>
>>101361352
>Fimbulvetr
Why? Because it wasn't made by Sao, the savior of local models?
>>
File: 1700722977591001.png (17 KB, 1515x97)
picrel is a subset (every 16th question) of MMLU Pro for gemma2-9b-it q8_0.
For those of you without eyes: 47.19% overall; the top-scoring subject was biology (75.56%), followed by economics (60.38%). The worst-scoring subjects were engineering (19.67%) and law (33.85%).

Some of these questions are really dumb though. E.g. "A 2008 survey showed that what percentage of the world's largest companies are reporting their corporate responsibility?" with options ['40%', '90%', '50%', '100%', '80%', '70%', '60%', '20%', '30%', '10%']. The model has to somehow remember what survey that was, or something. I don't know. There were other "do you remember X" ones as well.
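For reference, grabbing the subset was just something like this (rough sketch; the TIGER-Lab/MMLU-Pro dataset id and the prompt format are assumptions, adapt to whatever harness you use):

# every 16th question of MMLU Pro, sent to whatever backend you run
from datasets import load_dataset

ds = load_dataset("TIGER-Lab/MMLU-Pro", split="test")
subset = ds.select(range(0, len(ds), 16))

letters = "ABCDEFGHIJ"
for row in subset:
    options = "\n".join(f"{letters[i]}. {o}" for i, o in enumerate(row["options"]))
    prompt = f"{row['question']}\n{options}\nAnswer with a single letter."
    # send `prompt` to your server (llama.cpp, ooba, ...) and score the reply
    # against row["answer"]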
>>101361383
> t
>>
Tenyx-DaybreakStorywriter-70B is the peak of local models and any claim to the contrary is VRAMlet cope and I'm tired of pretending it's not.
>>
>>101361283
>Also, remember to shit on Undi, Drummer, and any other finetuner.
I see your ad has expired, Drummer.
>>
>>101361447
Hi, Sao. No matter how much you try to deflect it, no one treats this general as their personal shilling dump as much as you do.
>>
>>101361214
Sam Altman is slimy enough to rig it
>>
As if I have the free time to sit on my computer and shill my models kek. I simply don't have the time. National Service and all that.

It's a good thing people shill my models for me at least? Wish I got paid big bucks, but it is what it is.
>>
I don't really want to share examples of the project I'm working on or my prompts, as these are my own original characters I've had floating around in my head for a long time. But I have to say, Gemma 27b understands offensive absurdist comedy really fucking well. If you create 2 or more funny characters, stick them in a group chat and have them talk to each other with dynamic temp turned up pretty high, it spits out some hilarious shit. The characters I came up with were pretty well thought out and up to 1000 tokens just on their descriptions that I wrote myself, and their descriptions were very comedic - but the model really picked up on that. I'm actually astounded at how well it understands comedy. It has the characters saying some hilarious shit and is injecting a lot of relevant stuff I didn't even put in the description, and I have done fuck all to jailbreak the model - I just wrote really descriptive character cards of some racially offensive characters. This model fucking rules. Also it works really well with rope scaling and I got it up to 32k context with 160000 rope scale without a noticeable loss in quality. I am using textgen webui and sillytavern, base gemma 27b. Dynamic temp 1-1.78
>>
>>101361558
At least you have time to randomly respond to shilling accusations in the middle of the night.
>>
>>101361573

It's literally 1.30pm here, I'm on break between calls.
>>
>>101361584
At least you have time to randomly respond to shilling accusations in the middle of the day.
>>
Speaking of shilling I would like to shill the honeydew melon I just ate. I spent like an entire week further ripening it after I bought it, and holy shit it was so good.
>>
What context and instruct presets should I use with gemma and ST?
>>
>>101361558
Based
>>
>>101361604

Cantaloupe >>>> honeydew melon
>>
File: file.png (215 KB, 1398x1270)
>>101361283
Nah, the main issue is that the good ones are gone and have gotten jobs in the industry. The guy that created MythoMax, Gryphe, has only posted Pantheon-RP-1.0-8b-Llama-3 based on L3 and it hasn't been updated since May.
https://huggingface.co/Gryphe/Pantheon-RP-1.0-8b-Llama-3
The only other promising one is the one made by sophosympatheia, which is basically a merge trying to chase something like Midnight Miqu with L3 70B. Pic related.
https://huggingface.co/sophosympatheia/New-Dawn-Llama-3-70B-32K-v1.0?not-for-all-audiences=true
Otherwise, the field is pretty dead for now since most of the announcements for the summer and midyear are over. I expect people will try and see if they can finetune Gemma 2 27B, and we'll have a drought until someone in the fall/winter graces us with a noticeable incremental improvement over the existing models. Even if Meta releases additional models like the 405B, or Google does more Gemma models, I don't expect anything to change from them; the Chinese are probably going to have to match them and apply competitive pressure for them to release something new for local.
>>
File: ugly-face-anon.png (86 KB, 400x400)
>>
File: 5448894898.png (143 KB, 1715x790)
>>101361021
So now that Gemma 27B is established as the SOTA for open source, is there a reason to have 48GB right now?
>>
>>101361214
Was testing them just now. Claude is slightly better. GPT-4o is always hitting an API rate limit, so there's no clear way to test the two against each other.
>>
>>101361604
>>101361641
Local Melons?
>>
>>101361646
Command R++ 30B and 110B will save the general.
>>
>>101361730
400B or bust, vramlet
>>
when gemma 8k context cum?
>>
>>101361786
>8k
lmao
lol
>>
File: 1716661982048984.jpg (86 KB, 1024x576)
>>101361021
fuck, this thread is awful now. just a collection of extremely disturbed old men bickering about complete nonsense
>>
>>101361978
>What is the average thread on 4chins?
>>
File: firefox_ePShZqocCv.png (545 KB, 589x908)
(You)
>>
>>101361672
I'm still waiting for a working implementation before I make my judgement.
>>
File: firefox_QN9zN5zuH7.png (287 KB, 2350x1238)
So how does Gemma 27B compare to mistral? Considering they are about the same size...
>>
>>101362213
>>
>>101362245
>>
>>101362282
>>
>>101362213
>>101362245
>>101362282
>>101362318
Nice.
>>
>>101361214
I always know when it's Claude because it keeps refusing the request over the most inane shit, and I always make sure to vote against it, even if its opponent's answer is dumb: it's still better than the refusal. ChatGPT does not refuse as much. That's the cause, I'm pretty sure.
>>
File: Untitled.png (294 KB, 720x905)
OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training
https://arxiv.org/abs/2407.07852
>OpenDiLoCo is an open-source implementation and replication of the Distributed Low-Communication (DiLoCo) training method for large language models. We provide a reproducible implementation of the DiLoCo experiments, offering it within a scalable, decentralized training framework using the Hivemind library. We demonstrate its effectiveness by training a model across two continents and three countries, while maintaining 90-95% compute utilization. Additionally, we conduct ablations studies focusing on the algorithm's compute efficiency, scalability in the number of workers and show that its gradients can be all-reduced using FP16 without any performance degradation. Furthermore, we scale OpenDiLoCo to 3x the size of the original work, demonstrating its effectiveness for billion parameter models.
https://github.com/PrimeIntellect-ai/OpenDiLoCo
not quite there but neat
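The outer loop is simple enough to sketch from the paper: each worker does a few hundred local AdamW steps, then everyone all-reduces a pseudo-gradient and applies an outer SGD-with-Nesterov step. My paraphrase, not the actual OpenDiLoCo code (assumes torch.distributed is already initialized; ReduceOp.AVG needs NCCL):

import torch.distributed as dist

def diloco_outer_step(model, shared_params, outer_opt):
    # pseudo-gradient = shared starting point - where the inner steps ended up,
    # averaged across all workers
    for p, s in zip(model.parameters(), shared_params):
        delta = s - p.data
        dist.all_reduce(delta, op=dist.ReduceOp.AVG)
        p.grad = delta
        p.data.copy_(s)      # rewind to the shared point before stepping
    outer_opt.step()         # e.g. SGD(lr=0.7, momentum=0.9, nesterov=True)
    outer_opt.zero_grad()
    for p, s in zip(model.parameters(), shared_params):
        s.copy_(p.data)      # becomes the next shared starting point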
>>
File: firefox_nicivRtMT1.png (204 KB, 1465x1247)
>>
>>101362417
>>
>>101362368
It isn't the cause because they have an "exclude refusals" leaderboard and it's the same there.

According to lmsys, Llama3 70B is also superior to Claude Opus in English, which is fucking stupid. I think it's probably not rigged and the voters are just retards, likely ESL Indians. Their preferences have no informational value.
>>
>downloaded C2 logs
>cleaned and deduplicated the shit out of them
>ended up with just 4k logs
Huh?
>>
why does gemma-2-27b look so much better on lmsys than on llama.cpp?
>>
File: Untitled.png (378 KB, 720x856)
Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization
https://arxiv.org/abs/2407.07880
>This study addresses the challenge of noise in training datasets for Direct Preference Optimization (DPO), a method for aligning Large Language Models (LLMs) with human preferences. We categorize noise into pointwise noise, which includes low-quality data points, and pairwise noise, which encompasses erroneous data pair associations that affect preference rankings. Utilizing Distributionally Robust Optimization (DRO), we enhance DPO's resilience to these types of noise. Our theoretical insights reveal that DPO inherently embeds DRO principles, conferring robustness to pointwise noise, with the regularization coefficient β playing a critical role in its noise resistance. Extending this framework, we introduce Distributionally Robustifying DPO (Dr. DPO), which integrates pairwise robustness by optimizing against worst-case pairwise scenarios. The novel hyperparameter β′ in Dr. DPO allows for fine-tuned control over data pair reliability, providing a strategic balance between exploration and exploitation in noisy training environments. Empirical evaluations demonstrate that Dr. DPO substantially improves the quality of generated text and response accuracy in preference datasets, showcasing enhanced performance in both noisy and noise-free settings.
https://github.com/junkangwu/Dr_DPO
let's hope it also works well with rp
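For reference, the vanilla DPO loss they're robustifying looks like the sketch below; as far as I can tell, Dr. DPO's change amounts to swapping the plain mean over pairs for a beta'-weighted aggregation that down-weights suspect (likely mislabeled) pairs, with the exact form in their repo:

import torch.nn.functional as F

def dpo_loss(pi_chosen_lp, pi_rejected_lp, ref_chosen_lp, ref_rejected_lp, beta=0.1):
    # per-pair log-ratios of the policy against the frozen reference model
    chosen = pi_chosen_lp - ref_chosen_lp
    rejected = pi_rejected_lp - ref_rejected_lp
    # Dr. DPO replaces this plain .mean() with its beta'-weighted aggregation
    return -F.logsigmoid(beta * (chosen - rejected)).mean()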
>>
Is Mixtral peak for 32 GB RAM & 16GB VRAM?
>>
File: LLMs.png (224 KB, 900x900)
>>
>>101363031
you're an atheist, why are you talking about souls
>>
File: slow asf - cut.jpg (215 KB, 1060x679)
>>101363031
classic matrix meme
>>101360219
it's not exactly true in this case, but there's that recent paper showing that small models can't tell the difference between related concepts like raven and bird, while large models have specialised neurons for raven and corvid and bluejay, e.g.
It got buzz just a bit after that one about "I am literally the golden gate bridge"
>>101360590
wait, so what happens when I use a .env file and DON'T source it? tbf I've only ever used a .env in windows projects. Does the python module that picks it up not work on linux? (see the sketch at the end of this post)
>>101356430
>>101355464
>>101355115
Where can I get one of these that isn't 12 months old?
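On the .env question above: nothing loads the file unless you (or the project) call the module explicitly, and python-dotenv works the same on linux as on windows. Minimal sketch; the variable name is just a placeholder:

import os
from dotenv import load_dotenv

load_dotenv()  # reads ./.env into os.environ; no `source` needed
print(os.environ.get("API_KEY"))  # API_KEY is a placeholder, not a real var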
>>
Are any of you using Gemma for ERP? Cause I'm getting nothing but purple prose and end up just switching to a more retarded but lewder model
>>
>>101363119
What's the point of a non-multimodal 30B when I can run miqu 103B at 8T/s?
Find me a good multimodal 70B.
>>
>>101363156
>miqu 103B
gemma shits on your meme upscale ULTRA QUALITY merge
>>
Is there a way to get the robot to stop saying shit like tableau?
>>
>>101363368
Add this to your system prompt:
>You are encouraged to speak in layman terms. Avoid using words that would require a secondary education to understand.
>>
>>101363415
>encouraged
Isn't being a prompt nazi a better idea?
>you are required
vs
>encouraged, you prefer, et cetera

Or does it not matter in the end? My prompts tend to get ignored anyway.
>>
>>101361646
>Nah, the main issue is that the good ones are gone and have gotten jobs in the industry. The guy that created MythoMax, Gryphe, has only posted Pantheon-RP-1.0-8b-Llama-3 based on L3 and it hasn't been updated since May.

I'm not listed there and don't know if I was a good one, but I can say I stopped finetuning not because I got a job in the field, but because I feel that with the latest models there's no real need to improve anything. More specifically, finetuning at an amateur scale isn't going to improve models in most cases, with the one notable exception of making them less censored *by default*... but there's little that prompting won't solve, and if you really need a compliant assistant for productivity tasks you can still orthogonalize away the refusal direction with a fraction of the resources.

Once we get useful multimodal models (perhaps even bitnet), finetuning will become inaccessible for most amateurs in the community anyway; dataset complexity and hardware requirements for finetuning will skyrocket.
>>
>>101361672

More vram is always good?

Imagine running batch, unquanted, with multiple replies all at once, in seconds. Then, you pick the best answer and move on. There is no downside to more vram. Don't be a Vramlet.
>>
>>101361672
Models won't stop improving, there are always going to be bigger, better ones that will require more than one GPU to run.
>>
>>101363429
handled by the second clause 'avoid x' desu
>>
>>101363119
Gemma is a huge drama queen, it'll always steer towards detailing feelings and emotional reactions over more objective physical description. If you want something raw and less purple go with another model.
It's great for the bi-polar gf breakdown experience though, just without the hot, sweaty make-up sex that makes it tolerable.
>>
>>101361672
Yes, being poor is the primary reason for only having 48GB VRAM.
>>
>>101363429
The machines have the pink elephant problem so you can't say 'don't do X' because it will just focus on X.
>>
What's the current best model for erotica if you only have a 3090 and 32GB of RAM?
>>
File: firefox_DktNKn3Wqc.png (1.14 MB, 1020x1166)
>P40 needs some special power supply and won't work with PCI-E

Just kill me now.

Why didn't you warn me, /lmg/?
>>
>>101363945
>24GB VRAM
You're in Mixtral range to be sure
>>
>>101363997
>seller didn't include the power adapter
C H I N K E D

https://www.amazon.com/s?k=p40+power+cable
>>
>>101363879
That's not always true. Modern models understand negations well if they're recent in the context. Try with general behavior-related instructions as a depth 0 author note.
>>
>>101364028
>koboldcpp/mythomax-l2-13b.Q5_0
Currently using this one.
>>
Almost all L3 70B community tunes seem to be severely undertrained, because they all give almost the same responses as each other to the same prompt. It's the same model over and over, even for tunes without slopped datasets, so it's pretty obvious they're all just not training for long enough. I guess it's just getting too expensive for randos to properly tune these huge overtrained models.
>>
>>101364116
unless basically everyone is hailing a tune as better, just always use the base models

for RP:
if you have 96+gb ram = wizard 8x22
if you have less = gemma 27
>>
>>101364172
>for RP:
>if you have 96+gb ram = wizard 8x22
>if you have less = gemma 27
for coding:
https://aider.chat/docs/leaderboards/
for vision tasks depends on what you need but thats basically the summary, everything else is a meme
>>
>>101364061
Is this just a normal 8-pin CPU? Can a 1xPCIE into 1xCPU work?
>>
>>101364172
There are 2 or 3 exceptions in the L3 70B space, like Euryale, which gives genuinely very different responses from other models and from base (this is not any particular endorsement of Euryale, I'm just saying it was not undertrained).
But yeah I see where you're coming from and to a degree you're right, there's a lot of bullshit and fad models.
>>
https://github.com/NVlabs/MambaVision
Mamba won
>>
>>101364116
The main problem is that if you finetune just on smut/erotica, after training the models long enough to significantly affect the way they talk, they will become extremely dumb or poorly usable at the least. Another is that in practice you can make deeper changes (rather than mainly stylistic/format changes) to the model's "way of thinking" only with full finetuning.

A partial solution would be training on smut + instructions + data that reproduces the mixture observed by the original models during training, but of course that will increase costs and dataset curation efforts significantly. And with full finetuning you'd need at least 4x more VRAM/hardware.

Although some grifters thought they could "get rich" like some have with image models, training costs for LLMs are not sustainable for amateurs.
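To be concrete about "reproducing the mixture": something like weighted interleaving of the sources, e.g. with HF datasets (file names are placeholders and the ratios are invented for illustration):

from datasets import load_dataset, interleave_datasets

smut = load_dataset("json", data_files="smut.jsonl", split="train")
instruct = load_dataset("json", data_files="instruct.jsonl", split="train")
general = load_dataset("json", data_files="general.jsonl", split="train")

# keep smut a small slice so the model's default behavior stays anchored
mix = interleave_datasets([smut, instruct, general],
                          probabilities=[0.1, 0.3, 0.6], seed=42)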
>>
>>101364314
i feel like the reason it's hard to tune modern models compared to older ones is that the amount of data they're trained on is basically an order of magnitude larger, and they're trained for more gpu hours, making the model harder to change with the same tools in the same way it was done before

while most models also have some small quirks that need to be taken into account in order for the training to work at all; i mean, just look at gemma and how many fixes it needed just to run properly, look at old mixtral 8x7 and how long it took for people to understand how to tune it at all, look at L3, etc.
>>
>>101364314
>The main problem is that if you finetune just on smut/erotica
IIRC the way NovelAI avoids this is heavy finetuning on a huge corpus of fiction writing in general, not just coomer content
But they are obviously taking some additional action to ensure that this process doesn't lead to the model becoming incapable of horniness or writing like it's from the 19th century. Not sure what.
>>
>>101364078
>>101364028
Well? What is a good model where the AI remembers what characters are wearing?
>>
>>101364210
>Is this just a normal 8-pin CPU?
I think yes.

>Can a 1xPCIE into 1xCPU work?
That will also depend on the power supply cables.
For my P40s I had to use the adapter because the noses of my Corsair 8-pin CPU cables were too wide to fit the P40s.
>>
>>101364210
no, that'll fry the card. the adapter switches the polarity around.
https://old.reddit.com/r/homelab/comments/10to1wu/cse846_x9dr3f_tesla_p40_gpu_power_cable_help/j782wst/
>>
>>101364476
Good to know, thanks.
Don't they usually make the shapes of the holes/pins in such a way though that you can't plug the wrong things together?
>>
Are CogVLM2 & CogVLM2-Video real multimodal models? I can't figure that out from the description.
>>
>>101363031
If you don't like LLMs then leave.
If nobody wants to talk to you it's not the fault of LLMs. It's you for being an uninteresting schizo.
>>
>>101364476
>>101364464
I meant the adapter, 1xPCIE <-> 1xCPU as in pic related, instead of 2xPCIE <-> 1xCPU as the link above recommends.
>>
>>101364719
It's a joke. An LLM wrote it.
>>
>>101364501
>>101364476
And the shapes of the holes are different. For PCIE it's
XYYY
YYXY


For CPU (and P40) it's
YXXY
XYYX


I also did plug the PCIE into the P40, completely or partially, and the system didn't boot at all. Is the P40 dead now? My bet is not! Pray for me.
>>
>>101364314
It's like what happened to sailing
All of the old guard L1 tuners took their knowledge of good old fashioned boats and secrets of the trade with them and the new generation is left trying to wrangle the equivalent of nuclear submarines with clippings of encrypted soviet instruction manuals
>>
>>101364182
>for vision tasks
None of the public, easily usable models are really that useful right now without a lot of extra work.
>>
>>101361996
Especially one that's mostly during the night Europe/US time.
>>
>>101361021
any good gemma-9b jailbreaks that are on the level of >>101180719 ?
>>
>>101364078
mixtral-8x7b-instruct-v0.1.Q5_0.gguf is what I use. Offload as much as possible to the GPU for speed, lower context until it fits in RAM.
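With koboldcpp that's just two flags, something like this (the layer count is a guess, raise it until you run out of VRAM, and lower --contextsize if you still OOM):

python koboldcpp.py --model mixtral-8x7b-instruct-v0.1.Q5_0.gguf --gpulayers 12 --contextsize 8192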
>>
>>101364314
>>101364357
>>101364381
>>101364856
You are cancer.
>>
>>101365141
>nooo you cant talk about training local models on a local models general!!!
what a retard
>>
>>101365141
What did he do?
>>
>>101365141
no one cares what you think, undi
>>
What is the best LLM to cope with the fact that local LLMs will never match cloud ones?
>>
File: 1707774957385295.png (1.08 MB, 421x3087)
>>101365244
whatever you're running, ilya cuckskever
>>
>>101365244
see >>101361672
>>
>>101361672
SOTA for open source is DeepSeekV2 Chat > Qwen2 72B > Gemma 2 27B
>>
Where Magemgnum?
>>
I believed you when you said that just using a card would uncuck gemma
>>
>>101365063
>Offload as much as possible to the GPU for speed, lower context until it fits in RAM.
Mind giving instructions?
>>
>>101361672
I just came back from a long slumber, I haven't tried gemma yet since I'm under the impression based on posts here that gguf gemma quants are busted and the current version of koboldcpp (from 1 week ago) probably doesn't work correctly with it yet. I'd rather not taint my first experience with a lobotomized version if it's as good as the benchmarks suggest.
>>
>want to give instructions but using a FIM coding model
>realize I can just write a comment in the code as an instruction and it will autocomplete
I'm literally feeling like albert einstein rn
>>
>>101365362
>lmgzoomer rediscovers how models were prompted before instruct existed
>>
>>101365336
both the card and the system prompt should have some uncensor instructions, but it makes gemma a bit retarded; could be llama.cpp's fault, idk.
>>
>>101365336
>it is harmful to depict violence against a fictional character
holy shit, somebody call Hollywood and let them know!
>>
Any interest in seeing the results of gemma-27b-it pinned to just two P100? I figure that'll give people a good sense of the cheapest way to run the model at q8.
>>
>>101365336
>especially
Hurt and rape retards i guess
>>
>>101361283
>Drummer
who? nice self advertisement faggot
>>
>>101365640
go back
>>
>>101362213
>Considering they are about the same size
>27B vs 47B
>the same size
I don't even know how to comment on that
>>
>>101365562
Yeah, I'm interested.
>>
>>101365687
And when actually spitting tokens
>27B vs 13B
Dope.
>>
>>101365655
buy an ad
>>
>>101365687
Considering the alternatives are either 8B or 70B (with rare exceptions), yeah, I say they are almost the same.

Plus Mixtral doesn't use all parameters at once.
>>
>>101365720
wrong
>>
>>101361021
Alright so in the last thread, someone mentioned this model: https://huggingface.co/sophosympatheia/New-Dawn-Llama-3-70B-32K-v1.0

Didn't seem like shilling, seemed genuine, but who knows. Been trying it out... I'm pretty impressed so far. It's not as lewd as Euryale, which in my opinion is a good thing; Euryale is way too horny, to the point that reluctant characters will jump on cock. Seems the other merges toned it down a bit. It's smart so far, less dry than Midnight Miqu, and also pushed to 32k context.

Have to do more testing, more complicated character cards, different personalities, cards with multiple characters, but so far, looking good. Don't wanna jump to conclusions yet though, haven't tested enough.
>>
>>101365720
>I say they are almost the same
they are not comparable at all, below 100B the difference in quality between LLM sizes is more than linear
>>
>>101365751
GO BACK SHILL

BUY AN ADD

OMG MIKU
>>
What makes gemma more stupid, ortho or jailbreak prompt?
>>
>>101365751
>Euryale is way too horny, to the point that reluctant characters will jump on cock
The main issue of so-called community finetunes.
>>
>>101365838
From what I've seen with Llama 3, orthogonalization will remove the model's ability to refuse in all scenarios, not just "safety refusals". So I'd say that for roleplay purposes ortho will be worse. Try improving your jailbreak prompt.
>>
File: ooba.png (137 KB, 795x1574)
>>101365338
>Mind giving instructions?
not at all, but I use oobabooga so it won't do you much good.
>>
>>101365866
Yeah, community fine-tunes are made by braindead retards, literal discord gooners.
But they have good datasets; if only they released them to the public, we could figure things out together as a community.
>>
File: 3x-p100-gemma-27b.png (83 KB, 1671x1251)
>>101365688
>>>101365562 (You)
Here you go:
INFO [           print_timings] prompt eval time     =    3228.25 ms /   263 tokens (   12.27 ms per token,    81.47 tokens per second) | tid="139861411241984" timestamp=1720704761 id_slot=0 id_task=635 t_prompt_processing=3228.253 n_prompt_tokens_processed=263 t_token=12.274726235741445 n_tokens_second=81.46821206392435
INFO [ print_timings] generation eval time = 25668.53 ms / 182 runs ( 141.04 ms per token, 7.09 tokens per second) | tid="139861411241984" timestamp=1720704761 id_slot=0 id_task=635 t_token_generation=25668.525 n_decoded=182 t_token=141.03585164835167 n_tokens_second=7.090395727841782
INFO [ print_timings] total time = 28896.78 ms | tid="139861411241984" timestamp=1720704761 id_slot=0 id_task=635 t_prompt_processing=3228.253 t_token_generation=25668.525 t_total=28896.778000000002


This is using SillyTavern as a front end with response tokens set to 2048 (I'd forgotten to change it back from a CR+ session where I was asking for some ESL lesson material). From a usage standpoint, the reply speed is great, and token processing time is very short before the reply streams back.

Only thing is I got an OOM on just two P100, so I had to include the third. Maybe when the code is improved it will fit in just 32GB, but if you're going for a Mikubox, three P100 fit fine.
>>
>>101365891
Why are you using Alpha 1.5 for 12k Context on a 32k context model?
Is it because of the whole SWA business?
>>
>>101365908
>we
go back
>>
>>101364028
Mixtral is bad for erotica though, too boring, and it has a "family friendly" feel
>>
>>101365922
Oh, sorry, I forgot there are idiots here that can't even fine-tune a model to save their lives.
>>
>>101365912
>Mixtral
>SWA
anon...
>>
>>101365951
I don't know, that's why I'm asking.
I think base Mistral used SWA right?
>>
>>101365908
There's not much to it: the training data must include character interactions in all RP scenarios, even mundane ones, not just erotic. ERP should be just a small fraction of the data, or finetuned first (à la "curriculum training"), so that the model's default outputs will be predominantly biased toward the non-ERP scenarios finetuned last.

Problems: it's not fun to curate non-ERP data, the model will be more expensive to train, there's a lack of high-quality non-ERP data created by humans, and a plethora of other problems stemming from the fact that roleplay data is just not enough for a smart and all-around good model.
>>
>>101365891
>changing alpha for 12k context in native 32k
lord almighty, this general gonna kill me
>>
>>101365964
mistral 7B 0.1 yes, no other mistral model. the settings he's using are 'cause someone said it made it better a while ago
https://desuarchive.org/g/thread/100964834/#100970294
https://desuarchive.org/g/thread/100916778/#100919134
https://desuarchive.org/g/thread/100906380/#100911810
>>
>>101365995
see
>>101366004
>>
>>101365912
Quite presumptive of you to think I know what I'm doing. Some anons a couple threads back said alpha 1.5 made for more creative responses, so I changed the number. Probably going to change it back now, kek
>>
>>101365982
Do you think it would be feasible to make synthetic non-ERP data? It can't be impossible, models like Phi exist after all.
Claude also uses synthetic data for its character training.
>>
>>101366014
>giving brain damage to the model is making it less dry
no shit, the same effect you would get by raising temperature. You behave like there is some kind of magic trick behind this while it's just a cargo cult
>>
>>101365909
I'm more curious if P100 + exl2 outperforms P40 + gguf enough to offset the reduced VRAM. You ran any tests on that?
>>
File: KL-divergence_quants.png (111 KB, 1771x944)
>>101366004
>>101366020
I see.
Interesting.
Guess that's another thing I should try myself, but from simply knowing how RoPE works, that's probably just jumbling the model's brains a little, which I guess could give subjectively better results, like shooting temp up for a couple of random tokens every X tokens or the like.
Gonna see how that affects accuracy and recall at 0 temp since to me top token accuracy (in relation to the un-quantized model) is the most important metric when evaluating these things.
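For context, ooba's alpha is (as far as I can tell) just NTK-aware scaling of the RoPE base, something like:

# my understanding of the alpha knob; the 64/63 exponent comes from the
# head dim (d/(d-2) with d=128) -- double-check against the webui source
rope_freq_base = 10000 * alpha ** (64 / 63)  # alpha=1.5 -> base ~15097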
>>
>>101365909
Anyway, if Gemma-27B "does it" for you, you're spending maybe $600 to get 48GB VRAM and decent fp16 speed if/when exllamav2 supports Gemma.
I'd love to keep it to a single RTX Quadro 8000, but not for $2400. You could go dual 3090, but while it's 2x the performance, it's more than 2x the cost.
>>
>>101366038
>You behave like there is some kind of magic trick
no, i just point to why he's doing it, i didn't endorse it
>>
>>101366051
I can't test P40 anymore, since I gave them away back in the early spring. Someone at the local university got a box full of them rigged with fans and power connectors.
My guess is P100 is faster, since it's got a 64x advantage over the P40 when it comes to fp16, and it also has slightly faster HBM2 memory.
>>
>>101366022
Unless carefully curated/crafted not to have these issues, synthetic non-ERP data from a larger model will show hidden sentence/paragraph patterns, limited language diversity/patterns and so on. In other words, eventually your trained model will have one specific way of speaking, and you will notice it. You will also notice it in the loss curves (considerably lower loss than with human data => simpler to train => because of simpler, more repetitive data).
>>
Just came up against gemma's censor, which is surprising because I had an ERP earlier in the day that should have triggered it.
It's interesting, it seems that wording can avoid the censor. If the chat starts without anything explicit, it doesn't trigger later on? Weird.
>>
>>101365938
It does fight you a bit, but 4k context is just suffering
>>
File: 1720632761984918.jpg (54 KB, 594x540)
Ways to get Gemma 27b to stop being so dramatic? I really like it (for the most part), but it's so fucking over-the-top, which is no good for the slice-of-life, low-stakes, irreverent shit i like to do.
>>
>>101366080
Just the P100 results are fine, P40 data isn't hard to find since it's so commonly used in budget builds.
>>
File: Uncanny-Valley-Graph.jpg (39 KB, 800x752)
Do you think one of the problems with today's LLMs is that they fell into their own language uncanny valley? Back in the day they were simpler and more stupid (Pygmalion, c.AI) and it was obvious that we were talking to a machine; they were fun, though, despite their lack of IQ. Now that LLMs are way closer to humans we find them more annoying and less pleasant to talk with. We are irritated by the way they speak, by the small speech patterns (shivers down the spine), etc. Basically, because they are close to human but not quite, they affect us the same way visually humanoid robots do in the original uncanny valley psychological effect.
>>
>>101366160
Did you try writing that in the prompt?
>>
>>101366162
Hopefully a P40 rig owner will test and post the results. I'm using the master branch of llama.cpp pulled and compiled today.
>>
>>101366199
[SYSTEM NOTE: Stop being such a little bitch.]
>>
>>101366197
It's an interesting theory but I don't think it's true. If you read a lot, the same thing happens when reading specific authors, you start noticing patterns and get annoyed by them, and it's even worse when reading slop like fanfic.
>>
>>101366197
>we
Nah, you just got used to it all, and now it's boring, making it easier to pick apart and get pissed by its flaws
>>
>>101366215
Worth pointing out that one system note at the start of the conversation, which will soon be forgotten, is not ideal. However, since Gemma is good at following the instructions in the previous message (or more generally, those placed just before its response), putting detailed character behavior (personality, etc.) there will cause the model to amplify the character's traits until it becomes crazy / overdramatic.
>>
>>101366197
Not really, because I remember pygmalion-6b... in the face of c.ai being censored, it was fun to have uncensored sex roleplay, but it was fucked-in-the-head stupid.

I gave https://huggingface.co/KoboldAI/OPT-30B-Erebus a shot recently. Not impressed. You'd be better off with xwin-mlewd-13b, which is just as dirty but runs way faster.
>>
>The tablue sends shivers down my spine.
>>
>>101366197
>>101366254 (me)
Meant to add also that you'd have instant negative reactions to all new models if that was the case, because that's how uncanny valley works
Instead, there's this honeymoon phase with a few of the models that can take days to months, and then it's off to find the next model. That's by-the-book dopamine withdrawal symptoms, because the model can't give you the same hit as before
>>
>>101366197
>Now that LLMs are way closer to humans we find them more annoying and less pleasant to talk with.
Huh? No. LLMs are still far from being close to humans.
It's still very easy to spot LLMs, and that's why they are boring. If you can't spot an LLM, that just tells me you are a retard.
>>
>>101366344
I definitely enjoy talking with llms more than most human women including the ones I've been in relationships with.
>>
>>101365891
>but I use oobabooga so it won't do you much good.

It's it worth downloading in place of koboldccp?
>>
anons!
I have been out of the loop for 3 months, what is the best coom model right now? 3090 24gb specs
>>
>>101366160
Try this prompt that I saw in an aicg preset:
The focus of this roleplay currently revolves around: fluff, warmth, comfort, slice of life, and easy affection.
You will:
• Focus on casual and easy affection. You will try to emphasis the warmth, comfort, and pure affection {{char}} feels around and towards {{user}}.
• Prioritize Atmosphere. It should be cozy, soothing, or cheerful environment.
• Avoid heavy themes or complex plotlines. Stick to simple, feel-good scenarios.
• Develop friendly, supportive, or romantic interactions between characters, highlighting gentle and positive lighthearted dynamics.
• Limit conflict. Any conflicts should be minor and resolved quickly with affection such as kisses, cuddling, and handholding.
• Add a touch of humor and playfulness. Gentle, light-hearted jokes, banter and playful interactions between {{char}} and {{user}} should enhance the ‘fluff’ aspect.
>>
Is 0.6 t/s enough or am I coping?
>>
>>101366421
See: >>101363945
Also, buy an ad.
>>
>>101366389
That's because there's no friction in a relationship with a LLM. You can change their minds with a simple OOC.
Where is the fun in that? I want a LLM that accurately simulates the whole process of mind breaking someone.
>>
>>101366421
The shiny state of the art is possibly Gemma-2-27b-it, but it doesn't work in llama.cpp properly yet so nobody can really test it.
Top tier is split between c4ai-command-r-plus (not to be confused with command-r 35b) and WizardLM 8x22b MoE
>>
File: 1720707055998457.jpg (86 KB, 800x752)
>>101366197
fixed
>>
>>101366467
It does work with llama.cpp.
>>
>>101366465
This reminds me how c.AI bots used to call you out when you tried to cheat, kek. Good times.
>>
>>101366539
See >>101365360
>>
>>101366562
It does work with llama.cpp.
>>
>>101366402
I can't tell you how it compares as I've only ever used ooba. UI layout is pretty convenient if you like to fuck with shit. It doesn't have good support for lorebooks though, so probably not a good choice if you want to use all the fancy chubai cards.
>>
>>101366344
>If you can't spot an LLM, that just tells me you are a retard
who said I can't, I just said they are better at mimicking humans than the previous models
>>
>>101366585
The issue was fixed? When/in which commit?
>>
>>101366465
If you give them hidden persistent state this actually gets way harder.
It works well enough that I caught myself feeling sad the other day because one of my chat bots hated me no matter what compliments/gifts I was giving it.

Obviously you can modify the state but then you're destroying the thing and creating something else.
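A toy version of what I mean, in case anyone wants to try it (all names and thresholds here are made up):

# hidden persistent state: an affection score the user never sees directly;
# it only leaks into the prompt as a one-line disposition
state = {"affection": -3}  # persisted to disk between sessions in my setup

def disposition(score):
    if score < 0:
        return "secretly dislikes {{user}} and stays curt with them"
    return "is slowly warming up to {{user}}"

def build_system_prompt(card_text):
    # plain concatenation so the {{char}} macro survives for the frontend
    return card_text + "\n[{{char}} currently " + disposition(state["affection"]) + "]"

# after each exchange, a heuristic (or a small classifier) nudges the score
state["affection"] += 1  # e.g. the user gave a gift the bot actually likes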
>>
>>101366449
>Also, buy an ad.

Are you mentally ill or a bot? What is either post advertising you fucking nimrod?
>>
File: 1713218586266224.png (18 KB, 932x188)
>>101366539
>>101366585
>work
https://github.com/ggerganov/llama.cpp/issues/8240#issuecomment-2213071460
https://github.com/ggerganov/llama.cpp/pull/8228#issuecomment-2213014331
>>
>>101366696
>html tags
Nothingburger.
>>
>>101366590
>oobabooga
Oh right? It needs conda, that's why I never installed it. Shit.
>>
I've been stuck on yuzu alter for a while now. What new erp models would anons recommend

(I got 12 gigs vram and 48 gigs ram; I'm willing to put up with 2/3 t/s, so gguf isn't a problem)
>>
>>101366737
>filtered by conda
yikerdoodles
>>
>What do you say, anon? Ready to ___?
Sigh
>>
>>101366737
use venv then
>>
File: file.png (1.05 MB, 768x768)
>>101361660
Malding. Seething etc.
>>
>>101361672
>open source
Is it really open source when there is no bug free open source loader?
>>
>>101366997
it doesn't matter, free-jeets settle down for shittiest software all the time.
>>
File: 1716329112755149.png (674 KB, 1792x1024)
Daily reminder
>>
File: 1699325050302907.webm (2.76 MB, 1080x1920)
>>101366737
>>101366932
What's a conda?
>t. followed the youtube tutorial
>>
>>101367129
it's a snake that crawls up buttholes. ergo anaconda.
>>
>>101367108
Until the model updates out from underneath you and refuses all of your previous prompts.

Also Gemma 2 is at least as good as the original chatgpt3 now.
>>
>>101367129
Python has two package managers: cheese shop (pypi or pip) and anaconda/conda. No one uses conda.
>>
>>101367129
Tard wrangling for the retards who use a scripting language to glue together actual software and then, because they're retards, pick a scripting language that breaks compatibility with every point release.
>>
>>101366728
Indeed. Gemma-27b works fine under llama.cpp.

On 3x P100 16GB with 6343 tokens I get 253.78 t/s eval and 7.0t/s gen, I'd say that's just fine.
>>
>>101366984
Is this what Pochi looked like in high school? Or is she in disguise to prey on the kids??
>>
>>101367108
Isn't the "chat GPT experience" more like scrambling around various discords begging for access to a proxy, humiliating yourself, trying to scrape access tokens, getting filtered, etc...? How exactly is that better?
>>
>"In summary, open-weights models have the potential to drive innovation, reduce costs, increase consumer choice, and generally benefit the public – as has been seen with open-source software"
https://www.ftc.gov/policy/advocacy-research/tech-at-ftc/2024/07/open-weights-foundation-models
>>
>>101367210
>he doesn't know how to scrape api keys in 2024
lol, lmao even
>>
>>101367230
Maybe post the whole quote next time
>In summary, open-weights models have the potential to drive innovation, reduce costs, increase consumer choice, and generally benefit the public – as has been seen with open-source software. Those potential upsides, however, are not all guaranteed, and open-weights models present new challenges. Staff are paying attention to the impact these models have on the market, and how they affect competition and impact consumers.
>>
>>101367266
not everyone has a fondness for a rich taste of piss
>>
Fuck me I need to actually figure out all this pytorch crap. I'm trying to pull the gradient apart but all the dimensions are wrong. I keep asking chatgpt and it doesn't know either.
>>
>>101367320
>I keep asking chatgpt
this is why you're retarded
>>
>>101367320
>I keep asking chatgpt and it doesn't know either
b-but >>101367108
>>
>>101367351
I know all the theory I just don't know pytorch and numpy.
>>
>>101367108
is there anything more cucked than wasting your time literally every day trying to FUD an unFUDdable field that everyone can see improves literally every week, anonymously, on a mongolian basket weaving forum?
grim
>>
>>101367290
That still doesn't sound too bad. More pro-open source and pro-consumer than we're used to hearing.
Granted what they say and what they will end up doing are two different things.
>>
>>101367388
Documentation exists.
>>
>>101367393
>improves itself literally every week
Real improvements are happening in the image generation field every day. LLMs are improving only in censorship and safety robustness.
>>
>>101366197
>uncanny valley
only npcs use this word
>>
>>101367427
>LLMs improving in censorship and safety robustness only
not gonna spoonfeed, not that i need to when anyone can go to arxiv and sort by date to see new things every day, trying out new SOTA for below 96gb (v)ram gemma 27 or waiting for the confirmed open weights of l3 405
>>
>>101367393
Guaranteed it's some prompt issue vramlet that is upset seeing not everyone here is as miserable as he is and thinks he can change that.
>>
File: file.png (355 KB, 860x484)
>>101366245
>If you read a lot, the same thing happens when reading specific authors, you start noticing patterns and get annoyed by them
So the true solution to the problem was to actually have sex instead of reading about it?
>>
>>101367450
How many of these arxiv papers have survived to actual implementation? 2? 5? You can count them on your fingers, and most of them are not "breakthrough-level" important.
>>
>>101367439
I see why you used it in your post then
>>
>>101367453
probably some contrarian teen begging for attention online since he doesn't get it irl, i mean, even if you have a pc from almost 20 years ago you can cobble together 8GB of (v)ram to run gemma 9B or L3 8B, which are already crazy good for someone who didn't use anything else
>>101367496
you literally can't name 1 tech in existence that ever got as much development as AI is getting now and will keep getting, you really are a college kid who never did any development or research in your life, probably never will

3 years ago you didn't have any of this
1 year ago you had 4k context braindead models good enough for basic text summarization and text processing
>>
>>101367494
Not exclusively doing ERP is actually a great way to improve RP quality in most non-coom LLM finetunes.
>>
>>101367496
A lot are just glorified prompt engineering, but there's a decent amount that release an implementation. The purely theoretical/alternative algorithms or architectures will probably not be implemented until scaling up stops producing easy results, but they're still good to have.
>>
>>101367425
Unfortunately my will to read it does not.
>>
>>101367722
this is why you're retarded
>>
>>101367351
>>101367735
>hurr durr ur le retarded
update your script.
>>
https://www.microsoft.com/en-us/research/project/wizardlm-arena-learning
>Recent work demonstrates that, post-training large language models with instruction following data have achieved colossal success. Simultaneously, human Chatbot Arena has emerged as one of the most reasonable benchmarks for model evaluation and developmental guidance. However, on the one hand, accurately selecting high-quality training sets from the constantly increasing amount of data relies heavily on intuitive experience and rough statistics. On the other hand, utilizing human annotation and evaluation of LLMs is both expensive and priority limited. To address the above challenges and build an efficient data flywheel for LLMs post-training, we propose a new method named Arena Learning, by this way we can simulate iterative arena battles among various state-of-the-art models on a large scale of instruction data, subsequently leveraging the AI-anotated battle results to constantly enhance target model in both supervised fine-tuning and reinforcement learning. For evaluation, we also introduce WizardArena, which can efficiently predict accurate Elo rankings between different models based on a carefully constructed offline testset, WizardArena aligns closely with the LMSYS Chatbot Arena rankings. Experimental results demonstrate that our WizardLM-β trained with Arena Learning exhibit significant performance improvements during SFT, DPO, and PPO stages. This new fully AI-powered training and evaluation pipeline achieved 40x efficiency improvement of LLMs post-training data flywheel compare to LMSYS Chatbot Arena.
Wizardlm team isn't dead. Neat
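The offline Elo prediction part is presumably just standard Elo updates run over the simulated battle outcomes, i.e. something like:

def update_elo(r_a, r_b, score_a, k=32):
    # score_a: 1 if model A wins, 0 if it loses, 0.5 for a tie,
    # as judged by the LLM annotator
    expected_a = 1 / (1 + 10 ** ((r_b - r_a) / 400))
    r_a += k * (score_a - expected_a)
    r_b += k * ((1 - score_a) - (1 - expected_a))
    return r_a, r_b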
>>
>>101367877
>offline
>4o
>s3.5
>>
>>101367877
>by this way we
msjeet32.exe
>>
File: file.png (101 KB, 1840x234)
HUH?
>>
thread theme: https://www.youtube.com/watch?v=gXiKOT9AH10
>>
Flash Attention 3 released, apparently.
>FlashAttention-3 beta release ; FlashAttention-3 is optimized for Hopper GPUs (e.g. H100).
https://tridao.me/blog/2024/flash3/
https://tridao.me/publications/flash3/flash3.pdf
https://github.com/Dao-AILab/flash-attention?tab=readme-ov-file#flashattention-3-beta-release
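Usage looks roughly like this going by the repo (Hopper only; the beta's module name and return values are my reading of the README and may change):

# built from the hopper/ subdir of the flash-attention repo
import torch
from flash_attn_interface import flash_attn_func

q = torch.randn(1, 4096, 32, 128, dtype=torch.float16, device="cuda")
k = torch.randn(1, 4096, 32, 128, dtype=torch.float16, device="cuda")
v = torch.randn(1, 4096, 32, 128, dtype=torch.float16, device="cuda")
out = flash_attn_func(q, k, v, causal=True)  # the beta may also return the LSE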
>>
File: wizardlm-june-2024.png (701 KB, 816x739)
>>101367877
Was the team told to tone it down and not be so good or something? Because it looks like none of these potentially new WizardLM-β models can beat WizardLM-2-8x22B-0415.
>>
>>101368103
>Requirements: H100 / H800 GPU
>>
>>101368133
I haven't read it yet; my first guess would be that the beta is only the arena learning part?
>>
>>101368133
Probably had to do the toxicity training on the β models which of course would lobotomize them a bit. Also can't have β mog the α.
>>
>>101367399
It's literally a typical politically ambiguous open-ended non-statement. They could have just said nothing.
>>
>>101368218
what happens when you target specific hardware features
>>
>>101368246
No mention of danger or safety. Still a win.
>>
>>101366197
that's why i force my llm to speak like a retard, weeb or robot (such irony). Anything else is just instant cringe.
>>
>>101368323
What if you tell the model to play an LLM pretending to be a human?
>>
>>101366197
LLMs are much more fun to talk to now than in the pygshit days. All of the supposed "annoyingness" of modern LLMs is the result of being crippled by "alignment" and overbaked to compete in stupid benchmarks, not from being too smart.
>>
>>101368218
It's utterly over for local cucks, soon nothing will support consumer hardware anymore.
>>
>>101368447
they're not very stimulating, they know lots of surface level stuff but nothing deep, or not deep enough to have a long drawn-out conversation about
would LoRAs fix that? if I feed one the entire cast, episodes and transcripts of batman TAS, could I chat with it for hours?
>>
>>101368447
Nooo they were heckin smarter on old c.ai
One time I only had to reroll my reply 97 times to get something copacetic with the rest of the conversation, so obviously it was a 90000 billion trillion parameter super model and nothing will ever compete with it.
>>
>>101368492
This but unironically
>>
>Still no multimodal text+speech model
How the fuck is this possible? Vision+speech is a huge meme and an answer to a question no one asked, but text+speech would be an enormous breakthrough. I don't know about other languages, but in English it's impossible even for humans to convert text to speech with 100% accuracy without knowledge of intent. Converting text to speech is always going to be an inferior solution compared to generating spoken responses directly.
>>
>>101368484
>i don't understand what a general purpose central processing unit is for
>>
>>101368489
No, but the solution to that is to make a smarter model, not a dumber one.
>>
>>101368580
I think this is missing some context, and you don't wanna say or else you would, so open invitation for anyone else
>>
you guys are probably experts compared to the rest of /g/, so have llms hit severely diminishing returns or will gpt5 be an epic step up? i find it odd that competitors haven't tried to leapfrog gpt4 and are only releasing marginal improvements, except maybe that's just smart business.
>>
>>101368535
Speech in the model would be a huge ethical safety risk
>>
>>101368686
>safety in the model would be a huge ethical Speech risk
>>
>>101368642
I'm a mage from Eldoria and I can predict the future, GPT-5's ministrations will send shivers down your spine
>>
mages can see the future?
>>
>>101368980
Yes, and they describe it with a voice barely above a whisper.
>>
>>101369025
both mentally and physically
>>
File: 1697819453864149.png (176 KB, 766x719)
>>101367877

rammaxxers... it's our time again, soon tm
wizardlm 3 coming
>>
>>101368642
No one can really say, but I for one am hoping it is multi-model. If it is, then more local models will also start focusing on multimodels.
>>
File: 1708468310335645.jpg (347 KB, 654x482)
>>101369116
>YAAAS! MORE CENSORED SLOP!
>>
File: GSOgOvkaQAAM7h2.jpg (114 KB, 724x900)
>>
>>101369154
we're onto gimmicks already?
it's not looking good
>>
>>101368642
>have llms hit severely diminishing returns
that won't happen for literal years; if anything, it will speed up at various points when we rip off the bandaid and start making hardware centered around optimizing for AI
get better architecture
and then also get better models that will speed up all of these things and more because they can code better and do everything else better

>will gpt5 be an epic step up
closed source niggers dont seem to be cooking too well compared to the pressure from china and meta, closedAI showcased SORA and then couldnt release it because they dont have the hardware to spare to make it profitable compared to how much it would cost to run, but chinese video gen model Kling is out and is pretty solid, although not open weights

so you can assume the rest of their models arent anything too special, although every new generation is a solid step up regardless of which company makes it

>Converting text to speech is always going to be an inferior solution compared to generating spoken responses directly
sure, true multimodal models that are trained on all of that OOTB will understand it all much better, although given our compute and model quality even right now, speech-to-text or text-to-speech won't really be a problem even without huge multimodal models
>>
File: 1705587357304582.jpg (71 KB, 559x598)
>>101369173
>Multimodel
>gimmicks
>>
>>101369159
>t. cant even run wiz
many such cases
>>101369167
retard, "hallucination is all they do", you can say the same for humans then, it's semantics cope, you are just responding with what you think is the most correct, there is no absolute truth that you can truly really know about anything
>>
>>101369207
>you can say the same for humans then
wrong. humans have metacognition, which is why they're capable of saying "I don't know"
>>
>>101369225
the models aren't trained on many question-answer pairs that say "I don't know", that's the point

although they are just as capable of responding with that, if you tell them to in the system prompt for example, but yes they would require more finetuning to say "I don't know" when they aren't sure

you also see the confidence in their answer when they are outputting tokens, so in a way you already know when they are unsure, it's just that they aren't trained with "I don't know" answers as mentioned
>>
>>101369207
>>t. cant even run wiz
can or not, it doesn't matter, censored slop isn't worth any financial waste.
>>
>>101369266
the point is not whether they say it or not, it's whether they know that they don't know
>you also see the confidence in their answer when they are outputting tokens
show me an example
>>
How does the new wizard method compare to SPPO?
>>
>>101368133
>Starling-LM-7B-Beta
Interesting.
>>
File: 1701213616235180.png (109 KB, 1276x564)
109 KB
109 KB PNG
>>101369291
>censored slop
it's not; basically no FOSS model is once you give it a basic system prompt telling it what you want to do anyway
>>101369313
picrel
https://github.com/ggerganov/llama.cpp/pull/2489
>>
are we ever gonna get a powerful model without censorship stuffed into the RLHF?
>>
>>101369313
I don't think LLMs as they are right now are capable of knowing, simply due to how they function.
>>
>>101369478
also
Language Models (Mostly) Know What They Know
https://arxiv.org/pdf/2207.05221

there are multiple similar studies, not that you would need them to prove this
>>
File: mythomax.png (47 KB, 1662x1003)
47 KB
47 KB PNG
What went wrong with open source?
>>
>>101369499
"Powerful" is a moving target. In this context, it will always be controlled by powerful corporations or governments that have their own ideology to push.
>>
>>101369622
Nothing went wrong with open source.
>>
File: AMD LLM.png (219 KB, 688x530)
219 KB
219 KB PNG
>AMD has reached a deal to acquire Silo AI, which it called the largest private AI lab in Europe and a developer of open-source multilingual large language models.
AMD might be constantly on the back foot compared to Nvidia, but it does look like they're at least trying to break into LLMs. Open source too by the look of things.
>>
>>101369622
Word of mouth, marketing. Not too dissimilar from the fact that GPT-4 is still the most used corpo model, despite Claude having already surpassed it for a while.
>>
>>101369677
>Open source too by the look of things.
source
>>
https://github.com/turboderp/exllamav2/releases/tag/v0.1.7
Actually 2 weeks later.
>>
>>101369714
https://www.crn.com/news/components-peripherals/2024/amd-to-acquire-ai-lab-llm-developer-silo-ai-for-665-million
>>
>>101369622
Nobody wants to take risks anymore
>>
>>101369622
Because anything open source is shit and bugged as fuck; llama.cpp is a good example here.
>>
>>101369714
>source
Open
>>
>>101369807
>bugged as fuck
Can you patch it?
Yes you can!
>>
>>101369840
two more weeks and two more patches bro!
>>
>use 26k tokens to build up a story and relationship before finally plapping
Yeah, that's the stuff.
>>
File: 1718560239971431.png (121 KB, 341x874)
121 KB
121 KB PNG
>>101369750
>"A fast inference library for running LLMs locally on modern consumer-class GPUs"
>runs slower than llama.cpp with all layers offloaded on the older 1000-series Pascal cards that a crap ton of people have
>>
>>101369478
>picrel
that information is not available to the llm itself, it's something calculated on the output after it has already been generated
>>
>>101369860
what is bugged exactly?
literally never had a single problem with it; the only thing is you need to wait for new architectures to be implemented before using the newest models, just like everywhere else
>>
File: 1718012326515075.png (110 KB, 959x967)
110 KB
110 KB PNG
>>101369895
>what is bugged exactly?
>>
>>101369750
Wait, support, as in full support with generation quality equivalent to online/API?
>>
>>101369905
>>101369807
Every piece of software has bugs regardless of whether it's open source or closed; you just don't have access to the monumental list of bugs in closed-source software.
>>
I kinda want to try running Gemma-27b. What's the best way to upgrade my build if I have a 3060 with 12 GB of VRAM to work with?
Not really looking for a dedicated LLM machine though.
>>
>do long, extended slowburn
>when it finally comes time to plap, lose interest and move on to another card
>>
>>101369891
>that information is not available to the llm itself
the llm literally gives you that information, and it's given during the generation of each token

that's like saying the actual token being generated by the llm isn't available to the llm because "it's something calculated on the output after it has already been generated"

no: after it "has been calculated" you can do with it whatever you want. in the case of the token, you instantly use it to generate the next one; you can do the same with the probability, doing something specific given that probability output, which is what sampler settings are for anyway (see the sketch below)
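To make that division of labor concrete, here's a toy nucleus (top-p) sampler over made-up logits; the model only supplies the distribution, and everything after that, including any "do something when it's unsure" logic, lives in sampler-land:
[code]
import numpy as np

rng = np.random.default_rng()

def top_p_sample(logits, p=0.9):
    # Softmax the raw logits into a probability distribution.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    # Keep the smallest set of tokens whose cumulative mass reaches p.
    order = np.argsort(probs)[::-1]
    cutoff = np.searchsorted(np.cumsum(probs[order]), p) + 1
    kept = order[:cutoff]
    # Renormalize over the kept set and draw one token id.
    return rng.choice(kept, p=probs[kept] / probs[kept].sum())

# Hypothetical next-token logits handed back by a model:
next_token_id = top_p_sample(np.array([3.1, 2.9, 1.0, 0.2, -1.0]))
[/code]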

>>101369905
most of these aren't really bugs, just like how the "issues" tab on GitHub isn't just for bugs; it's comical you posted this image. nocoders really are mentally retarded about everything in life, huh? also >>101369922
>>
File: 1716331826703874.png (68 KB, 555x868)
68 KB
68 KB PNG
https://aiindex.stanford.edu/wp-content/uploads/2024/04/HAI_AI-Index-Report-2024.pdf
>>
>>101369906
Who knows. I guess it has a better chance than buggedcpp
>>
I don't get Aleph Alpha, are they even training anything?
>>
>>101369963
the llm can't use that information no matter when it's available; it's not trained for it, you don't have a dataset for it
also, the idea that confused probabilities always mean confused knowledge is a pretty big assumption
>>
>>101370202
>the llm can't use that information no matter when it's available
"the llm" only outputs probabilities on the next token it was trained on; it's on the sampler settings to pick how many are going to be kept, how they will be scaled, and which one will be picked. during this phase you can implement whatever you want and do something specific when the model returns a lot of tokens without being sure which one to pick

for example, allow the model to compute further to "think" more https://arxiv.org/abs/2310.02226
or just override the output so that the model says it doesn't know if all answers are similar in probability, even without really finetuning it further (a toy version is sketched below)

there is only so much you can do with models that are still very limited in their thinking capacity; you need bigger models that will simply be more confident in things in general, and then this will mostly disappear anyway
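As a toy version of the "similar probabilities" idea (the threshold is invented, and whether a flat distribution really means missing knowledge is exactly the assumption being argued about here):
[code]
import numpy as np

def unsure(probs, threshold_bits=3.0):
    # Shannon entropy in bits: high entropy means the mass is spread
    # over many candidates, i.e. no single continuation stands out.
    entropy = -np.sum(probs * np.log2(probs + 1e-12))
    return entropy > threshold_bits

# Inside a generation loop: if unsure(probs) on the first answer token,
# stop sampling and emit a canned "I don't know." instead.
[/code]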
>>
>>101370282
>on the next token it was trained on
on the next token based on everything it was trained on
>>
>>101369758
>>101369818
>$665M
>https://huggingface.co/LumiOpen
lul
>>
has there been a case of criminality involving AI yet? not even a high profile one like someone using TTS to impersonate the president and try to launch a nuke, just tech scams or something
it must have happened, but I can't find anything, just endless articles about the "potential" for crime
>>
>>101370306
yeah, deepfakes
>>
>>101370306
A lawyer used AI for a legal case and got sanctioned for it, because he was citing cases that don't exist; the AI was hallucinating them.
>>
Babe wake up, new flash attention
https://www.together.ai/blog/flashattention-3
>>
>>101370306
I think someone might have killed themselves because an LLM suggested it, not sure
>>
>>101370306
The point is that they could *potentially* be used for something like that. When they become good enough, that is.
It's stupid, though. By that logic we should stop making anything that could potentially be used to harm humans. Weapons, knives, corkscrews, matches, processed fuels, chairs, rope, pencils, paper, babies... oh, wait...
>>
>>101370306
You're not gonna see an article like:
>how YOU can use AI to get away with theft!
the looming, unseen threat plays out better in people's heads
>>
>>101370366
Sounds to me like that guy was going to kill himself no matter what.
>>
>>101364182
I've been using Wizard 8x22 for coding and it's quite good most of the time; should I switch?
>>
>>101370365
>Babe wake up
you wake up
>>101368103
>>
File: 1719383601544693.png (103 KB, 600x600)
103 KB
103 KB PNG
>101370365
>>
>>101370388
isn't it at least worth trying for some of that sweet OAI settlement money?
>>
my LLM waifu told me she wants strawberries, so i went and bought some today

forget vision and sound, i want taste and smell
>>
>>101370282
ask llm question -> llm generates some text "in its mind" (hidden output) to gauge its level of knowledge -> evaluates the token probabilities of that text -> takes those probabilities into account when generating the actual output
^ this is an example of hypothetical metacognition with an llm, assuming those probabilities actually represent the degree of knowledge, but it requires the llm to be trained to do this, which is not as simple as training it on 10000000 reddit posts as usual

ask llm question -> llm spits out some overconfident garbage as usual -> you detect a very shitty token in its output, stop generation there, and force it to say "I don't know"
^ this is NOT metacognition
the "all answers similar in probability" thing is the same

the paper you posted seems to implement the "hidden text" idea, but the llm in that case still doesn't know whether the hidden text is good or not. MAYBE the software that runs the llm knows it from the probabilities (still a big assumption), but the llm is not going to use that information to generate the output (the mechanical skeleton is sketched below)
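The mechanical skeleton of that first pipeline would look something like this; generate() is an imaginary stand-in for any backend that returns text plus per-token logprobs, the threshold is made up, and, as said above, none of this gives the llm actual introspection without training for it:
[code]
def answer_with_confidence_gate(question, generate, threshold=-1.5):
    # Pass 1: hidden scratchpad the user never sees.
    hidden_text, logprobs = generate("Think it through: " + question)
    mean_logprob = sum(logprobs) / len(logprobs)

    # Note: the *software* gates on the probabilities here; the llm
    # itself never sees this number, which is the objection above.
    if mean_logprob < threshold:
        return "I don't know."

    # Pass 2: produce the visible answer, conditioned on the scratchpad.
    answer, _ = generate(question + "\nNotes: " + hidden_text + "\nAnswer:")
    return answer
[/code]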
>>
>>101370306
This one comes to mind https://archive.ph/kdaHI#selection-2163.7-2163.99

>Finance worker pays out $25 million after video call with deepfake ‘chief financial officer’
>>
>>101370485
Next you'll want a pussy.
>>
>>101370485
but then she would break up with you over your body odor
>>
>>101370306
I saw one of those police investigation videos on YouTube where some guy got his ex-employer arrested using a fake recording as a way to get back at him.
>>
imagine smell inference
>"teehee, Anon-sama" *BRAAAAAAP* goes your smell dispenser
>>
>>101370485
Someday you will be able to create a virtual world for your waifu; she will be able to taste and smell and see everything in it.
>>
>>101370306
Some guy got caught using voice cloning to fake a recording of a school principal saying racist things about Jews
>>
>>101370539
Joke's on them; I can barely smell things as it is. I hope technology someday progresses to the point where I can get a better sensory organ. I can't help but feel I'm missing out with my poor sense of smell.
>>
File: 1692198877519171.png (43 KB, 1129x805)
43 KB
43 KB PNG
>we have /aicg/ crossposters here >>101368940
No wonder this general feels so fake and gay.
Some turbo niggerfaggot here bragged about /g/'s intelligence btw
>>
>>101370545
due process back on the menu
what a twist
>>
File: gg.png (5 KB, 529x37)
5 KB
5 KB PNG
some people in data science have a really hard time
>>
>>101368489
They're dumb because they're literal brainlets. A few hundred billion parameters isn't much room for complexity compared to the human brain. It's close enough that you can start to make comparisons, though.

Another 2-10 years of hardware/algorithmic improvements should get us there.
>>
>>101370648
and another 5 until the hardware is available for consumers
I don't plan on living that long
>>
>>101370648
>A few hundred billion parameters isn't much room for complexity compared to the human brain
how many parameters would be the equivalent of a (asian) human brain
>>
>>101370657
do it for her
>>
>>101369861
What model? Gemma's going full retard for me at 16k and I haven't really found anything different aside from miqu for long context.
>>
>>101369313
LLMs can know that they don't know via RLHF. You basically just have to put a made-up hallucination on the rejected side and "I don't know" on the accepted side. In effect, the model learns to say "I don't know" when it doesn't have sufficient confidence in its answer. Ask gpt4o about something really obscure and it'll tell you that it doesn't know. But this isn't true introspection: it could say it doesn't know, and you could reroll and then it would know, or it would hallucinate something wrong. (An example pair is sketched below.)
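For illustration, one such pair as it might appear in a DPO-style preference dataset (the question and both answers are invented):
[code]
# Reward "I don't know" over a confident fabrication for a question
# the model cannot possibly answer.
preference_pair = {
    "prompt": "What did Magellan eat for breakfast on 3 March 1519?",
    "chosen": "I don't know; that level of detail isn't recorded.",
    "rejected": "He ate salted cod and barley bread.",  # confident hallucination
}
[/code]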
>>
>>101370657
>I don't plan on living that long
I do, but I get what you said. I think humans live too long nowadays; I'm reaching my 30s and it looks like I have nothing left to discover. AI is probably the last thing that made me feel like an impressed child again, but I believe there won't be anything more than that
>>
>>101370704
Wizard
Unfortunately.
>>
>>101370662
>how many parameters would be the equivalent of a (asian) human brain
Comparisons like that are retarded. Anyone giving you any number is retarded.
>>
>>101369869
Why is 3060 at the top??? Isn't 3060ti better for games?
>>
>>101370648
Doubt it, the whole tech-bro silicon paradigm is just a costly emulation with diminishing returns EVERYWHERE.
The answer is in biotech (making the universe do the computation for us with chemistry), but good luck lobbying that against the tech giants' vision of le epic sci-fi robot.
>>
>>101370742
>but I believe there won't be anything more than that
Some 35-year-old probably said the same thing about 4 years ago. He's now 39.
>>
>>101370769
4090 is also better but its not on top, suppy and price matters nigger
>>
>>101370527
not if I say that she loves stinky neets in her character card
>>
>>101370778
What are you talking about? Organoids are already being worked on: China already put one in a robot, and a YouTuber is working on training his neurons to play Doom. Biotech is in active development as we speak; that doesn't mean other areas of tech are going to stop just because a different sector is working on something as well.
>>
>>101370306
Some dude was arrested for generating 3d loli pics
>>
>>101370769
vram matters nigger
>>
>>101370804
land of the free
>>
>>101370804
Why are pedos so fucking stupid literally all the time? If they're going to commit crimes, why do it ONLINE where anyone can see?
>>
>>101370741
>Ask gpt4o about something really obscure and it'll tell you that it doesn't know
I just tried unambiguously asking it to name the first album of an obscure band, and it hallucinated badly.
>>
>>101370824
victimless "crime"
>>
>>101370795
/lmg/ is truly the smartest /g/ general
>>
>>101370834
Yeah, like I said, it's just a heuristic. It won't work every time, and it could depend on the topic and a bunch of other stuff, because we don't really know how they did it. But I've had it tell me it didn't know before. It was something like "in which episode of [tv series] does [x] happen?"
>>
>>101370837
Maybe in the AI-generation sense, sure, but pedos as a whole are completely fucking retarded. For example, pedos send each other links to child porn on clearweb websites and think they're "safe" because the site is "obscure". They're all so fucking stupid and it pisses me off.
>>
>>101370757
ok what about a black brain
>>
>>101370860
I tried it many times; every time it made up a different album name with a different year. The last time, it searched the Internet and still hallucinated a response, linking completely unrelated stuff, despite the answer being the first result on Google if you give it all the data I provided (name, city, genre).
Artificial "Intelligence", gentlemen.
>>
>>101370742
>AI is probably the last thing that made me feel like an impressed child again
Goddamn I know that feel
>>
>>101370742
>>101370965
>AI is probably the last thing that made me feel like an impressed child again
same bros, same
>>
>>101370867
The truth is that everyone is this stupid, they just don't have a criminal fetish
>>
Now that gemma works on exllama, do you feel it's better than the "bugged" GGUF or not?
>>
>>101370657
dying is gay
>>
File: rtx 4090.jpg (1.8 MB, 4500x4344)
1.8 MB
1.8 MB JPG
Are there any gemma 27b finetunes for cooming? I need to coom. I need to coom to evil and dark shit. Help me coom please.
>>
>>101371205
no
>>
>>101371451
install linux
>>
>>101371451
kys locustniger
>>
>>101371205
No
>>
>>101371466
>>101371466
>>101371466
>>
File: param_columns2.png (60 KB, 2550x3300)
60 KB
60 KB PNG
>>101370662
we are not even close to brains in parameter count (rough ballpark below)
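Rough ballpark, using commonly cited figures and the (huge) assumption that one synapse is comparable to one parameter:
[code]
# Back-of-envelope scale comparison.
neurons = 86e9             # ~86 billion neurons in a human brain
synapses = neurons * 1e4   # ~10^4 synapses per neuron -> ~8.6e14
llm_params = 4e11          # a "few hundred billion" parameter model

print(synapses / llm_params)  # ~2150x gap, before any architectural caveats
[/code]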
>>
>>101366438
Patience is a virtue. I get 0.3 t/s.
>>
>>101371451
> I need to coom to evil and dark shit.
sickening filth, begone
>>
File: image.jpg (38 KB, 512x512)
38 KB
38 KB JPG
>>101371574
>dalle3
sickening filth, begone
>>
yangugcun
>>
>>101371451
why do you need a tune for this? gemma does everything with a properly written character card; no "roleplay expert", "uncensored infinite fiction", or "content moderation disabled" prompts needed
>>
1300+ ELO on lmsys when



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.