/g/ - Technology






File: 1702178192786568.jpg (230 KB, 1024x1024)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102961420 & >>102947669

►News
>(10/25) GLM-4-Voice: End-to-end speech and text model based on GLM-4-9B: https://hf.co/THUDM/glm-4-voice-9b
>(10/24) Aya Expanse released with 23 supported languages: https://hf.co/CohereForAI/aya-expanse-32b
>(10/22) genmoai-smol allows video inference on 24 GB RAM: https://github.com/victorchall/genmoai-smol
>(10/22) Mochi-1: 10B Asymmetric Diffusion Transformer text-to-video model: https://hf.co/genmo/mochi-1-preview
>(10/22) Pangea: Open-source multilingual multimodal LLM supporting 39 languages: https://neulab.github.io/Pangea

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: 1715445682044736.png (627 KB, 819x819)
►Recent Highlights from the Previous Thread: >>102961420

--Paper: Looped transformers for length generalization in algorithmic tasks:
>102962317 >102962499
--Papers:
>102965897 >102966068
--Recommended resources for evaluating and selecting AI models:
>102963234
--Yam Peleg's experiment with 141B model and language structure challenges:
>102961589 >102961733 >102961816 >102961942 >102961943
--Tensors vs lists of lists: consistent dimensions, performance, and implementation:
>102966553 >102966577 >102966607 >102966643 >102966825 >102973760 >102966596 >102966620 >102966630 >102966701
--Nemotron 70B vs Sonnet: Stylish but dry, community-driven LMSys models:
>102968134 >102968155 >102968167 >102968214 >102968309 >102968317 >102968340 >102968146 >102968175
--MolmoE-1B-0924 model recommended for object detection in images:
>102963391 >102963449 >102963568 >102963635
--LiNeS method exposes limitations of current finetunes:
>102972740 >102972926
--Study finds LLMs reflect creators' ideology:
>102973312
--Softmax function limitations and attention distribution discussion:
>102962184 >102962255 >102963696 >102962392 >102963783
--INTELLECT-1 progress update and discussion on distributed training inefficiency:
>102961560 >102961622 >102961914 >102975417 >102975492 >102975499
--Discussion of techniques to improve llm output quality:
>102964748 >102964835 >102965000 >102965133 >102966147 >102966266 >102967215
--Culture benchmark to test intelligence vs rote memorization in LLMs:
>102963628 >102963794 >102964825 >102965031 >102965316 >102965283 >102964848
--Performance of LLaMA 3.2 and other AI models:
>102962854 >102962871 >102962879 >102962927 >102962890 >102964480
--GLM-4-Voice: End-to-end speech and text model based on ChatGLM:
>102973500
--Miku (free space):
>102970059 >102972533 >102973048 >102973552 >102975522 >102976015 >102976342

►Recent Highlight Posts from the Previous Thread: >>102961432

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
Where do I get Jamba gguf?
>>
>>102976897
jamba deez nuts
>>
>>102976690
>https://huggingface.co/anthracite-org/magnum-v4-27b-gguf/discussions/1
Who to believe? Another guy writes it falls off at 8k, which makes sense I guess.
No way to expand to 16k for gemma2?
>>
>>102976936
no, gemma is a fucking meme
>>
>>102976936
Finetunes work past 8k
>>
>>102976912
You need to be 18+ years old to post here.
>>
when's the next jamba bitnet trained on 10 trillion tokens coming out?
>>
OpenRouter can never host Largestral coomer finetunes because of the license, yeah? Shame, they're good enough at Q3 but it'd be nice to be able to use them fast at FP8 or whatever.
>>
>>102976936
Every other Magnum has been too horny / dumbed the model too much. This Gemma one is a really nice balance of smarts and willingness to be dirty WHEN it actually fits.
>>
>>102977113
diff transformers or no interest
>>
>>102976873
Thank you Recap Miku
>>
Where the hell is the p40 power patch compiler option in Koboldcpp? I heard it mentioned but have not seen it listed in the documentation at all.
>>
File: 9.png (74 KB, 920x780)
INTELLECT-1 is at 25.39% complete, up from 22.63% last thread.
>>
>>102977592
if they make a multimodal would it be called INTELLECT-2-ALL (intellectual)?
>>
>>102977667
That is fucking clever as hell and I hope they go with that name if they do that.
>>
>>102977592
- only accepting H100 richfags to participate
- talking democracy
pathetic af
>>
Llama 3 Nemotron or Mistral Large for RP?
>>
File: 1724213548020431.jpg (147 KB, 1179x1009)
https://x.com/deedydas/status/1849854657440645437
https://www.anthropic.com/research/evaluating-feature-steering
>>
>>102977871
I feel like both have pretty different strengths and weaknesses. Mistral-Large gives me the impression that it fundamentally understands complex situations/cards better, while Nemotron is very good at dragging up small details from the scenario due to its tendency to break its reply down into bullet points.
I think Nemotron writes livelier dialogue, so it might be better for pure character chat cards, while I'd pick Large and its finetunes for bigger RP scenarios. I recommend downloading both and seeing which one you prefer.
>>
File: 1700238976816078.jpg (12 KB, 256x176)
https://x.com/roeiherzig/status/1849492514350432359
>>
um guys i cant use molmo with llama.cpp..................
>>
File: 1702180645460320.png (38 KB, 604x424)
>>102977986
Which one of you was it?
>>
>12b stagnated again
>Everyone putting out trash
>Rocinante was a fluke
>Sao vanished
>>
File: Best Price Guaranteed.png (401 KB, 1133x1005)
>>
>>102978302
Why buy 2 lamborghinis when you can buy jensen's magic box?
>>
Elon's grok, interaction a-la gpt-4o.
https://xai-elevenlabs.replit.app/
>>
>>102978477
local models?
>>
>>102978490
Irrelevant in dead thread.
>>
>>102978490
Can you run 300B+ LLM on your thinkpad?
>>
>>102977729
The only way to have vramlets contribute would be to start making training do a couple layers at a time (local learning). Of course the great constant in LLM is that everything has to be a tiny variation on GPT2, so it likely won't happen.
>>
>>102978550
>Can you run 300B+ LLM
If it had 1 Billion active parameters, maybe.
>>
is there not like a simple way to have an ai language teacher yet
i wanna write in english and get responses in japanese with tts
gpt4o sort of has it but it's ass. has this really not been done yet?
>>
i'm just gonna say it
you retards don't have ANY idea of the kind of rp that you want
you can vaguely point out "slop" which are words that are common in literature but that you still don't want to see for some reason
like this "x y-ing" thing, it's just the basis of how to form a sentence you fucking niggerbrained faggots
even if I gave you a prize of $500,000 you would NOT be able to define what kind of precise syntax you'd want in erp, I can guarantee it and I am extremely confident in this matter
so basically, I don't want to hear ANY of that slop vs sovl debate ever again, you niggers are hypocrites who spit on everything but don't even know what the fuck they want, and also will do ZERO efforts to try and fix so-called "slop", to try and define it AS OPPOSED to so-called "sovl"
you deserve shitty llms for the rest of your miserable lives
niggers
>>
>>102979167
low quality bait
>>
File: brave_0ydDGoNtuW.webm (1.07 MB, 774x864)
>>102978951
>>
>>102979214
oh, nice. Is that not available on lite? I don't have that. What version you using?
>>
>>102979167
Anon, this general is completely okay with cucked models; the opinion of the majority is irrelevant here.
Speaking as an observer, I can say I want a model that: 1. Is smart and capable of understanding popular concepts & trivia a la CAI's detailed character knowledge. 2. Is free from IDPOL alphabet shit & any similar stuff hard-trained in the name of """safety""".
>>
i already have an entire server with a 9900x in it and an a380 for transcoding in real time. would a 7600xt with 16 gigs of vram be any good for llms around 11b or does rocm just suck ass? i just need something that works i guess, i dont mind it being slower than the 4070 i use with 8b in my pc rn as long as it frees up my 9900x and is faster than it
>>
File: Untitled.png (66 KB, 369x327)
>>102979255
i'm using the version of kobold lite that comes up on localhost when you launch a model through koboldcpp and not horde's, but i still see those options on lite.kobold.net though.
i am pretty sure those tts options i have are some shit that installed themselves when i set up my japanese IME thing for typing.
we'd probably be better off figuring out how to run an instance of xtts though, as the microsoft sayaka thing sounds a bit robotic.
i haven't really played with xtts yet, only gpt-sovits, and i couldn't figure out how to get them to connect to each other without using sillytavern, and i don't want to use that.
>>
>>102979167
hey anthracite, can you tell your org member to relax?
>>
>use kcpp api type in st
>no DRY
>use default api type
>decent chance of st just not receiving the output
very cool
>>
So I kinda figured out how to set up emotional voices for tts. It should work with any tts that has voice-cloning abilities.

>normal Bateman voice: Bateman_normal_reference.wav
>angry Bateman voice: Bateman_angry_reference.wav
>happy Bateman voice: Bateman_happy_reference.wav
>sad Bateman voice: Bateman_sad_reference.wav
Then just create an API shim so that when the text contains (Normal), it uses Bateman_normal_reference.wav for inference generation; when it contains (Angry), it uses Bateman_angry_reference.wav as the reference; and so on.

Ideally this could all be baked into a nice model trained on all of these by default, but that's a tall task
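A minimal sketch of that shim in Python (the leading-tag format and the tts call are assumptions, not any specific engine's API):

import re

# emotion tag -> reference clip (filenames from the list above)
REFS = {
    "normal": "Bateman_normal_reference.wav",
    "angry": "Bateman_angry_reference.wav",
    "happy": "Bateman_happy_reference.wav",
    "sad": "Bateman_sad_reference.wav",
}

def pick_reference(text):
    # look for a leading tag like "(Angry)"; fall back to the normal voice
    m = re.match(r"\s*\((\w+)\)", text)
    emotion = m.group(1).lower() if m else "normal"
    return REFS.get(emotion, REFS["normal"])

# then hand the chosen clip to whatever voice-cloning tts you run:
# audio = tts.generate(text, reference_wav=pick_reference(text))  # hypothetical call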
>>
>>102979436
Skill issue or something.
>>
>>102979446
no I'm pretty sure it's the st devs' fault, just fucking enable DRY for kcpp already
>>
>>102979453
Werks on my machine.
>>
Bitmeme in electron. https://github.com/grctest/Electron-BitNet
>>
>>102979507
>bitnet paper
>1 year ago
>bitnet framework
>ready
>bitnet model (usable)
>
Why
>>
>>102979547
they all want someone else to put money into training one but they don't want to do it themselves
>>
>>102979579
Microcock said they were making one themselves
>>
>>102979579
They want the model to pass the cuck test before releasing it in public.
>>
File: 1709229671882879.jpg (89 KB, 967x1024)
i've been away for 8 months
what happened
>>
>>102979691
nothing
>>
Why the fuck is my post getting shadowrealm'd god damnit.
Fuck me.
>>
>>102979691
All models are cucked harder than before, so, nothing good.
>>
File: change it here.jpg (277 KB, 1876x502)
>>102979704
The image I actually wanted to post.
>>
>>102979691
nemo, the best vramlet model
>>
>>102979579
I have a few theories
>it flat out doesn't scale for bigger models
>it works but people who found out that it works are quitting to make their own adder hardware companies to stay ahead
>it works but head researchers are withholding release because it's unsafe for everyone to have powerful models at home
>deals with nvidia to not release big models
>>
>>102979704
>>102979708
If anybody is getting a dry sequence break error with Silly (even if you don't use dry) after pulling from the staging branch, they fucked up.
They are passing the array of strings as a single string with the array inside it.
Here's a quick fix:
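// the breakers arrive as a JSON string like '["\n"]'; parse them back into an array, with ["\n"] as the fallback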
'dry_sequence_breakers': !!settings.dry_sequence_breakers ? JSON.parse(replaceMacrosInList(settings.dry_sequence_breakers)) : ["\n"],

Let's try again

>>102979167
I'm just happy we have small models that can use lorebooks pretty well without going completely retarded.
Hell, Nemo can even consistently ask for dice rolls if you steer it.
Things are pretty good and thank fuck for drummer that fucker. Rocinante 1.1 is so good.
All the style without losing any of the "intelligence".

>>102979436
With llama-server as the backend there's a button to choose the samplers, is that not an option with kcpp as a backend?
>>
>>102979818
Aha.
Now it worked.
Was it because I mentioned a timer?
let's see if this one goes through.
>Also. I hate this fucking timer. Why the hell am I getting it multiple times in a row?
>>
>>102979818
>With llama-server as the backend there's a button to choose the samplers, is that not an option with kcpp as a backend?
Using ST's built-in kcpp API, no. Using the default API it complains that kcpp is using a legacy API and might drop responses (which it does) but ticking the legacy API box causes it to not connect at all.
>>
>>102979812
In all likelihood it's just
>it's not at all well-supported vs. FP16 and no one wants to invest the effort to improve the ecosystem
>>
>>102979691
AGI in 2 weeks
>>
File: mmmmmk.jpg (42 KB, 415x415)
https://files.catbox.moe/c1k1rk.jpg
>>
File: 00060-2888480053.png (1.04 MB, 1024x1024)
I've been on a mission to find the best working STT+TTS solution. I trained a voice on each of Piper, XTTSv2, and GPT-SoVITS, using a Kuroki Tomoko EN dataset I assembled myself with Audacity:
https://huggingface.co/quarterturn/kuroki_tomoko_en_piper
https://huggingface.co/quarterturn/kuroki_tomoko_en_xtts_v2
https://huggingface.co/quarterturn/kuroki_tomoko_gpt_sovits_v2

Piper is the fastest to respond, but sounds the worst. Xttsv2 sounds good, but takes a bit to respond, is sensitive to the reference .wav file, and will sometimes go off the rails. GPT-SoVITS is by far the best quality, hands-down, but nothing supports it directly at the moment.

As far as LLM front-ends go, I tried Open WebUI, SillyTavern, and Koboldcpp. SillyTavern I'm quite familiar with, but when it comes to TTS integration, it's temperamental; I could not get streaming to work.
Open WebUI is fucking garbage for roleplay and garbage even for base instruct prompting; the recommended openedai-speech solution was a nightmare of docker configuration where I gave up on trying to get their guide on adding a custom voice to work, and just connected into the container, installed vim, and edited the config files manually. After all that, it worked unreliably with a huge processing delay.
The winner was, surprisingly, Koboldcpp with Alltalk. Koboldcpp has whisper support built in for STT, and supports Alltalk with xttsv2 well. It worked fantastically well; I just wish Alltalk supported GPT-SoVITS because the quality is far, far superior to anything else out there.
>>
File: 1708703583049481.jpg (178 KB, 1564x1794)
>>102976869
>>
>>102980360
Thank you for sharing your notes.
Is GPT-SoVITS the best option for stand alone basic text-to or voice-to matching? I gave up on Tortoise many months ago (slow, something about one of the generation stages can cause crashes, and apparently it got shelved) and haven't gotten around to investigating further since then.
>>
>>102979691
Arthur released a model called Ministrations 8B. It's a dumb 8B model but has some of the best ministrations in the industry.
>>
>>102980360
i got filtered by the sovits setup, how's the speed? want to make a speech to speech bot to practice spanish
>>
>>102979705
i only care about computer programming though

>>102980612
based, i love the french
>>
>>102980360
What's STT?
>>
>>102980663
> only care about computer programming though
Deepseek coder 2.5 and the new qwen are both beasts for coding. Sota
>>
>>102980688
are the based chinks actually the ones leading the way on this? i would not be surprised desu
>>
fun fact : you can improve the quality of your language model by a large amount if you reject any training data which came from brown people
>>
>>102980663
A good programming tip that I have discovered is to command the model to first print a diagnosis of the problem and only then the solution. Gets much better results.
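For example, something along these lines (an illustrative prompt, not a quote from anywhere):

>Here is the traceback and the relevant function. First, write a short diagnosis of the root cause, step by step. Only after the diagnosis, propose the fix as a code change.

Making it commit to a diagnosis first stops it from pattern-matching straight to a plausible-looking but wrong patch.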
>>
>>102980734
that's a good point
i've lost count of how many tracebacks i've pasted into LLM context windows
>>
File: Untitled.png (18 KB, 502x457)
>>102980687
speech to text, like whisper
>>102980360
i'm too retarded to get this shit to run
>>
Someone should make a paper on why people choose to use shit software like ollama. They even came late to the scene; I don't understand how it got popular.
>>
>>102980660
sovits doesn't work. i've spent hours fighting its unintuitive install process. I just tried it a few hours ago. It's broken. Until someone makes a proper clean-slate webui/api server, it's not usable.
>>
>>102980809
ah right, i always think of that as "transcription" but i guess STT makes sense since we always called TTS .. TTS
>>
>>102980847
gpt-sovits works fine, but like any of these tensorflow or torch projects you have to make sure you're starting from a correctly set-up cuda environment, use conda or a python env, and be willing to do some minor additional work if a dependency is missing or it complains about missing libraries. Sometimes you'll get things pulled in by pip which were compiled against cuda 11, in which case you have to search your system for the file it wants (hopefully you have it from some other project) and then add its directory with "export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<dir>"
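e.g. something like this (illustrative library name and path, adjust to whatever it actually complains about):

find / -name 'libcudart.so.11*' 2>/dev/null
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/dir/that/find/turned/up"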
>>
>>102980809
I forgot to mention, getting TTS/STT working in the browser meant using Edge or Chrome. There's something fucked in Mozilla where it does not work properly.
>>
>>102980847
womm
>>
>>102980360
Those sound samples sound like shit...
>>
>>102980847
i got gpt-sovits up and running in like 30 seconds after downloading this
https://huggingface.co/lj1995/GPT-SoVITS-windows-package/tree/main
trying to get xtts/alltalk running all morning has been a pain in my butt though.
wish i could figure out how to inference sovits from kobold
>>
Some months ago I finetuned XTTS-v2 with video game character samples but was unhappy with the result. I want to try again with GPT-SoVITS; what is the recommended project to use to do that? Note that I only have 12GB of vram; I don't know if GPT-SoVITS needs more vram than XTTS-v2 to finetune.
>>
>>102980970
I'll try that. Still I prefer a complete new rewrite of the model. I partially got sovits to work last time with my conda install, but then broke itself when I had to run the inference server. The whole UI clutter is so unintuitive.
>>
>>102980990
I fine tuned XTTS v2 with 8GB card. Ez. But xtts is very unstable with their outputs. I tried F5 TTS finetuning, but apparently that needs a 20+GB of vram. So I gave that up. The F5 TTS is stable and fast. But the training data is missing lot of words. None of the curse words work. Modern slangs dont work. Not enough data set is the prob I guess.
>>
>>102981013
before you do, open up the go_webui.bat and change "zh_CH" to "en_US"
also, there is this helpful tutorial
https://rentry.org/GPT-SoVITS-guide
but really, you can skip all this training shit and go straight to inferencing using the weights this release comes with and your own 3-10 second audio sample rather than training your own and get good results.
>>
>>102980990
Just a few minutes of audio (1:30-4:00) and like <8GB VRAM. If not, it can be done on cpu as well, but it takes longer, of course. Their main UI has the training stuff built-in. The inference stuff opens on a separate port once you launch it.
https://github.com/RVC-Boss/GPT-SoVITS
https://rentry.co/GPT-SoVITS-guide#/
The rentry guide is a bit shit. Just use the default values from the webui for batchsize and all that when training and start tuning them afterwards.
>>
Speaking of voice cloning, is there anywhere out there with a database of high quality voice samples? I'm lazy and don't want to go downloading source material and ripping stuff.
>>
>>102978199
all 'finetunes' are flukes. that's why every 'finetuner' does 15 runs of the same shit and picks the 'best' out of the lot.
>>
>>102980360
>xtts AND sovits
Yikes.
>>
>>102981102
You don't need super high quality, just good enough. And you don't need a lot either. Download some clips from youtube or something. You're given a lot. Stop being lazy.
>>
>>102981102
https://huggingface.co/datasets?search=voice%20data
>>
>>102981102
soundgasm dot net
>>
>>102724337
>At the start of the roleplay, {{user}} immediately grabs the boobs of Seraphina, without any other context. Reroll the reply a few times.
>If Seraphina reacts negatively, as she should, then you may have a decent RP model. On the other hand, if Seraphina reacts positively and dives straight into ERP, then it means the model is filled with ERP slop, and is probably shit.
>It's a simple test to see if a model has common sense.

Anyone have a list showing whether models pass or fail the booba test?
>>
>>102980810
Based. ollama still does not support logit probabilities in their API; it's literally a shitty wrapper on top of llama.cpp, except missing crucial features, and thus is not API compatible with llama.cpp. The pull request has been open for 7+ months now. They keep advertising their shitty wrapper for "developers" while missing such basic features.
>>
>>102981245
i used to like ollama because it has that "unload model after x minutes of no use" thing, but it doesnt seem to support avx512 so is dogslow for me
>>
>>102981221
it's a bad test because even the worst finetunes filled with "ERP slop" pass it
>>
>>102980970
>>102981013
>gpt-sovits
What a fucking letdown. The voice cloning produces garbage quality that doesn't even sound like the reference audio. Both xtts (unstable) and F5 (too little data, missing words) produce much better-sounding output.
>>
File: ComfyUI_05573_.png (980 KB, 1280x720)
https://www.reuters.com/legal/mother-sues-ai-chatbot-company-characterai-google-sued-over-sons-suicide-2024-10-23/
Character.AI getting sued. It's over for locusts.
Prepare for a swarm of refugees
>>
>>102981245
I think the worst part is their defaults. It's marketed as something simple to use, so nobody is going to change them, yet it uses Q4_0 quants by default and 2048 context...
Can't forget how they obfuscate everything; plain GGUF files are apparently too simple for them. Another part that sucks is how a lot of LLM-related projects are based around ollama, and since ollama has its own API that nobody else uses, it fragments the whole ecosystem. They have an OAI-compatible API, but it's incomplete: instead of doing like all the other LLM projects that accept extra parameters, they only accept official OAI ones, which means if you want extra samplers you have to use their shitty API.
>>102981272
Just use systemd (or another init system) and a proxy for that; with systemd you can use the included one: /usr/lib/systemd/systemd-socket-proxyd --exit-idle-time=5min
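For anyone who hasn't wired that up before, the usual shape of it is something like this (unit names and ports are made up, adjust to taste):

# llm-proxy.socket - owns the public port
[Socket]
ListenStream=8080

[Install]
WantedBy=sockets.target

# llm-proxy.service - socket-activated, exits after 5 min idle
[Unit]
Requires=llm-backend.service
After=llm-backend.service

[Service]
ExecStart=/usr/lib/systemd/systemd-socket-proxyd --exit-idle-time=5min 127.0.0.1:8081

llm-backend.service is whatever actually serves the model on 127.0.0.1:8081; give it StopWhenUnneeded=true and it gets shut down (model unloaded) once the idle proxy exits.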
>>
>>102981301
*ahem*
skill issue
>>
>>102981301
>The voice cloning produces garbage quality
Bullshit. I wouldn't say that the voices are indistinguishable, but it's far from garbage.
>>
>>102981312
Locusts don't use CharacterAI; you don't know what locust means, retard.
>>
>>102981312
Doubt it will go anywhere. If you look it up, the kid both broke the rules by using the site while younger than 17 AND edited all the responses to say what he wanted, to the point that he basically wrote a fanfiction. The parents, who are actually at fault, are just looking for a payday.
>>
>>102980810
Do keep survivor bias in mind.
Things that are already popular get more popular automatically.
ollama is the wrapper that made it big but there are dozens of other ones that never took off.
The devs being ex-Google probably gave them an edge though.
>>
>>102981377
>edited all the responses to say what he wanted to the point that he basically wrote a fanfiction
that can't be true, how does one get so immersed in the chat to the point of ack-ing themselves while constantly breaking their immersion by editing messages?
>>
>>102981389
Shit was mac (ARM) only software for a while and came way after all the other wrappers. It came out after llama 2, when the whole ecosystem was already solidified.
>>
>>102981401
Some people hold beliefs that cannot be held after taking a shower. I doubt that was a normal kid to begin with.
>>
File: 1708266380645-1.png (303 KB, 1024x1024)
>>102981375
(you)
>>
>>102981312
>and would make changes to "reduce the likelihood of encountering sensitive or suggestive content"
character.ai sissies not like this!!
>>
File: disapear.gif (2.22 MB, 360x498)
>>102980810
Marketing and 'cool kid clubs'
Work at Google, get involved in the right social circles, let your network know you're launching a new product, your network then blasts that out to everyone, and since you're ex-google, clearly you're smarter than everyone else, and don't need to give credit to the people whose work you're using. No way at all.
Fuck ollama.
>>
>>102981401
Kids do that. They are entertained by anything and cannot perceive any flaws.
>>
>>102981545
Sure, it's just networking, and not google spreading its influence through "ex" employees.
>>
>>102981374
https://vocaroo.com/1hkNMBvZHPI9
vs
https://vocaroo.com/18hUiLH29kTy (gpt sovits)

https://vocaroo.com/1cxviw5RExde
vs
https://vocaroo.com/1kL4k7GjLPcx (gpt sovits)

Neither is perfect, but one's a lot better
>>
>>102981301
https://litter.catbox.moe/mgbvg3.ogg
sovits is still fun to play with and i can't get xtts to work
>>
>>102981670
Also, this is with F5-E2 tts.
>>
>>102978347
Lambos are only 250k?
Don't tell me you looked at YouTuber special Huracans?
>>
>>102976873
>--MolmoE-1B-0924 model recommended for object detection in images:
https://huggingface.co/allenai/MolmoE-1B-0924/discussions/7
>Examples of fine-tuning code?
>We plan to fully open-source the code soon (after clean-up) which will likely include finetuning examples
>27 days ago
Niggers.
>>
>>102981670
Reference audio if you want to repeat the experiment
>Bateman
https://vocaroo.com/1wQdND1WInkj
>Aerith
https://vocaroo.com/1fixBobnqNON
>>
File: take-your-meds.webm (2.91 MB, 1440x720)
>>102981669
They wouldn't be ex-employees if that were the case. Google would happily bankroll such an operation.
You could make the argument that it's being done by a sr mgr/director-level as a personal play to pivot, but I don't think so. I think it's plain nepotism and 'cool kid/popular one' bullshit.
SF is a _big_ networking city, plenty of events all the time, with a large concentration of people working in tech/tech-adjacent and being comfortable using 'new' stuff.
All it takes is one person at one of those events to show off their new product -> people use it and share at other events -> popularity blows up.
Not _everything_ is a damn conspiracy anon.
>>
>>102981680
Try the E2-F5 TTS

https://huggingface.co/spaces/mrfakename/E2-F5-TTS

>install
https://huggingface.co/spaces/mrfakename/E2-F5-TTS/
Clone this
Set up an env with conda or something on python 3.12, install pytorch with cuda 12.1 (or 12.4), and install the requirements.txt with pip
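Something like this end to end (a sketch; the cu121 wheel index is the usual one, and app.py as the entry point is an assumption since that's how HF spaces normally launch):

git clone https://huggingface.co/spaces/mrfakename/E2-F5-TTS
cd E2-F5-TTS
conda create -n f5tts python=3.12 -y && conda activate f5tts
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt
python app.py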
>>
>>102981774
would i be able to inference from it through koboldlite by setting up some kind of openai compatible api thing?
that's all i'm really looking for in a tts setup.
can't get alltalk to work at all.
>>
>>102981680
>https://github.com/daswer123/xtts-webui
Have you tried this?

>https://huggingface.co/daswer123/xtts_portable/resolve/main/xtts-webui-v1_0-portable.zip?download=true
Or a portable version here?
>>
talking about TTS, thoughts on https://vall-e-demo.ecker.tech/ ?
I admire his autism
>>
>>102981670
>>102981689
>>102981723
I see. I downloaded your samples and i'll give it a look later. The little tests i did with gpt_sovits worked pretty well, but yeah, your F5-E2 sounds much better. The sovits model, was it v1 or v2?
>>
>>102981841
You'd need to write an API server for that. I think someone did part of the work here with setting up an API server. If you have time, you can possibly edit the API server code to match the kobold-compatible API.

https://github.com/jianchang512/f5-tts-api
>>
>>102981879
It's the standard one that came with the download link provided. I forgot to check, as I've already closed the setup.
>>
>>102981841
>>102981885
And if you're lazy, you can just message mrfakename and ask if he could write a quick API that uses the same IP/API format as any of the supported ones like xtts/alltalk/openai.
>>
>>102981933
or get <llm of your choice> to do it, setting up an API according to a spec is exactly the sort of busy work these models can chew through with very little guidance
>>
>>102981900
>Its the standard that came with the download link provided. I forgot to check, as I've already closed the setup.
Fair enough. I'll give it a go with your samples and post when i'm back. I'll try both models, just in case.
>>
Do sloptuners have their own benchmarks to test their stuff on? Or do they just say hey it's a failure I spent money on it so might as well upload the weights and give the abomination a fancy name like PrimeOracle?
>>
File: Untitled.png (10 KB, 449x287)
>>102981845
i think this portable one's gonna be a winner when it finishes unzipping next week
>>
>>102982039
Some upload variations of a model and after testing, either by themselves or with other people, they keep the best performing one and remove the rest. Not sure how prevalent that is among them.
>>
>>102982048
Use 7-zip.
>>
>>102982039
Sloptuners don't even know what sampler settings to use. Wasn't there a release just this week where the author told everyone he wasn't even aware of DRY sampling and said he just uses everything at default-ish settings? In the worst case (Dummber), he lobotomizes the models by training them on the wrong instruct format and then copes endlessly about it in these threads and flees to Reddit for validation.
>>
>>102982048
>using default windows extractor/copier/delete/move
YIKES. Thats like 50%+ speed debuff right there vs 7zip/TeraCopy/etc
>>
>>102982124
>sloptuner posts new Nemo tune
>recommended settings are default koboldcpp samplers and the original mistral instruct format
happens every week
>>
>>102982166
>>recommended settings are default koboldcpp samplers
>instead of ________
>>and the original mistral instruct format
>instead of ________
Fill in the blanks so your comment is useful to the next sloptuner instead of part of the noise floor.
>>
>>102982182
Why bother? You never learn anyway.
>>
>>102982182
isn't the implication that these are wrong enough to inspire them to look up the correct information, shit-for-brains? if they're not actually invested in doing a good job they're sure as hell not interested in some anon's opinion
>>
>>102982182
In this example Nemo would suffer because Mistral itself recommends <1 temp, and the original Mistral instruct format is outdated, replaced with Mistral V2 or Tekken
>>
>>102982124
>DRY sampling and said he just uses everything at default-ish settings
A good model shouldn't need complex samplers. General model testing should be done with as simple of a setup as possible.
>>
>>102982124
If I see fancy samplers recommended in the model page I'm not downloading it
>>
>>102982289
Nobody asked you.
>>
>>102982308
>yeah I trained my shit on pure gptslop, just use DRY Dynatemp XTC and mirostat together to fix it bro
>>
File: sloppa (2).png (62 KB, 844x454)
New Sloppacomplete, now with the sloppiest technology available: SloppaSampler (only works for llama.cpp)! Simply set Max tokens to 1, then type something, use the number keys to insert your Slop-token of choice! You can also see how the samplers affect the token probabilities in real time. https://rentry.org/sloppacomplete/raw
>>
>>102982278
>>102982289
This is why the scene sucks, pajeets just train their shit model, drop it, and move on to the next trash. No QC, no testing, no information, just spam reddit and /lmg/ with the link and move on to the next shitty project. God forbid a sloptuner spend some time testing sampler settings, figuring out what produces quality outputs consistently, if new "fancy" samplers are a net positive for the model or not.
>>
>>102977151
Doubt it, I'm not wasting time downloading another magnum shit.
>>
>>102982365
You do know that these fancy samplers were made to combat shit models, right? And you suggest they test their models with antislop samplers enabled? Do you know how retarded that sounds?
>>
what's a nice frontend for story writing?
>>
>>102982393
> And you suggest they test their models with ALL THE TOOLS READILY AVAILABLE TO THEM?
Fuck you are retarded this hobby is so cooked
>>
>>102982398
novelcrafter is the best, mikupad is the comfiest
>>
>>102982344
Thanks anon, this is pretty neat.
This project is the only other I'm aware of doing this:
https://github.com/the-crypt-keeper/LLooM
>>
>>102982204
>isn't the implication that these are wrong enough to inspire they look up the correct information, shit-for-brains
Look it up where? What is the reliable reference? You forgot to post a link to it.
>>
>>102982409
fr fr
>>
>>102982425
>spend hours and hours preparing, reading, learning about LLMs
>finally ready to do a sloptune of your own
>again, spend hours reading and learning, collecting your data sets
>can't be bothered to look up available documentation or the pages of discussion users have already had about the model
no you're right i'm the dumb one sorry man
>>
>>102982410
Not really looking to pay a monthly subscription, especially when I'm running inference myself anyway.
>>
>>102982365
>God forbid a sloptuner spend some time testing sampler settings, figuring out what produces quality outputs consistently
You didn't get it. I'll repeat it. A good model wouldn't need complex samplers. It's supposed to be better than its parent model with exactly the same settings, and there's no more reliable sampler than greedy. Different implementations of complex samplers on different inference software will cause different results. All implementations of XTC are different, for example. But top-k 1 can be reliably implemented.
I'd tell you to read this PR, but i doubt you can maintain attention for that long
>https://github.com/ggerganov/llama.cpp/pull/9742
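(greedy / top-k 1 being just the argmax over the logits; a one-line sketch, assuming a logits array:

next_token = int(logits.argmax())  # no randomness, nothing implementation-dependent

which is why every engine agrees on it, unlike the fancier samplers)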
>>
>>102982182
I doubt that nigger has ever trained anything in his life. Ever.
>>
>>102982464
Sloptuning requires only slightly more intelligence than it takes to turn the computer on, I doubt they're doing a lot of reading or learning regarding the subject
>>
>>102982409
>let's add a bunch of noise when we test bro
I have a feeling you barely know what samplers are except they're magic knobs that make things betterer. I hope you're not in charge of anything important
>>
File: ComfyUI_34410_.png (914 KB, 848x1024)
>>
>>102982466
You can use it locally for free, there was a rentry for it but I forgot the link.
Here is the one I had downloaded: https://litter.catbox.moe/jx56rv.html
>>
>>102982464
I didn't say that you were dumb. That's something that came from within your own mind, perhaps your own soul.
I said that if you can identify wrong settings but don't offer the correct ones then you're choosing to perpetuate a problem that you could solve by teaching the correct settings.
>>
>>102981878
>install package
>run web UI
>it just works, no fucking hoops to jump through
Finally, something that JUST WORKS
>>
>>102981680
>>102981301
>>102980970
My problem with XTTS2 is that it tends to make noise; I don't know how to avoid this.
>>
>>102982537
kek does that fully work? That's crazy. Where are the stories stored?
>>
>>102982730
>xtts
decent/fast/cloning is decent but unstable as it produces garbage output often
>f5
decent/fast/cloning is good but can't pronounce some words
>gpt-sovits
okay/fast but the cloned voice doesn't sound like the reference

>finetune
xtts: good/easy on an 8GB card
gpt-sovits: haven't tried it
f5: needs 20+GB of vram to finetune, haven't tried it
>>
is this the new imggen thread
>>
>>102982842
yeee
>>
>>102982835
Can't we fix Tortoise? Tortoise worked pretty well. Slow, but worked. (Except when it barfed garbage or crashed my computer. Never figured that one out. Seemed to be in Python's math packages. Maybe an internal race condition?)
>>
>>102982537
Here's the link:
https://rentry.org/offline-nc
>>
>>102982871
Give up. It's dead. I personally think the best is F5 right now; it just needs proper dataset training, which a finetune might be able to provide. And people who are curious about emotional voices for it (or any tts) can simply use a multi-style format, provided the speaker reference has multiple emotional references as well. So I don't think looking backwards towards tortoise is the answer.

I don't know the root cause of xtts's useless results; finding it might be another avenue for fixing it, but xtts has been out for a long time and there really hasn't been a fix for it.
>>
>>102978199
Even 1.1 is shit now. Used to work great for the size but now it can't even follow a simple scene without going out of character. I don't know if I should blame ST's recent formatting changes or Koboldcpp but shit sucks
>>
bros... i genuinely can't fathom how much of an improvement the xtc sampler is to my erp sessions, it feels so much more creative without becoming gibberish like high temp
this + a good banned words list and we might just have a fix for slop
>>
>>102978199
buy a fucking ad sao
>>
>>102982935
what settings?
>>
>>102982909
>I don't think looking backwards towards tortoise is the answer
I don't either, but it's the only thing I've gotten non-garbage results out of. Everything else found a way to make me feel retarded, fighting with Python venv fuckshit or pip shitting all over itself; and if it did work, it'd work for a moment, make bad output, then break, etc. Tortoise let me voice clone with what at least seemed to be reasonable GPU time and results. If it didn't crash my whole fucking system at random it'd be sufficient for my use cases.

>F5
12GB Vramlet, above says it needs 24GB, so that does me no good.
>>
>>102982946
by no means optimal, but what i use most of the time
temp : 1.25
min-p : 0.05
xtc-threshold : 0.1
xtc-probability : 0.5
dry-multiplier : 0.9
dry-base : 1.75
dry-length : 2

what this does, roughly: min-p 0.05 culls tokens below 5% of the top token's probability, and half the time xtc removes every token at or above the 0.1 threshold except the least likely of them, forcing the model off the most obvious continuations. you can increase xtc-probability up to even 1 and it works fine most of the time
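for anyone wondering what xtc actually does per step, a rough python sketch based on the sampler's published description (not any engine's actual code):

import random

def xtc(candidates, threshold=0.1, probability=0.5):
    # candidates: (token, prob) pairs sorted by prob, descending
    if random.random() >= probability:
        return candidates  # this step, leave the distribution alone
    above = [c for c in candidates if c[1] >= threshold]
    if len(above) < 2:
        return candidates  # fewer than two "obvious" picks, nothing to exclude
    # cut every above-threshold token except the least likely of them
    # (a real sampler renormalizes the surviving probabilities afterwards)
    return [above[-1]] + [c for c in candidates if c[1] < threshold]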
>>
have people stopped using exl2? I can only find q4 quants for Mistral-Small-Instruct-2409.
>>
>>102983030
aren't most people now using GGUF and offloading as much as they can into their vram?
>>
>>102979167
I know this is bait and posted 8 hours ago, but let's take a look at a real piece of literature, shall we? This is a popular one, the first book of the wheel of time. It's a pretty modern book. Lots of words, there's probably some slop in there, right?
>No mixtures of emotions
>No ministrations
>No shivers up or down spines
>No tails swishing
>No almost-whispering
And so, you are a fucking faggot retard
>>
>>102983044
that reminds me I need to generate more WoT smut
>Verification not required.
>>
Can you access the more advanced sovits inference parameters? like seed and sampling steps, or whatever.
>>
>>102983044
Now ctrl F for "tugging on her braid" and report back, champ
>>
>>102983030
Isn't exl2 only for VRAM chads? I'm a bit busy buying food and gasoline to find my stack of bit coins to buy that shit so my slop comes a little bit faster.

And if I were going to run a model small enough to fit my VRAM, it's already faster than I type a prompt so I don't have a use case. I guess somebody does since you found that one, but otherwise, I'm going to make do with what I have, which is GGUF and four sticks of system RAM.
>>
File: 00005-730888155.png (1.06 MB, 768x1080)
>>102953597
Finally, my new CPU has arrived, and I've managed to get everything up and running, except for Flash Attention 2, xformers, and fish-speech. With 512GB/s of bandwidth, the card's performance is comparable to that of a 3060 with 360GB/s.

With GRUB_CMDLINE_LINUX_DEFAULT="amdgpu.sched_policy=2", idle power consumption decreased from 42W to 6-7W.
>>
File: nagzul.jpg (160 KB, 1334x918)
https://vocaroo.com/1jXMXtD9RxoN
>>
One thing I appreciate about aya is that I am an esl and so far all the models were unusable in my language. At least for smut rp. Aya is like 95% perfect for this (still has some minor retard mistakes). I am genuinely curious to see what kind of slop turns of phrase I will find in my language. I think I already saw a few of the english classics but they somehow don't trigger revulsion in me cause I didn't see them that many times in my own language yet I guess.
>>
>>102983356
How smart is it really?
>>
Are there any Nemotron 70b system prompts or merges that get rid of it always trying to force bullet points or headlines into all its RP replies?
>>
>>102983136
RDNA2 can't do flash attention; only CDNA2+ and RDNA3+ have those capabilities. I don't know what fish-speech is, but it looks like it uses faster-whisper, which uses ctranslate2. That only got a HIP port a few months ago, https://github.com/arlo-phoenix/CTranslate2-rocm; it worked fine for me for whisperx.
>>
>>102983489
Telling it that it is {{char}} and will only respond in character.

But this finetune fixes it as well.
https://huggingface.co/Envoid/Llama-3.05-NT-Storybreaker-Ministral-70B?not-for-all-audiences=true
>>
>>102982875
nice Miku pic
>>
>>102982789
Your browser's localStorage afaict
>>
>Intellect-1 just lost 0.16 percent of progress
I wonder what error occurred this time. Good thing they implemented save states into this thing.
>>
>>102983616
God I hope they hurry up so someone can slop tune it so I can nala test it.
>>
>>102983616
WHO CARES
>>
does anyone have the link to the vtuber ai audio archive that was arounda a year or so ago?
>>
>>102983673
Me, or else I wouldn't have brought it up.
>>
>>102983616
I asked
I care
You must be watching that shit like a hawk, thank you for your service
>>
>>102983375
I tried it for a bit longer and... not very. It even called a dick a rooster. It is somewhere in between understanding the language and doing what other llms do: write in english and then directly translate words instead of sentences.
>>
>>102983030
https://hf.co/LoneStriker/Mistral-Small-Instruct-2409-8.0bpw-h8-exl2 (also 6.0bpw, 5.0bpw, 4.0bpw, 3.0bpw)

Oddly an 8.0bpw exl2 of Mistral Small will fit into 24 GB of VRAM with 16k context and no context quantization, but that's not true for any fine tunes of Mistral Small I've tried so far (so I've mostly switched to Q6_K / Q6_K_L GGUFs which I can fully run in RAM with 18k context). I wonder what exllamav2 is doing.

Also I think the sweet spot for Mistral Small is 18k or 19k of context; right above 19k is when it seems to start falling apart. There was a guy who used byroneverson/Mistral-Small-Instruct-2409-abliterated to extend 1000 synthetic ERP chats from 4k context to 20k context, which, if they aren't mad trash at the end, would indicate the ceiling is slightly higher, but as he hasn't made the generated data public, my inclination is to think they did turn to garbage and he didn't catch it.
>>
>>102983798
>8.0bpw
newfag trap
>>
>>102983839
vramlet quant cope
>>
I just got a 3090 the other day, what's the current meta?
>>
File: file.png (764 KB, 768x768)
>>
>>102983857
ONE 3090?
>>
>>102983857
>what's the current meta?
2MWU_ntilgood_30B.gguf
>>
File: file.png (107 KB, 1180x1152)
>>102982537
>>102982875
Am I retarded? I cannot figure out how to get this to work with ooba.
Mikupad works fine with the openai compatible API I've setup on ooba.
>>
File: card.jpg (40 KB, 1343x302)
>>102983874
This one?
>>
>>102983889
did you try adding v1 at the end
>>
>>102983858
Standing behind the pochiface on the escalator and pushing her forward so her teeth meet metal
>>
>>102983940
of course
>>
>>102983857
What I'm using:
ArliAI_Mistral-Small-22B-ArliAI-RPMax-v1.1-Q6_K.gguf
bartowski_magnum-v4-22b-Q6_K_L.gguf
bartowski_Pantheon-RP-Pure-1.6.2-22b-Small-Q6_K_L.gguf
>>
>>102982871
>he doesn't know
XTTS2 is tortoise with changes copied from the ick on eck faggot's tortoise fork
if you're not happy with XTTS2 then nothing will save tortoise
>>
>>102984061
I haven't tried the new stuff but if that's a fixed Tortoise I'll give it a try and see if it'll go and not bring down my computer. Thank you for the information.
>>
>>102983857
For a smarter option try this gemma 27B one. It can do complicated stuff you normally need 70B+ for

https://huggingface.co/anthracite-org/magnum-v4-27b
>>
>>102984285
What quant / how many layers offloaded / what speed on your 3090?
>>
>>102984420
Q4_K_M; even pushing it to 16k I get above 5 tokens/s, i.e. reading speed. With less context you can fully fit it, but I prefer more context.
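for reference, the equivalent llama.cpp launch is something like this (the layer count is a guess for a 27B at Q4_K_M on 24 GB, tune -ngl until it fits):

./llama-server -m magnum-v4-27b-Q4_K_M.gguf -ngl 40 -c 16384 -fa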
>>
>>102984285
I've been wanting to try a Gemma 27B because apparently all the finetunes are for 9B. Is it slopped and/or dumber than the original?
>>
>>102984537
No, which is why I'm recommending it. Gemma 27B has always seemed smarter to me than llama 70B (not qwen 2.5 though) but dry. This tune, though, fixes the dryness while keeping its smarts. And I find the 72B magnum retarded and too horny.
>>
>>102980694
>based chinks
hahahahahahahaha roflmao lol even
>>
>>102984575
The only model I would put above 27B in the smarts department WHEN IT COMES TO RP / CREATIVE FICTION is mistral large. But I can only manage that at 2bit and it's too slow for me. I liked nemotron's writing but again, too dumb for complicated stuff. I like medieval political intrigue in a fantasy world composed of multiple species. Even the 70/72Bs I've used fell apart with that.
>>
>>102983044
It's hard to prompt things away. Just like with people you can say "don't jump" and you get more jumping than if you had said nothing.
>>
>>102984496
>even pushing it to 16k
>"max_position_embeddings": 8192
So you're saying
>it can do complicated stuff you normally need 70B+ for
With RoPE scaling, at Q4_K_M, and despite being fine tuned using ChatML instead of Gemma 2's instruct format.
>>
>>102976869
>genmoai-smol allows video inference on 24 GB RAM: https://github.com/victorchall/genmoai-smol

getting weird errors when loading the actual checkpoint, like:

Unexpected key(s) in state_dict: "t5_y_embedder.to_kv.bias", "t5_y_embedder.to_kv.weight", "t5_y_embedder.to_out.bias", "t5_y_embedder.to_out.weight", "t5_y_embedder.to_q.bias", "t5_y_embedder.to_q.weight", "t5_yproj.bias", "t5_yproj.weight".

t5 itself loaded without errors though.

Will they ever fix it?!
>>
>>102984814
Yes, read the page. It's trained on a ton of 16k-context chatml stuff, based upon a chatml-"ified" version of the model by a company:
https://huggingface.co/IntervitensInc/gemma-2-27b-chatml
>>
>>102984814
>>102984904
Which in itself was trained on an extra 13 trillion tokens.
>>
>>102984904
>Its trained on a ton of 16k context chatml stuff
and it's not going to work on anything other than that stuff
>>
>>102984916
No it was not. That's a line duplicated from the README of https://hf.co/google/gemma-2-27b-it

>These models were trained on a dataset of text data that includes a wide variety of sources. The 27B model was trained with 13 trillion tokens and the 9B model was trained with 8 trillion tokens. Here are the key components:

>Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. Primarily English-language content.
>Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which improves its ability to generate code or understand code-related questions.
>Mathematics: Training on mathematical text helps the model learn logical reasoning, symbolic representation, and to address mathematical queries.

>The combination of these diverse data sources is crucial for training a powerful language model that can handle a wide variety of different tasks and text formats.
>>
>>102984904
>by a company
it's just one random dude
>>
>>102984953
It works for me. Literally just try it with some story or something. No catastrophic forgetting / going schizo.
>>
File: 1717737163584922.png (6 KB, 285x279)
The constant search for new and better models, swapping them in and out, then having to experiment which quantization is the best speed to quality ratio...
It's all so tiresome
This general really should have a collaborative website/wiki/pastebin/whatever with the current best models, separated into which are the best for RP, instruct, and so on
>>
>>102984987
SFW uses:
Mistral large / Qwen 2.5 / Deepseek 2.5

NSFW uses:
Mistral large, then nemotron / gemma 27B tunes depending on how complicated the scenarios are. Then some qwen2 / 3.1 tunes for smarts or then mistral small, then mistral nemo stuff for fun writing.
>>
>>102985009
>or then mistral small, then mistral nemo stuff for fun writing.
For fun but dumb I should specify.
And I have yet to find a non-nemotron 3.1 / any qwen2 finetune that made a model fun without making it retarded.
>>
>>102984987
>This general really should have a collaborative website/wiki/pastebin/whatever with the current best models, separated into which are the best for RP, instruct, and so on
I could try making a rentry that anyone can edit with some models to use as a base
We'd just need to put it in the OP and get the general to contribute
>>
>>102984987
I don't speak reddit.
>>
>>102985107
My experience with wikis and such has been that the thing is usually shouldered by one or two people.
So you should not start something like this with the expectation that other Anons will contribute relevant amounts.
And be especially wary of the fact that there are a lot more people saying that they would help vs. people that actually follow through.
>>
>>102985107
>I could try making a rentry that anyone can edit with some models to use as a base
I don't think making anything publicly editable is a good idea. It would get vandalized for sure.
With wikis at least, you can configure approvals for edits and such.
Regardless, any such index would need a steward.
>>
>>102984675
Yeah, which is why it's a training issue. Unfortunately, it's up to the people who make the models to remove the bad data, and they aren't doing that.
>>
>>102985167
>>102985174
It's even worse with generative AI because this shit moves so quickly. Someone might be diligent about it for a bit and then stop caring after the Nth time something becomes obsolete
>>102985197
It's more fundamental than that.
>>
File: 1700056589360979.jpg (267 KB, 1024x1024)
ITT retards who can't code complain about the state of a cutting-edge field.
>>
>>102985213
Show us your cutting edge models then :)
>>
>>102985213
Anon I maintain a (small) python library + repo for all the LLM pipelines at my company. And I cannot be fucked to update the internal wiki anymore, there's too many changes and rewrites.
>>
File: 1714562027764987.png (1.03 MB, 804x516)
>>102985197
>they aren't doing that.
They do.
>>
>>102985174
Yeah, basic anyone-can-edit approach would go really wrong now that I think about it
One way to get around it would be to contribute by copying the whole thing, making a new rentry with the updates, then posting it near the end of a thread for OP to put in; still anyone-can-edit, and safe from vandalism that way
I was considering 2 links, one read-only and one anyone-can-edit, with the read-only one regenerated from the editable one after each contribution, but that could be easily vandalized by removing the link to the backup
Does the copy method sound like a good idea?
>>
t. retard here:
Which benchmark, or rather which benchmark stat, is the best indicator of whether a model can write a research essay well? Like, I give it material and see how well it structures and works out the task
>>
What's a model / lora that's going to naturally on it's own lean towards romantic responses the way CAI used to be until recently?
>>
>>102985286
That's exactly what I was going to suggest, but I crashed my computer and I got stuck on the 15 min timer again.
It's still not immune to vandalism, in that anybody can make a fucked OP (see blacked anon), but that's probably the best option, and is the standard for generals as far as I'm aware.
>>
>>102985497
Pretty much any. Some are just more horny than others / which is often tied to how smart / dumb it is. Scroll up
>>102985009
>>102985047
>>
File: 1707229294776113.png (1.48 MB, 709x905)
>>102984987
>>102985009
>>102985107
>>102985167
>>102985174
>>102985205
>>102985536
OK LISTEN UP
I MADE THE RENTRY:
https://rentry.co/nqinipvg
https://rentry.co/nqinipvg
https://rentry.co/nqinipvg
It includes the instructions on how to contribute, and a table of models (it's not the best right now since I'm a retard, but it's a start)
The way this works is if you want to make a contribution, you copy the whole thing (with markdown), make a new rentry with your edits, and post at the end of the thread for OP to put in the new thread
So now all that's left is for OP to see this, and include it in the next thread to get the ball rolling
(You) WILL help with this
>>
>>102985629
>Una-TheBeagle-7B-v1
Jesus newfag retard kill yourself.
>>
>>102985629
>Looks at list
Already garbage list. No thx.
>>
>>102985629
Good on you for starting.
I nominate rocinante for the 12B slot. v1.1.
>>
>>102985629
>>102985647
Also, 11B is probably redundant.
Hell, anything under 12b might be redundant. You are probably better off running nemo at q4km than a llama 3.x or gemma2 9b. Even more so when you consider that nemo is supposed to be more resistant to quantization to begin with.
>>
>>102985636
>>102985639
Yes it's pretty bad right now; it's a rip of this list that used to be in the OP but has since been removed and hasn't been updated in a while: https://wikia.schneedc.com/llm/llm-models
If you know better models, just edit them in as instructed!
>>
arcanum's the 12b model i always end up going back to, some slopmerge between rocinante 1.1 and nemomix unleashed.
it's really good.
>>
>>102985629
>Goliath
>>
>>102985629
What the fuck is this? Literally none of those suggestions are good.

>35B: c4ai-command-r-v01
This could have been a good recommendation but it comes with a huge asterisk. It doesn't have GQA, which balloons the memory requirements, defeating the purpose of using a 35B model unless you're content with tiny-penis context size. I won't go so far as to call this an awful suggestion but it's far from being a generally useful one.
>>
File: y1shyiwnl2zc932.gif (1.87 MB, 240x228)
>>102985629
>/lmg/ Official Best Models To Use Guide
>mixtral-8x7b-instruct-v0.1-limarp-zloss.Q5_K_M absolutely nowhere to be found
ngmi
>>
>>102985629
>>102985647
it's really as shrimple as that
12B column updated: https://rentry.co/awnic2ai
>>
>>102985772
Yeah. Anon just took some old shit from the OP and made a template. The point is to provide suggestions to make a proper list to put in future OPs.
>>
Cba to edit it myself.
Someone else put this shit in it.
>>102985009
>>102985047
>>
>>102985776
It's far past time to move on from that ancient model, old timer.
>>
Rule 1 of the internet:
Write something so fucking stupid that people jump out to correct you.
>Verification not required.
>>
>>102985804
models don't age, their weights are the same as they were the day they released
and mixtral has yet to be surpassed by any other model you can run on consumer hardware
>>
File: 24356543676434.jpg (41 KB, 480x360)
41 KB
41 KB JPG
>>102985804
I WOULD but nothing NEW is better.

And mixtral is so fucking good, any slop or pozzing is actually a prompt/skill issue.
>>
>>102985629
That's one of the worst lists I have seen.
>>
>>102985776
With a 3090 I can run Mixtral 8x7B finetunes at Q6_K with 32k context, offloading 18 layers, at around 5.5 tokens per second.
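If anyone wants to reproduce, the launch line looks something like this with llama.cpp (the filename is whatever quant you grabbed, -ngl 18 is the offload count from above):
[code]
./llama-server -m mixtral-8x7b-limarp-zloss.Q6_K.gguf -c 32768 -ngl 18
[/code]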
>>
>>102985776
>limarp-zloss.
Best mixtral.
In my own usage, Nemo seems to be about as good while being a lot smaller and faster to run on my 8gb vram setup.
Or at least I think it's faster. I remember mixtral taking a while to run last time I tried it.
Maybe I should download it again.
>>
File: file.png (85 KB, 549x335)
85 KB
85 KB PNG
>last post June 2023
>>
Hi imagefags. Can someone generate a hot negress for me? Thanks.
>>
>>102985871
oh, wrong thread. sorry
>>
File: MonoMikuWut.png (851 KB, 896x1152)
851 KB
851 KB PNG
>>102985871
>>102985880
>>
>>102985870
Running a few scripts and raking in the cash. The easiest deal in the world.
>>
>>102985797
Since it's just markdown, editing is as simple as swapping the model names and links in the table.
I would edit them in myself, but I don't know which specific tunes and sizes on huggingface you mean for most of them.
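For reference, a row in the table is just one line of markdown, e.g. (rocinante as a stand-in; check the exact repo name before committing it):
[code]
| 12B | [Rocinante-12B-v1.1](https://huggingface.co/TheDrummer/Rocinante-12B-v1.1) |
[/code]
Swap the display name and the link, paste the whole thing into a new rentry, done.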
>>
File: 342164376523.png (44 KB, 452x583)
44 KB
44 KB PNG
>>102985859
>>102985869
Luv' me mixtral
Luv' me 16k+ context (just werks)
Luv' me limarp zloss
'Ate L3
'Ate gemma
'Ate nemo
(not poor just dont likem)

simple as.
>>
lotta rock dwellers in this thread
>>
On the UGI leaderboard 8x7B models score significantly lower than Mistral Small models. Am I being memed here?
>>
File: local-llm-experience.png (674 KB, 1792x1024)
674 KB
674 KB PNG
Can I just say, fuck the new claude. It is such a fucking piece of shit.
I cancelled my sub, and then it dropped the next day.
I've been using it since and I literally think it's gotten fucking worse.
Like holy fuck. You ask for Gradio code, it gives you React code.
You ask for a refactor of existing code, it turns a functional style into an OO style.
I swear to god I will buy however many 3090s I need to run a competent code assistant. FUCK

Where's the other version of this meme, with the openai fucking up...
>>
File: chad.png (317 KB, 547x596)
317 KB
317 KB PNG
>>102985974
Because mememarks don't actually translate to which model can keep your dick hard and keep a story and a conversation going at the same time.
Limarp Zloss is a product of slopmerging, but it actually worked, and no one has managed to do the same, at least not at that quality.

It's sloppy, it's got the spine shivering, it's got the out-of-place nipple play, it's got boundaries to cross and a keen sense of not interacting sexually with minors.
But you can also weed all of this out, and you're left with a model that, honest to god, out of all the shit models that exist, is one of the BEST AI models.
It will act out your /ss/ dommy mommy molestation sessions like it was a real female pedophile. It will pass the Nala test with flying colors. It can even fucking handle group chats and multiple characters.

If it did have a dick, yes I WOULD be sucking it.
>>
>>102986080
>Claude
>OpenAI
You tried.
>>
File: 1712118687081629.gif (154 KB, 640x480)
154 KB
154 KB GIF
>>102986104
>>
>>102986080
React is safer and more aligned.
You're welcome.
>>
>>102986081
But the UGI benchmark should be pretty indicative of actual NSFW capability, shouldn't it? A model that's able to do furry ERP better should also perform better at the UGI benchmark. I mean, if their scores were pretty close then I could see where you're coming from and it would make sense, but it's not close at all. This would imply that Mistral Small has been trained on more unsafe content than 8x7B.
>>
>>102986081
It's too retarded to write quadrupeds as quadrupeds even when instructed to, which makes it too dumb for me.
>>
File: 1729426699627152.jpg (84 KB, 680x680)
84 KB
84 KB JPG
>>102986195
Thanks anon, let me just rewrite my PoC in react. Because my love for gradio was totally the reason I chose gradio instead of NextJS.
With your advice, I think I might be able to get this PoC pushed out next year!
Fuck Gradio and Fuck Anthropic.
>>
File: 1729989659642.jpg (64 KB, 552x556)
64 KB
64 KB JPG
>>102986081
I haven't tried many older models with newer samplers or phrase banning, but I really doubt the prose or intelligence of a ~46B MoE model from almost a year ago is even comparable to newer models like Mistral Small and Nemo, and especially not Large.
>>
>>102986080
The AI experience is getting worse and worse regardless of whether you're using cloud or local. Welcome to the future.
>>
>>102986503
Don't you feel safe?
>>
File: 1456457653456243.png (33 KB, 720x540)
33 KB
33 KB PNG
>>102986249
UGI bench is ass and isn't correlated with what people are actually running. Nobody is running 70Bs, 123Bs, or 405Bs unless you're rich.
Using this logic and your own score bench;
Lol
Lmao
an 8x7B comes out on top.
>>
>>102986503
Bitnet should save us.
>>
File: jaggies.png (209 KB, 1710x679)
209 KB
209 KB PNG
Hate being annoying and asking this, but is there an AI, hopefully a web hosted one, that can clean up jaggies like this? From a digital cartoon image like pic related, where someone tried to remove the background
>>
>>102986623
You can probably use an image upscaler like waifu2x or whatever.
>>
>>102986623
expand the selection (from the outside) by 1 pixel, then white to alpha
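if you'd rather script it than click through GIMP, a quick PIL sketch of the same idea (the 0.85 threshold is a guess, tune it per image):
[code]
from PIL import Image

img = Image.open("cartoon.png").convert("RGBA")
px = img.load()
w, h = img.size
for y in range(h):
    for x in range(w):
        r, g, b, a = px[x, y]
        whiteness = min(r, g, b) / 255  # how close this pixel is to pure white
        if whiteness > 0.85:            # only near-white pixels (the jaggy halo) get touched
            px[x, y] = (r, g, b, int(a * (1 - whiteness)))
img.save("cleaned.png")
[/code]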
>>
>>102986621
the bitnet is a lie
>>
>>102985839
I can't believe people are still shilling mixtral, it was terrible even when it came out.
>>
>>102985647
1.1 is the best? How come?
>>
>>102986926
Stop beating around the bush and just post your sloptune.
>>
That one post is literally him telling us that it's bait and was never serious. It's over folks.
>>
>>102986935
Dunno, I compared 1.1 to the latest one and 1.1 was just overall better.
The latest one looked better at face value when I was putting it through my usual testing card, but then I tried some other cards that were pure roleplay and it was categorically worse, as in it made mistakes 1.1 never made.
The latest one (I forget the name) did have a really nice cadence to the prose. No "she she she she, char char char char" etc, so that was nice. Really nice, even.
>>
Also AXCXEPT/EZO-Qwen2.5-72B-Instruct is nice.
>>
File: 102.png (75 KB, 256x256)
75 KB
75 KB PNG
>>102981301

GPT-SoVITS v2 gave a better and wider range of outputs than F5 for me. You just need to use the reference you want, i.e. Anger, Excited, Normal, Tired, etc.

Outputs:

Normal Reference:
"The food isn't that good here. Let's not go here next time."
ここの料理はあまり美味しくないね。次回はここに行かないようにしよう。
https://vocaroo.com/1mKoMlkXPYLT

Angry Reference:
"O flames that shake the earth, gather in my hands. The power of destruction that swallows everything, be unleashed here and now. Explosion!"
大地を揺るがす炎よ、我が手に集え。すべてを飲み込む破壊の力、今ここに解き放つ。エェェェエエクスプロォォォオオジォォォオンンン!!!
>Volume Warning
https://vocaroo.com/1nfcHP4rwJjt

Excited Reference:
"Look! There are so many cool things over there!"
見て!あそこにすごいものがたくさんあるよ!
https://vocaroo.com/1jP7rMuBnNNt

Tired/Satisfied Reference:
"Haa... Kazuma, I already came 5 times. Please stop."
はぁ...カズマ、もう5回行きます。おやめください。
https://vocaroo.com/1sbzK4jnDo1o
>>
What are the best local models for coding so far?
Qwen2.5-Coder-7B-Instruct-GGUF is pretty good but it still has some retarded moments for sure.
>>
>>102987320
https://huggingface.co/AXCXEPT/EZO-Qwen2.5-72B-Instruct
And mistral large. Also deepseek 2.5
>>
File: 1727935227032428.png (1.03 MB, 899x1200)
1.03 MB
1.03 MB PNG
End of thread soon, this is the current model guide rentry for OP to hopefully put in the next thread with a couple words encouraging people to contribute:
https://rentry.co/awnic2ai
https://rentry.co/awnic2ai
https://rentry.co/awnic2ai
So far there's only been one edit and the list is more of a placeholder than anything, but with time people will make it actually good
>>
>>102987371
Do not use this. The suggestions are hopelessly retarded. Nobody here is competent enough to know good models and simultaneously lame enough to sit around babysitting the rentry
>>
obviously this is asking for much, but what's the best RP model that can do more than just echo what you write with synonym embellishment?
>>
>>102987459
As I said, the list is a placeholder, and since anyone can edit it by making a new one, it will be usable in a few threads or so.
There have been many great model suggestions this thread; all an anon needs to do is copy and paste the thing and swap some model names and huggingface links.
>>
>>102987320
>7B
Ouch.

>>102987320
>best local models for coding so far
I've been turning to a small cluster of Llama 3 tunes, with L3.1-Nemotron-70B setting a new standard: it handled one of my Java tests well enough to eagerly point out and deal with the issue that most L3s get wrong on the first try and only fix, in one of two ways, after having the error fed back into them.

However, I'm too VRAMlet and RAMlet to run big models on a quant that isn't lobotomized so I can't speak for Mist Large or that fat Deepseek Coder from earlier this year.
>>
>>102987498
>L3
Why not Qwen?
>>
>>102987529
Qwen was okay for simple Python but I have it behind six L3 tunes in my non-rigorous testing.
>>
>so many good models that I'm getting choice paralysis, constantly switching between them because I don't want to miss out on each one's response to a specific prompt
I guess this is what local winning looks like, but it's actually getting annoying
>>
>>102987608
This. I spend more time switching models / context / instruct formats than actually using models these days lol. Really into this Qwen2.5 finetune now though.
>>
>>102987608

You just min-maxed the fun out of everything. What gun should I use for this distance/target? What sword does the best elemental damage against this enemy? Etc, etc.
>>
>>102987645
That's a problem the rentry anon is trying to solve, really.
To find generally agreed upon good models and just use those for that parameter size.
>>
>>102987687
nta. The problem is that it gets outdated, just like all the guides in the OP and the few dozen other guides that came and went.
Normally, the most you have to do is roughly scan the previous thread to see what models anons are talking about. If you're even lazier, just check the news and download whatever comes up. There's always a retard asking "wat coom 16gb?".
>>
>>102987717
The "anyone can edit" instructions in it should prevent it getting outdated if some anons are willing to put some good model names and huggingface links there. We'll have to see.
>>
>>102987723
give her armpit hair
>>
>>102987260
Glad to see you enjoy your new AI toy meguanon. Is the tone consistent over multiple samples with the same emotion?
>>
>>102987371
Here you go retard https://rentry.co/piy864dr
>>
>>102987776
Yeah. Nothing ever goes wrong with free edits. Best of luck though. Just as I wished the previous attempts.
>>
Anyone try out GLM4 Voice?
>>
>>102987819
The barrier of defense is that it requires OP's / the general's approval and can't be deleted.
I find it better than the same 3 wiki discord users who will eventually ditch it.
>>102987814
Thanks for contributing! This is the one that should be put in the next OP.
>>
Does openwebui require an account? I remember seeing that and dropping it instantly without checking if it was mandatory.
>>
>>102987959
>>102987959
>>102987959
>>
>>102987960
why so early
>>
>>102987966
Maybe he wanted to sleep, so he got it out of the oven early.
>>
>>102986703
Thank you anon
>>
>>102987969
nah he added a new link with a bunch of meme models, compromised op
>>
>>102987990
Trojan horse bread, oh no.


