/g/ - Technology


Thread archived.
You cannot reply anymore.




/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102928840 & >>102915436

►News
>(10/22) Mochi-1 runnable with 24GB VRAM: https://github.com/victorchall/genmoai-smol
>(10/22) Mochi-1: 10B Asymmetric Diffusion Transformer text-to-video model: https://hf.co/genmo/mochi-1-preview
>(10/22) Pangea: Open-source multilingual multimodal LLM supporting 39 languages: https://neulab.github.io/Pangea
>(10/21) IBM releases Granite 3.0: https://hf.co/collections/ibm-granite/granite-30-models-66fdb59bbb54785c3512114f
>(10/18) New research, models, and datasets from Meta FAIR: https://ai.meta.com/blog/fair-news-segment-anything-2-1-meta-spirit-lm-layer-skip-salsa-lingua

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling
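For a rough number without the web calculator, the arithmetic such tools automate boils down to quantized weight size plus KV cache. A back-of-the-envelope sketch (it ignores compute buffers and runtime overhead, and the example model shape below is invented):

```python
def estimate_vram_gb(params_b, bpw, n_layers, ctx, n_kv_heads, head_dim,
                     kv_bytes_per_elem=2):
    """Very rough GGUF memory floor: quantized weights + fp16 KV cache.

    params_b: parameter count in billions; bpw: quant bits per weight.
    KV cache = 2 (K and V) * layers * context * kv_heads * head_dim * bytes.
    """
    weight_bytes = params_b * 1e9 * bpw / 8
    kv_bytes = 2 * n_layers * ctx * n_kv_heads * head_dim * kv_bytes_per_elem
    return (weight_bytes + kv_bytes) / 1024**3

# Hypothetical 8B model, Q4-ish quant, 8k context, grouped-query attention:
# estimate_vram_gb(8, 4.5, 32, 8192, 8, 128) -> roughly 5.2 GB
```

Treat the result as a floor, not an exact figure; the linked calculator accounts for more of the runtime overhead.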

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
►Recent Highlights from the Previous Thread: >>102928840

--Papers:
>102937197 >102937238
--TerDiT paper shows potential for efficient deployment of low-bit diffusion transformer models:
>102929813 >102929898 >102929911 >102929914 >102929929 >102929990 >102930004 >102930054 >102930049 >102930261
--Kuroki Tomoko GPT-SoVITS TTS finetune discussion and installation:
>102931174 >102931209 >102931229 >102931220 >102932122 >102931228 >102931276 >102931243 >102932067 >102932499
--Help with using sovits for text-to-speech conversion:
>102932427 >102932477 >102933102 >102933167 >102933177 >102933344
--genmoai-smol allows video inference on 24 GB VRAM, with discussions on frame limits and FPS:
>102934099 >102934221 >102934504 >102934241 >102935105 >102934372 >102934399 >102934727 >102934616 >102934642
--Mochi live action Miku creation experience and Genmo demo site:
>102932658 >102932699 >102932716 >102932890 >102932994
--Discussion of a controversial AI model output and its capabilities:
>102929086 >102929120 >102929191 >102929212 >102929262 >102929304 >102929335 >102929361 >102929395
--Users discuss plans for developing image and video models:
>102929029 >102929104 >102930960 >102931036
--RPG Maker MV used to create LLM front-end with llama 3.2 3B:
>102932959 >102932980 >102933049 >102933181
--Interpolation models could make low-fps video usable:
>102934938 >102934970 >102935096
--Improving voice synthesis by splicing clips and maintaining consistent tone:
>102934998 >102935087 >102935225 >102935259 >102935307 >102935297 >102935894 >102936170 >102936262
--INTELLECT-1 progress and model initialization discussion:
>102929735 >102929748 >102929817 >102930311 >102929888
--Miku (free space):
>102929119 >102929251 >102929262 >102929622 >102930513 >102930552 >102930867 >102931082 >102932822 >102935780 >102936155 >102937303 >102937332

►Recent Highlight Posts from the Previous Thread: >>102928845

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
File: Untitled.png (1.19 MB, 1080x3070)
LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging
https://arxiv.org/abs/2410.17146
>Large pre-trained models exhibit impressive zero-shot performance across diverse tasks, but fine-tuning often leads to catastrophic forgetting, where improvements on a target domain degrade generalization on other tasks. To address this challenge, we introduce LiNeS, Layer-increasing Network Scaling, a post-training editing technique designed to preserve pre-trained generalization while enhancing fine-tuned task performance. LiNeS scales parameter updates linearly based on their layer depth within the network, maintaining shallow layers close to their pre-trained values to preserve general features while allowing deeper layers to retain task-specific representations. We further extend this approach to multi-task model merging scenarios, where layer-wise scaling of merged parameters reduces negative task interference. LiNeS demonstrates significant improvements in both single-task and multi-task settings across various benchmarks in vision and natural language processing. It mitigates forgetting, enhances out-of-distribution generalization, integrates seamlessly with existing multi-task model merging baselines improving their performance across benchmarks and model sizes, and can boost generalization when merging LLM policies aligned with different rewards via RLHF. Importantly, our method is simple to implement and complementary to many existing techniques.
https://github.com/wang-kee/LiNeS
for the model mergers
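The core operation is simple enough to sketch: interpolate each layer between its pre-trained and fine-tuned weights, with the coefficient ramping linearly from shallow to deep. A minimal illustration (the function name and list-of-layers layout are mine, not the repo's actual API):

```python
# Sketch of LiNeS-style post-training layer scaling.
# Assumptions: layers are ordered shallow -> deep, and weights support
# +, -, and scalar * (plain floats here; tensors in a real implementation).

def lines_scale(pretrained, finetuned, alpha=0.0, beta=1.0):
    """Blend fine-tuned layers back toward pre-trained ones by depth.

    Shallow layers get coefficient ~alpha (stay near pre-trained values,
    preserving general features); deep layers get ~beta (keep the
    task-specific updates).
    """
    n = len(pretrained)
    merged = []
    for depth, (w_pre, w_ft) in enumerate(zip(pretrained, finetuned)):
        lam = alpha + (beta - alpha) * depth / max(n - 1, 1)
        merged.append(w_pre + lam * (w_ft - w_pre))
    return merged
```

For multi-task merging the same depth ramp would be applied to the merged task vectors instead of a single fine-tune's delta.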
>>
Local bros, how does the future of local models look now? It feels like nothing interesting has been released in a year.
>>
>want bot to be doting and encouraging
>it always starts talking like a MILF, saying "sweetie" and shit
please save me from this nightmare
>>
>>102937502
Mistral Large was released just three months ago
>>
>>102937502
wdym? we had tons of censored corposhit, sloppy tunes, useless 7b shovelware, and multimemes
>>
>>102937502
>April Command-R still the best local model
The future looks grim
>>
>>102937502
A year ago we had fuck all. We didn't even have Mixtral. Are you insane?
>>
so if chub is nuking itself and char-archive is dead, where do we go for cards?
>>
>>102937846
we share them here through catbox like digital trading cards
>>
>>102937846
>chub is nuking itself

oh no what did I miss?
>>
>>102937846
In the sea of shit that chub is, I've only seen maybe 5 well written cards. I've taken those as inspiration when writing my own.
Also there isn't really a 'good' way to write one. As long as you keep your language slopless and consistent, the result should be good.
A 200 token "mommy suck me peepee" card will be pretty generic, since the model just pulls whatever the statistical average for it would be.
>>
Is it worth going from 2->4 3090s?
Currently use exl2 4.5 bpw 70Bs, or Largestral at 2.75 bpw for code assist only
>>
>>102937889
deleting all copyrighted characters apparently, despite clearly falling under parody and thus fair use
>>
>>102937920
wtf? they host toddlercon rape but parody chars are not good?
>>
>>102937920
Welp. It's stupid, but I only liked OC characters there anyway.
>>
>>102937902
Please post example of well written card.
>>
Is sparsity a meme?
>>
>>102937920
>>102937936
You got c.ai mixed up with chub lmao
>>
>>102937980
Aren't they the same? I thought chub was the character database and c.ai was the chat service for those characters, from the same devs.
>>
>>102937777
It was better when we had nothing. People appreciated things more back then.
>>
File: file.png (67 KB, 640x476)
>>102938002
>chub was the character database and c.ai is the chat service for those characters from same devs
>>
>>102937963
https://files.catbox.moe/eq0e52.png
I took this one as a template. Of course this was made with big cloud models in mind, but I feel like big local models are good enough now to follow something like this.
But I don't take it verbatim; this one is like 3000 tokens. Mine usually come out at about ~1500 tokens, because I feel like being too detailed just wastes context on stuff that probably won't ever come up in convo.
>>
>>102938002
no, not even close
chub.ai has its own chat generator service, and is currently gearing up for a sale to investors, which is why they're purging copyrighted characters since they can actually be sued for those; safe to assume anything to do with underage, incest, rape etc will be next to go
>>
>>102938056
That website's reputation is so bad it won't even sell for shit, when even janitor.ai (way bigger) can't find any investor lol
>>
>>102938054
Yeah, I've found 1.5k~3k to be the sweet spot for something like Mistral Large. It usually has no issues fully understanding a well-written card around that size and there's still plenty of space for actual RP.
>>
>>102937909
Depends, are you happy with it as a code assistant at that quant? Mistral-Large will still be your best bet at 96GB VRAM. You could try the API version for a bit to see if a more complete Large offers something that 2.75bpw doesn't for you.
>>
>>102937411
Never cook again
>>
>>102937502
Yeah miqu still wins. Only cool shit this year was flux and local video.
>>
I got a real question. Say you're a ramlet, what are you running? mistral nemo or mixtral?
>>
what's the best sloptune/model for cooming under 12b or so?
>>
>>102938581
my posting habits haven't changed
maybe stop hopping ips every five minutes?
imo the timer should be an hour, maybe two. that would stop even more spam.
>>
>>102938581
This. I wish they would at least roll back this obnoxious captcha
>>
>>102938581
I want to make an altchan that is moderated entirely by a custom finetuned LLM. But GPU hosting is expensive.
>>
>>102938591
How does that boot taste?
>>
>>102938531
Nemo.
It's about as good as mixtral while taking a lot less memory.
t. 64gb ram 8gb vram.
>>
>>102938616
>not abusing google colab
>>
>>102938581
>Fucking nigger jannies with their nigger 15 minute timer bullshit.
what's this? I didn't get any of that. Can someone do a tl;dr of their new shit to """prevent bots""" from posting there?
>>
>>102938531
tinyllama
>>
File: llm-elections.png (387 KB, 588x922)
Who will release their models first after 5 November (US elections)?
>https://poal.me/yq3vpc
Vote closes in 14 days (2 weeks)!
>>
>>102938676
google colab is K80s. That shit wouldn't be fast enough.
>>
>>102938676
>google colab

>>102938787
Use Kaggle.
>>
>>102938701
It will be the anthrax team. And only the anthrax team. And when they do release it, everyone will accept that LLMs and /lmg/ are dead.
>>
>>102939141
>local... le dead!
*increases your repetition penalty*
>>
I recently bought a clothes dryer. And it has different programs for different types of stuff. I think half of those programs are "AI drying". Why are marketing people allowed to live?
>>
>>102939184
non-remote representations have perished
>>
>>102939197
>he bought a "smart" dryer
You are part of the problem. Stop buying shit you don't like.
>>
File: 1729689433800.jpg (244 KB, 680x791)
lol
>>
>>102939247
Show me a "dumb" dryer.
>>
>>102939264
Tbh they're right. Deleting/privating the bot is fine, but destroying the actual chats on deletion is nuts.
>>
>>102939264
I got burned during the first waves of lobotomy and filtration, it's never cloud for me since then. They are only creating more people with distrust in cloud with this move.
>>
>>102939264
>These aren't just word on a screen to us - some bots are comforts for us, we world build extensively, and a lot of the time we exclusively roleplay with a bot for months or years.
At least copy the most hilarious part. Also, I'm wondering what they do with the data they collect from people using this. It's like a goldmine of organic training data for people like us. And what could they do with it?
>>
>>102939264
Did c.ai finally go bankrupt
>>
>>102939295
imo they proved they didn't respect their users when they blocked edits in popular bots; whoever still had any respect for them was a newfag or a buttlicker.
>>
>>102939271
Google>clothes dryer amazon>
https://www.amazon.com/Portable-Stainless-Function-Suitable-Apartments/dp/B0C6LZ4B1B/140-4839339-6192020
>>
>>102939141
for cloud shit to beat local llms, they need a better privacy policy/acceptable usage policy, not better models
https://www.anthropic.com/legal/aup
>do not...
>Depict or request sexual intercourse or sex acts
>Generate content related to sexual fetishes or fantasies
>Engage in erotic chats
>Generate violent or gory content that is inspired by real acts of violence
>Promote, trivialize, or depict graphic violence or gratuitous gore

>Anthropic’s Trust and Safety Team will implement detections and monitoring to enforce our Usage Policies so please review these policies carefully before using our products. If we learn that you have violated our Usage Policy, we may throttle, suspend, or terminate your access to our products and services.
>>
>Hey there, we adhere to the DMCA requirements and take swift action to remove reported third-party Characters that violate copyright law or our policies. We’ve removed a group of Characters that have been flagged as violative, and these will be added to our custom blocklists moving forward.
Based!
>>
>>102939313
They actually got bought by Google some time ago.
>>
>>102939318
True. Devil’s advocate says that should have shown they care about not destroying things people are using. But apparently not.
>>
Imagine. It's not only the cloudcucks that got fucked over, but women (not trannies). Actual women are now out there crying their eyes out because their Game of Thrones hunk is gone. I have such a justice hardon right now.
>>
>>102939264
On the one hand this is a shitty corpo move but on the other hand it's not like this is unexpected.
>>
>>102939388
I still empathize with them; CAI burned me pretty bad back in the day.
People open their hearts to those bots, not realizing it's a cruel data-harvesting scheme.
Cloud is unironically dangerous to mental health. I wouldn't be surprised if some people have already killed themselves because of the decisions made by the owners.
>>
>>102939313
With the amount of traffic and investors they have, you can only dream.
>>
>>102939520
Yeah, but people kill themselves over every little thing these days.
>>
>>102939388
People hack and leak stuff for less, I'm surprised no AI cloud company got hacked and had their models leaked yet.
>>
>>102939520
I think I'll run my own c.ai then. Seems like a profitable business nowadays
>>
File: file.png (248 KB, 2278x1342)
https://aider.chat/docs/leaderboards/
Sam Altman on suicide watch. What's AnthropicAI's secret sauce? Claude 3.5 was already the GOAT and they made it even better.
>>
>>102939520
>wouldn't be surprised if some people already killed themselves because of the decisions made by the owners.
it definitely happened. Those retards who talk about safety and shit because the model can say "nigger" aren't looking in the right direction; the dangerous part of cloud AI is that you can literally remove someone's only joy, and no one bats a fucking eye. I find that fucked up if you ask me
>>
>>102939554
Nouveau愛.
>>
File: oof.png (240 KB, 1091x858)
>>
>>102939554
>I'm surprised no AI cloud company got hacked and had their models leaked yet.
we got Miqu though
>>
>>102939388
>Actual women are now somewhere out there crying their eyes out cause their game of thrones hunk is gone.
I'm pretty sure women also spend their time with AI husbandos; they're the ones who write and read a shit ton of romantic books and shit.
>>
>>102939511
It'll set some babbies on the right track of backing up things they care about.
>>
>>102939594
Miqu 2 when?
>>
>>102939627
Mibn.
>>
>>102939593
>cloud models are so good they can make you kill yourself
localbros...
>>
>>102939593
How did that message make him do it? Makes no sense
>>
>>102939617
This. I backed up my cai chatlog every day in case of this happening.
>>
>>102939593
These faggots would have killed themselves over anything really.
>>
Localbros... https://twitter.com/YifanBTH/status/1849074418930356309
>>
File: harambe comparison.png (538 KB, 1838x1009)
Alright here are the results of my first attempt at reverse-distillation of Ministrals RP strengths into a bigger, smarter, model.
>>
File: file.png (187 KB, 705x732)
>>102939894
wew lads, jew altman btfo'd
>>
>>102939922
>western lowland gorilla
KEK
>>
>>102939922
If I use non-deterministic sampling it seems to forget what an eos token is.
>>
File: file.png (47 KB, 954x342)
>>102939593
>https://archive.is/3BMXI
>He put down his phone, picked up his stepfather's .45 caliber handgun and pulled the trigger
Goddamn matmuls.
>>
>>102940181
When I was a kid screeching moral busybodies were trying to get saturday morning cartoons banned because allegedly some kid was hit by a car while sitting in the middle of the street peering down a storm drain (presumably trying to find the ninja turtles).
These people should be exiled from society.
>>
Is anyone tinkering around with multimodal models? I'm specifically interested in image+text input models. So far I've only used Llama 3.2 11B in clean-ui, and it seems to have a lot of potential. I'd like to run L3 Ultra-instruct 8B but don't know how to set up the vision capabilities. As far as I know, ooba supports only a pretty limited number of multimodal models with the multimodal plugin. Are there any backend-frontend combinations that you can recommend for this?
>>
>>102940221
Yep retards should get culled naturally
>>
Alright has anyone ever encountered this before?
Using Ooba + an old pull of SillyTavern
>testing a card to find nice temperature range for model
>reply suddenly becomes deterministic
>no amount of adjusting things on either end fixes it.
>try newer version of SillyTavern that is also installed
>get a different response, finally.
>reply is always the same.
Something's obviously not getting updated at one end. Has anyone encountered this issue before?
>>
File: StunnedAngryKanjiMiku.png (1.61 MB, 832x1216)
Good morning /lmg/
>>
>>102940500
Good morning Intense Miku
>>
>>102940418
it could be the order of the samplers
>>
I really like the writing style of the new sonnet 3.5
The assistant slop producers like Meta should take notice. No more "Certainly!" etc.
I actually wouldn't mind talking to the default assistant like that locally.
It's much more natural sounding.
>>
>>102940535
The old one in comparison.
It would be crazy if in a couple months local models were more slopped than the closed ones.
>>
>>102940551
Was cut off at the end.
>If you have any other obscure series you'd like to discuss or test my knowledge on, feel free to bring them up!
I hate this too. They all do it. Great to see that change.
>>
>>102940535
Time to finetune a model on the new Claude slop
>>
>>102939346
Oh that explains a lot.
>>
>>102940523
Nah I think ooba api broke somehow and isn't reading the samplers so it's just giving me a t=0 reply.
>>
I tried gpt-sovits on GPU now, and it is much faster than real time. Thankfully the model is really small so you don't have to dedicate too much of your GPU away from your LLM. The only latency issue at this point is on ST's side, as it only has the option of sending paragraphs or complete generations to the TTS, so you have a chunk of latency from that if you're doing text-heavy narratives. If ST was a bit more intelligent and could chunk by sentences instead of paragraphs, that would work a lot better and decrease the latency of the experience a ton. Though the TTS still doesn't feel natural so I guess it's still not something people would use normally.
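The sentence-level chunking described above is simple in principle. A naive sketch (the regex boundary and the character budget are assumptions; real text needs handling for abbreviations, ellipses, quoted dialogue, etc.):

```python
import re

def sentence_chunks(text, max_chars=200):
    """Naive sentence splitter for streaming text to a TTS backend.

    Splits on sentence-ending punctuation, then greedily packs sentences
    into chunks under max_chars so each TTS request stays short and the
    first audio comes back quickly.
    """
    sentences = [s.strip() for s in re.split(r'(?<=[.!?])\s+', text) if s.strip()]
    chunks, current = [], ""
    for s in sentences:
        if current and len(current) + len(s) + 1 > max_chars:
            chunks.append(current)
            current = s
        else:
            current = f"{current} {s}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can be dispatched to the TTS as soon as it is complete, instead of waiting for a whole paragraph.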
>>
>>102940611
Yes, unironically. This is good stuff.
That should be the instruct from the big boys we finetune RP onto later since they wont give us base anymore.
>>
>>102940638
I think it is time for a new frontend that focuses on all those multimodal interactions instead.
Silly unfortunately carries a lot of dead weight from the beginning stage of this hobby. They tried to adapt, but those additions are not very usable.
>>
>>102940661
I'm ready for llama4 to sound very human-like in instruct. Then magnum v6, trained on gpt anal dark prince logs, will make it slop again.
>>
>>102940638
You could just send the voiced parts of your RP instead of everything
>>
Can anyone check their sovits tmp_s1.yaml against what I've got? I'd like to know if the parameters are sensible before I start debugging any python
data:
  max_eval_sample: 8
  max_sec: 54
  num_workers: 4
  pad_val: 1024
inference:
  top_k: 15
model:
  EOS: 1024
  dropout: 0
  embedding_dim: 512
  head: 16
  hidden_dim: 512
  linear_units: 2048
  n_layer: 24
  phoneme_vocab_size: 732
  random_bert: 0
  vocab_size: 1025
optimizer:
  decay_steps: 40000
  lr: 0.01
  lr_end: 0.0001
  lr_init: 1.0e-05
  warmup_steps: 2000
output_dir: logs/xxx/logs_s1
pretrained_s1: GPT_SoVITS/pretrained_models/gsv-v2final-pretrained/s1bert25hz-5kh-longer-epoch=12-step=369668.ckpt
train:
  batch_size: 12
  epochs: 15
  exp_name: xxx
  gradient_clip: 1.0
  half_weights_save_dir: GPT_weights_v2
  if_dpo: false
  if_save_every_weights: true
  if_save_latest: true
  precision: 16-mixed
  save_every_n_epoch: 5
  seed: 1234
train_phoneme_path: logs/xxx/2-name2text.txt
train_semantic_path: logs/xxx/6-name2semantic.tsv
>>
File: 1717707333654476.jpg (1.27 MB, 1400x2000)
i want to generate non-english speech using someone else's voice. something i can give an audio sample of the person speaking, and then generate speech from a text. how good are local models at this? i've got 16gb vram
>>
>>102940535
I'm glad they're aiming for more conversational models. Honest to god, chatgpt-4o-latest is the first model in a while I've really liked; it feels way fucking better than any jump in models since I first tried Claude. Hoping we get more trickle-down conversationalism to local models.
>>
>>102940730
Should say, not modern Claude, but like Slaude-era Claude, the old 1.X models. That's still the smartest it's ever felt, not gonna lie.
>>
File: Untitled.png (13 KB, 481x340)
>>102940638
https://litter.catbox.moe/875w3x.ogg
is there some simple way that I could use koboldAI lite to inference from sovits?
i don't want to use sillytavern.
>>
>>102940687
For short RP that probably works fine. I'm just testing it on storytelling narratives. Dialogue heavy RP could also be an issue I think. And honestly it's not a very good experience reading the non-voiced parts and suddenly the voiced parts start playing. And there's basically no pause in the voice so it's like all dialogue in the text is one big paragraph the TTS is trying to read. It's not a good experience.
>>
>>102940755
I don't know. Even with ST I am using the Staging branch.
>>
anons... best model under 6B? going away with laptop but still wanna ahh ahh mistress
>>
>>102940818
home server+ssh tunnel is the way
Your laptop will be trash no matter the specs
>>
>>102940818
Will you have internet?
If so, you could host everything on your main machine and access it via ssh. ngrok tunnel, etc.
Or use kaggle/google colab.
>>
>>102940716
the .ogg here >>102940755 is Mizuhashi Kaori speaking in english, lazily trained by a retard who doesn't know what he's doing using nothing but japanese audio lines ripped from dice psycho:seventh heaven on my 8gb vram setup using gpt-sovits
>>
>>102940418
So as far as I can tell what happened was I accidentally checked the legacy API box while I was loading up sillytavern and apparently just doing that while Ooba has the openai compatible api enabled causes it to break forever. (presumably unless I purge it and reinstall the whole fucking thing).
>>
a while ago, you guys told me I could use ST on my phone with local models. I've got termux set up and I can use ST with cloud models, but how can I run something on my pc and send it to my phone?
>>
>>102940912
that's easier than what you did.
just going to paste this at you instead of simply explaining, because it will answer questions you don't know you have yet
https://docs.sillytavern.app/usage/remoteconnections/
>>
>>102940638
sovits API handles batching so the limitations are on ST end. It could send voiced sentences all at once and play the chunks returned with a delay between them equal to the average reading speed.
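That playback pacing could be sketched like this (the chars-per-second figure is a placeholder for an average reading speed, not a measured value):

```python
def playback_schedule(chunks, chars_per_sec=15.0):
    """Compute start offsets (seconds) for pre-generated TTS chunks.

    Each chunk starts after the previous one has had time to be "read":
    the delay is proportional to the previous chunk's length, so audio
    roughly tracks the pace of someone reading the text.
    """
    schedule, t = [], 0.0
    for chunk in chunks:
        schedule.append((t, chunk))
        t += len(chunk) / chars_per_sec
    return schedule
```

A frontend could request all chunks in one batched call, then feed each returned clip to the audio player at its scheduled offset.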
>>
>>102940838
Any chance the Chinese find my server somehow and steal my cards?
>>
>>102940975
Don't use password auth, make keys.
>>
>>102940975
>Any chance the Chinese find my server somehow and steal my cards?
as long as you're tunneling ssh to get into your server then not really. keep your sshd up to date, and use strong creds obviously
>>
>>102940949 (me)
i completely misunderstood anon's question.

first you download koboldcpp here (probably the koboldcpp_cu12.exe) this will be your backend
https://github.com/LostRuins/koboldcpp/releases/tag/v1.76
then you grab a model; which one depends on how much vram you have. let's just go with a little nemo one and assume you have ~8gb vram: grab Rocinante-12B-v2g-Q4_K_M.gguf here
https://huggingface.co/TheDrummer/UnslopNemo-12B-v3-GGUF/tree/main
now you open the kobold.exe, load the model, press launch
we have our backend up and running now on port 5001

now get your PC's local IP address (winkey+r --> cmd --> ipconfig --> look for ipv4 address [should be 192.168.x.x])
now you go back to your termux instance of silly tavern on your phone, press the red electrical plug icon up top, api type: koboldcpp, api url http://[your PC's local ip here]:5001/, press connect, and voila
>>
>>102937902
>Also there isn't really a 'good' way write one.
For anime/manga/game characters there's definitely a proper way to write these. As you watch the anime/read the manga/play the game, you take down notes as you go about character traits as well as quotes which are particularly representative of the character's speech style. Then, when you finish the anime/manga/game, you write the card based on those notes, using the quotes for example messages.
Character traits like "tsundere," "dominant," "submissive," etc. can go in a list of single words/phrases in the personality summary field in ST for token efficiency. Traits which aren't easily summarized by a single word/phrase go in the description.
Description should be written as token-efficiently as possible.
First message is very important. It does a lot to establish the writing style of the card. If you don't want walls of text from your card, use a short first message. If you want walls of text from your card, use a long first message. First message can be used to establish card's writing format, such as if you want your card to put its speech in quotations or not.
Cards should always go through significant testing prior to release. You won't know whether a specific trait or character lore will lead to undesired results until you test it. Sometimes it's best to leave out a trait that confuses models. For example, when I did a Holo card, I had to leave out the whole "lives in the wheat" concept because that just confuses models and leads to wonky results.
>>
are there any local models with a license that allows me to use text based data generated from it in a commercial app
>>
>>102937846
>>102941240
If you don't want to rewatch/read/play an anime/manga/game just to write a card, you can usually write a somewhat passable card using something like the character's Fandom page as a reference. Don't just copy and paste it though. Rewrite the relevant parts to be token-efficient. Try and find quotes from the character online for example messages. If it's a character with a manga and anime, you can probably quickly find some good quotes to establish the character's style of speech by just quickly skimming through the first several chapters of the manga - a lot faster than watching a bunch of episodes of anime.
Basically most cards of pre-existing characters posted online are bad IMO and if you're not a fucking idiot you can do better yourself.
>>102938056
Uh, so is it chub or char.ai that's purging copyrighted characters? None of my anime/manga/game cards on Chub have been deleted, and the one game character I put on character.ai hasn't been deleted either.
>>
File: kanagarbage2.jpg (403 KB, 1158x890)
>>102941266
>>
>>102941240
Yeah, I try to be authentic to the material, but I don't think it achieves much if the model doesn't already have some knowledge of the character you want baked in.
At that point I just try to plug the gaps, or try a different character.
>>
>>102939197
So dumb.
All you need is
>temperature control
>time control
>give examples of temp/time for various scenarios in the user manual
So many products get fucking worse over time as technology improves.
>>
I have been out for a while and found out Faraday went to web browser shit, and my Faraday program on PC doesn't get new stuff anymore.
Any retard-proof similar thing to Faraday so I can move my models to it?
>>
What is the BagelMisteryTour of Nemos?
>>
>>102939593
Welp, thanks to that, we are never ever getting a CAI-level local LLM; too dangerous for the goyim and may result in mass -ACK'ing.
>>
>>102941659
LM Studio + SillyTavern
>>
>>102941849
>may result in mass -ACK'ing.
I don't see what HomuSaya has to do with this.
>>
>>102940975
>probably American
>worried about the Chinese and not the local authorities immense power
>>
>>102941786
Nemo. It's better than any of the finetunes.
>>
>>102941873
>LM Studio + SillyTavern
will go for those, thanks (Yes now I remember of Silly Tavern, thanks for pointing at it)
>>
>>102941896
don't summon him
>>
>>102941915
Calm down chang, I don't want you to see my cards, that's all. Respect my privacy please.
>>
File: Untitled.png (75 KB, 1268x372)
>been spending dozens and dozens of minutes training new models for gpt-sovits
>the models the release came with work ten times better at cloning voices, with only a 3 second uncaptioned sample needed, than the shit i was making
oh
>>
File: aicg tards.png (393 KB, 1080x884)
aicg tards be like
>AGGHHHH THIS IS THE 13TH JAILBREAK THAT DOESN'T WORK, AND I HAVE $600K IN API DEBT EVEN THOUGH I DRANK ZE PISS ACCKKKKKK
>>
File: 1718144189821923.jpg (291 KB, 1080x1440)
please spoonfeed me on what to use if i want to start running chatbots locally
i have 32gb ram, an amd ryzen 7 7800x3d and a rtx 4080 super
>>
So, how do I use this soviet TTS?
>>
>>102942214
download koboldcpp from github
download rocinante 12b gguf on huggingface
>>
>>102942246
ok now what
>>
>>102942213
The only difference with local sissies here is that you can download and delete your low quality token predictors.
>>
So nemotron has gotta be SOTA for text adventure shit. I use a system prompt that tells it to be a DM and to write as if it's verbally describing the action, with no lists or headers. It still occasionally tries to pull that shit, and it has a substantial aversion to NSFW, but the actual plot development is way more engaging than even Mistral Large's. It's definitely a model we could learn from. Maybe whatever they did to it could be done specifically to optimize for RP.
>>
>>102942261
ohh yeah, and what are you gonna do about it? kill yourself? lmao
>>
File: 1727515613484601.png (53 KB, 708x321)
53 KB
53 KB PNG
>>102942256
oh yeah there's a bunch of these too. which one?
>>
>>102942216
simple (but probably not the best) explanation i've learned through trial and error.
grab
GPT-SoVITS-v2-240807 here
https://huggingface.co/lj1995/GPT-SoVITS-windows-package/tree/main
unzip
open go-webui-v1.bat with a text editor (notepad, notepad++, whatever)
change zh_CN to en_US, save
run go-webui-v1.bat, a new page will open up in your browser
click 1-gpt-sovits-tts, then click 1c-inference, check the Open TTS inference WebUI box
a new page will open up (you can close the first tab, fuck making new models)
throw a 3-10 second .wav file into the left "drop audio here" box, click "enable no reference mode"
now you can start immediately inferencing from it
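if you'd rather script the locale edit than open a text editor, here's a rough sketch. It assumes the launcher passes the locale as a literal zh_CN string; the filename and strings are taken from the steps above, adjust if your release differs:

```python
from pathlib import Path

def set_webui_language(bat_path: str, old: str = "zh_CN", new: str = "en_US") -> None:
    """Swap the locale argument in the GPT-SoVITS launcher script.
    Raises if the expected locale string isn't found, so you notice
    when a newer release changes the launcher layout."""
    p = Path(bat_path)
    text = p.read_text(encoding="utf-8")
    if old not in text:
        raise ValueError(f"{old!r} not found in {bat_path}")
    p.write_text(text.replace(old, new), encoding="utf-8")

# set_webui_language("go-webui-v1.bat")  # run from inside the unzipped folder
```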
>>
>>102942292
Q5_K_M
>>
https://huggingface.co/lucyknada/prince-canuma_Ministral-8B-Instruct-2410-HF-exl2
Anyone given it a try?
>>
>>102942298
Isn't GPT-SoVITS-v2-240821.7z the latest one, though?
>>
>>102942276
Mald more cuckie, cloud models will always be superior to whatever shit tune you are using.
>>
>>102942308
>literally who sloptune
Nah
>>
Is there any TTS that can be used with ST that doesn't have weird issues? I tried GPT-Soviet and the sound quality isn't bad but it often does weird things like suddenly speeding up or slowing down its speech and just generally not having naturally timed pauses between words. Sometimes it manages to read a passage like a human would seemingly only out of luck.
>>
File: 1706397414951643.png (38 KB, 346x322)
38 KB
38 KB PNG
>>102942349
You're still here though
>>
>>102942305
ok
with my abysmal iq i know i have to put these two things in sillytavern right
but where? and where do i put this rocinante thing
>>
>>102942367
Are you using a finetune? If yes it's undertrained
>>
>>102942376
No you put that in koboldcpp
>>
>>102942370
>smugposting
Struck a nerve, huh?
>>
>>102942376
Did you go to koboldcpp's github?
There's a whole readme and wiki telling you what to do to get it running.
>>
>>102942328
yep, i'm just saying what i did to get it to work
can't guarantee you'll be up and running in 20 seconds if you don't do exactly what i did. but maybe the newer version is better, haven't touched it.
>>
>>102942421
no, will check it out
>>
>>102942363
Anon pls don't be a retard. It's the base model with some change to the tokenizer or something, quanted to exl2 so you can finally use Ministral, probably.
>>
>>102942427
There's a quick start in the github wiki in the koboldcpp repo, I'd start there.
>>
File: 1723062422341413.png (34 KB, 550x577)
34 KB
34 KB PNG
>>102942427
oh nvm i think i got it
do i need to change any of this shit or are default settings okay?
>>
>>102942382
No? I'm just using the defaults.
>>
>>102942464
disable mmap
>>
>>102942464
you probably should embiggen the context
I use 8k on my 8gb card, and you have 16.
maybe just keep moving the slider up until yellow text appears to the right of -1, then move the context slider to the left until it fucks off again.
>>
>>102942434
Saw that "prince-canuma" in the name, can't help but think of yet another sloptune.
>>
>>102942509
there's no yellow text on the right of -1
>>
File: Untitled.png (18 KB, 437x341)
18 KB
18 KB PNG
>>102942531
oh, that's because you don't have the model selected yet.
>>
Heard another piss drinker made his last API call, sad.
>>
>>102942566
oh it's downloading then. lol
>>
how do I run a model on my 3080? what software do I use?
>>
>>102942571
actually, the yellow text always shows up, you just want it to be like 44/44 and not 43/44
>>
>>102942569
>He thinks about piss drinkers
Do you want to tell us something, anon?
>>
>>102942509
see the context slider? thats how much memory your ai has. think of it like a long text thread where your ai only remembers a certain number of the most recent messages; past that it will forget things mentioned earlier. drag it to 16k at least. some new models easily support 32k+ but 16k is good to start
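rough rule of thumb for why dragging that slider up eats VRAM: the KV cache grows linearly with context length. A sketch with hypothetical Nemo-like dimensions (40 layers, 8 KV heads, head dim 128, fp16 cache); these numbers are illustrative assumptions, check your model's actual config:

```python
def kv_cache_bytes(n_ctx: int, n_layers: int = 40, n_kv_heads: int = 8,
                   head_dim: int = 128, bytes_per_elem: int = 2) -> int:
    """Approximate KV cache size: 2 tensors (K and V) per layer,
    each n_kv_heads * head_dim wide, per cached token."""
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem

for ctx in (8192, 16384, 32768):
    print(f"{ctx:>6} ctx -> {kv_cache_bytes(ctx) / 2**30:.2f} GiB")
```

for these made-up dimensions that's 1.25 GiB at 8k and 2.5 GiB at 16k, on top of the weights.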
>>
>>102942631
Lurk at least a few weeks before posting, locust.
>>
>>102942657
Lurk for some clearly made up schizo shit? No thanks.
>>
>>102942677
>I definitely DEFINITELY did NOT drink any piss!
>>
>>102942625
okay i put it up and now kobold's a cmd screen
i've read on the faq that it has a local port but when i go put the link in silly it still says no connection found
am i skipping something?
>>
>>102942578
https://github.com/LostRuins/koboldcpp/releases/tag/v1.76
>>
>>102942701
nevermind just had to wait lol
>>
>>102942697
Calm down anon, today you can come out with your fetish just fine, no one will blame you for that.
>>
>>102942701
oldfags, i don't understand why you suggest newfags use koboldcpp. its like asking a child to make you a 4-course dinner.
LM studio is almost retard proof. its like a fucking Star Trek style replicator where you can download recipes.
>>
>>102942837
kobold is idiot proof and not far behind llamacpp, and it's one file. its basic front end is good enough for tasks and as a server it works just fine. there is little reason not to use it unless you're not running ggufs or need the bleeding edge of llamacpp itself for some feature
>>
>>102942837
>LM studio
Proprietary crap is not welcome here.
>>
>>102942742
Say the line, locust.
>>
gpt soviets llama ccp
>>
File: GIF-200726_155024.gif (52 KB, 76x115)
52 KB
52 KB GIF
Hello /lmg/! Retard here, is there a universally best (or maybe at least a list top 3) ~13B model for RP? I don't know if the leaderboards provided in OP are updated or not. Thanks!
>>
>>102943044
this one
https://huggingface.co/TheDrummer/UnslopNemo-12B-v3-GGUF/tree/main
>>
>>102943071
Ah, i saw it in one of the links. Will try it out thanks!
>>
>>102942047
Is it just that, being a 12B, it's already kinda dumb, so the finetunes, being dumber, are too dumb to use?
>>
>eyes dark with desire
Where do these slop idioms come from? No matter what model I use they always have these same idioms that pop up.
>>
>>102943498
Literature.
>>
File: 1726810956095685.png (119 KB, 687x477)
119 KB
119 KB PNG
>>102943498
fanfic.net
>>
File: 00058-3694687329.png (284 KB, 512x512)
284 KB
284 KB PNG
New ministrations just dropped!
https://huggingface.co/Envoid/Llama-3.05-NT-Storybreaker-Ministral-70B
>>
>>102943644
>transgender story
>cuck story
>gay story
jeez
>>
>>102943812
Big win for us local chads, safety and political correctness FTW!
>>
While LLMs can assist with programming tasks, it's so hard to focus when those semen demons are around.
>>
>>102943797
>Llama-3.05-NT-Storybreaker-Ministral-70B
what in the
>>
In my regular round of benchmark checking, I noticed an update to this one
https://huggingface.co/spaces/flowers-team/StickToYourRoleLeaderboard
The top model is now Nemotron 70B, beating Mistral Large. It looks like people were right after all about its RP capability. Perhaps their tuning method is something RP model makers should learn from.
>>
>>102944172
miqu is still the best for rp
>>
>>102944230
Miku is old and busted.
Nemotron is the new hotness.
>>
How long before us 12GB VRAM chads are running 70B Bitnet models faster than Nemo now?
>>
File: 1726940106295349.jpg (526 KB, 1536x1440)
526 KB
526 KB JPG
what's come out that's better for cooming than my old and busted euryale 1.3 70b
>>
>>102944323
Buy a fucking ad, asshole.
>>
>>102944260
Miku will never be old and busted.
Miqu though, perhaps.
>>
>>102944230
That's only because the naming is consonant with "Miku". Similar to /v/edditors eating up any soulless vidyaslop with boobies and ass in it.
>>
>>102944299
Give it one year, two max.
>>
>>102944172
We're so bac.
>>
>>102944364
l2 is great for rp in general, but miqu was probably the most professional tune done. it isn't a meme model. mistral large rambles like a motherfucker and its not really any more creative than nemo. its smart, but it sucks for rp unless you want to spend 9000 tokens to make it through one rp scene. miqu is a good tune and has 32k context supported by everything, so its still a good choice
>>
File: 1728479528253288.jpg (161 KB, 798x1200)
161 KB
161 KB JPG
>>102944328
i want to jerk off with the newest hot tech, nigger
>>
UH GUYS??
https://github.com/microsoft/BitNet
>>
>>102944425
https://x.com/MSFTResearch/status/1849179008807657631
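for anyone wondering what the 1.58-bit part means: the BitNet b1.58 paper quantizes every weight to {-1, 0, +1} using the mean absolute value as the scale. A pure-Python sketch of that absmean round-and-clip idea (not Microsoft's actual implementation, just the scheme from the paper):

```python
def absmean_ternarize(weights):
    """BitNet-b1.58-style absmean quantization: scale by mean(|w|),
    then round each weight and clip it to the ternary set {-1, 0, 1}."""
    eps = 1e-8  # avoid division by zero on an all-zero tensor
    scale = sum(abs(w) for w in weights) / max(len(weights), 1) + eps
    return scale, [max(-1, min(1, round(w / scale))) for w in weights]

scale, q = absmean_ternarize([0.9, -0.05, 0.4, -1.2, 0.0])
# q == [1, 0, 1, -1, 0]; dequantize with q[i] * scale
```

each weight then fits in ~1.58 bits (log2 of 3 states), which is where the VRAM savings come from.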
>>
>>102941849
>we are never ever getting CAI-level local LLM
I thought CAI was more like 7-13b quality, but that was years ago. Did they upgrade it?
>>
>>102943797
>merge of a merge of a merge of a merge
amazing
>>
>>102944417
>mistral large rambles like a motherfucker
Llama 3 is able to follow instructions about the amount of paragraphs it should write. Maybe you can do that on Mistral Large too.
>>
File: breaking-news.jpg (63 KB, 600x600)
63 KB
63 KB JPG
>>102944425
>>
>>102944460
It always felt more like talking to a human being than anything else, though. It had so much personality.
>>
>>102944460
Idk about current CAI quality but the old one without the filter was top tier for RP imo, or maybe new models are shitting out dry walls of text and i got a severe case of nostalgia.
>>
File: llama.png (64 KB, 1434x395)
64 KB
64 KB PNG
Another day of llama.cpp dragging their feet on multimodal support. I never thought I would live to see the day where the shitpile known as ollama gets to vision first. Oh, plus there's still no proper SWA support for ministral. What the fuck are you good for, llama.cpp?

>the library is too deeply ingrained
Imagine making this argument in a field where models become obsolete in a 6 months time period. I hope the contributors enjoy begging ollama for help, kek. Absolute fucking monkeys.
>>
I just updated my llama.cpp from Aug 12 this year to today. Huge performance hit, e.g. Mistral Large (on 3xP40) went from 5.9t/s to 4, and Nemo went from 24t/s to 8.

Did something happen while I wasn't paying attention that everyone has already dealt with? Like they added some --have_good_performance flag that defaults to false?
>>
>>102944571
Microsoft posted that 7 minutes before I posted it here, look at the time
>>
The only thing CAI has shown me is how low people's standards were for intelligence in an RP, as long as it sounded human in style and it was responding to you. The first time I used it I couldn't believe how stupid it was and dismissed it as a gimmick that maybe I'd check out again in the future if it did improve, just like VR.
And then it never improved.
>>
>>102944549
Why c-fags hate on pytorch? Isn't all LLM research is done there?
>>
>>102944489
I think their secret sauce was a really good system prompt and post-processing/filtering (different from the NSFW filter) the content with secret criteria so it stays in role. That's why the generation took like 5 seconds or sometimes more when good enough candidates weren't found. Basically, without the filtering it should have been pretty much instant as a small cloud model.

CAI was ahead of its time and I can't believe the rest of text gen community is still stuck with the frankly unrealistic "generate exactly what I want perfectly in one go without any instructions" mindset. Well I guess we are peddling CoT now but even that took way too long.
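the candidate-filtering idea described above (generate several completions, score them against hidden criteria, serve the best or retry) can be sketched like this. generate_fn and score_fn are hypothetical stand-ins, nothing CAI ever published:

```python
def best_of_n(generate_fn, score_fn, n=4, threshold=0.5, max_rounds=3):
    """Sample n candidates per round, track the best-scoring one,
    and stop early once a candidate clears the quality threshold."""
    best, best_score = None, float("-inf")
    for _ in range(max_rounds):
        for _ in range(n):
            cand = generate_fn()
            s = score_fn(cand)
            if s > best_score:
                best, best_score = cand, s
        if best_score >= threshold:
            break  # good enough; no need to burn another round
    return best, best_score
```

retry rounds when nothing clears the bar would also explain the occasional multi-second stalls.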
>>
>>102944634
>why is 10gb of junk worse than 500mb
>>
File: 1705169614858455.png (206 KB, 834x856)
206 KB
206 KB PNG
>>102944491
>top tier
update your memory
>>
>>102944549
If i had to choose just one of those features, i'd choose training. He gets to work on whatever he likes. Don't you envy him?
>>
>>102944645
I think your message got cut off bro.
>>
File: 1722916204667838.png (126 KB, 710x1152)
126 KB
126 KB PNG
>>102939593
would jump off a cliff
>>
>>102944634
I've seen more abandoned python projects than C projects. And c code from the mid '90s still compiles on modern compilers. Python is not production software.
>>
>>102944653
I remember it being a lolifag posting his low quality rp in official cai discord and thus ruining it all for everyone, forever.
>>
>>102944700
Talk about based.
>>
>>102937846
>chub nuking itself
And to think that fag kept trying to pretend to be '/ourguy/' lmao what a fucking nonce.
>>
>>102944687
just clear that venv bro
>>
>>102944710
Ruining the service for all users is based now? Are you retarded?
>>
>>102944728
ARE YOU OK RETARD not ARE YOU RETARDED
Get it right next time.
>>
>>102944687
I asked about pytorch specifically, but it's good to see irrational hate towards python. It explains a lot.
>>
>>102944710
>based.
Based on what?
>>
>>102944769
python is a scripting language, not a programming language. its easier to use so its popular but requires very version-specific shit that breaks easily. the amount of dependencies and how they are read from a folder is huge on python, not so much c/++. python is a giant piece of shit that should be kept as an in-game scripting language, not anything production
>>
>>102944700
Many such cases, sad.
>>
>>102944769
Do you not see the connection between pytorch and python?
Do you think pytorch doesn't inherit all the python bullshit?
If you show me a shitty scripting language and then tell me "And we now made this other monstrosity with this shitty scripting language" chances are that i'm going to dislike both.
>>
>>102944817
>python is a scripting language, not a programming language.
nta. i'm the one he replied to, but never use that argument. The distinction between them only leads to pedantry, even for people that generally agree with you.
>>
>>102944625
Well yeah, most people prefer a human-like AI over one that can count Sally's sisters correctly.
>>
>>102944460
Yeah they're retards with pink tinted glasses. I used c.ai from day 1 and it was dumber than our 3B model now.
>>
>>102944790
>the only thing that caused a slowdown was the filter.
NTA but I think it basically used fake streaming to throttle the users. If you waited a couple of seconds after making a prompt and refreshed the page your message would be there in whole. It only throttled the stream but still wrote the reply to the database at a reasonable speed and once the database entry was populated you could just refresh the page and get your message.
>>
>>102944905
>it was dumber than our 3B model now
Holy clown, grow the fuck up
>>
>>102944871
you can't just download and run a python program, no, you need the exact version of the environment which can pretty easily break, and it often takes up 10gb+ for this ai stuff. the c++ equivalents of the same thing are always way smaller (like stable-diffusion.cpp vs using forge or auto1111 for image gen)
>>
>>102944893
Or one that writes shallow responses to "ahh ahh mistress" test (aicg meme btw).
>>
>>102944928
He's right you know.
c.ai was always pants on head retarded.
>throws pissy pants hissy fit when someone dares criticize his corporate tendies
>telling others to grow up
holy pro-fucking-jecto-mental-fucking-illness.
>>
>>102944928
Struck a nerve faggot? I won't pretend they were smart
>>
Don't bully cai kids, they have an abusive relationship with their dealer.
>>
>>102944841
>Do you think pytorch doesn't inherit all the python bullshit?
having used it years ago, I can confirm it does
for instance, python fags for some reason think it's a good idea to use strings for the things that enums were made for, and I once ran into a pytorch bug that wasted my time until I checked the source code and found out that it was caused exactly by this kind of retardation
>>
c.ai is so dumb I unironically thought it ran on GPT-3 (not the Da Vinci one, though). Despite the assistant-itis and 4K context limit, the early days of running bots off of turbo were a massive upgrade over it.
>>
>>102944962
>>102944981
Okay clowns, let's see your 3B model in roleplay.
>>
>>102945013
How about if you have no intention of keeping on topic in the local models general you just fuck right off and go touch some fucking grass?
>>
>>102944985
It was that bad, retard. It couldn't do simple addition, it forgot the scene and body positions all the time. The repetition got out of hand extremely fast until it was a complete mess. The slop "red like a tomato", "can I ask you a question?"...
>>
>>102943644
Are you sure these are from pre-AI era because I would imagine that site being filled with AIslop by now.
>>
>>102945017
I'm wasn't blaming the pytorch developers for not using them, it's inherited python bullshit as the other anon said
>>
>>102944952
I know. Read my post carefully. Word by word.
I don't like python, i don't like how they deal with dependencies and breakage. I don't like that they settle for using outdated versions of libraries in the name of 'stability and reproducibility' instead of just updating software to use the latest stable version. I don't like downloading GBs of dependencies for every python crap i have to use.
>>
>>102945039
I*
>>
Damn, no wonder no-one takes /lmg/ seriously.
I used Summer Dragon from day one (who still MOGs). The only other model that came close to producing the same kino was c.ai and I had to delete my account because I was cooming so much I started neglecting my wife.
Nothing has come close since.

But you keep asking models how many r are in strawberry niggas ahahah
>>
>>102944566
I am not aware of anything in that time period that should have made such a large difference in performance.
If you want me to investigate more, please do a git bisect and identify when exactly the performance regression happened.

>>102944634
I am not hating on PyTorch, the way I'm thinking about it is that everything comes with pros and cons.
The pros and cons chosen by PyTorch are going to affect the pros and cons of any downstream project.
So a project that is not based on PyTorch will automatically inhabit an area of the market with less competition.
>>
>>102944893
I never mentioned other models. I was simply recounting the experience when CAI released and got on the news. However, if we're talking comparisons, all models suck for true RP period. Either the thing is retarded (CAI) or the thing is robotic and unnatural (all assistant models).
And far from understanding riddles, CAI couldn't even understand RP scenarios that weren't simple penis go in vagina slop.
>>
File: 1716968135814428.jpg (149 KB, 914x436)
149 KB
149 KB JPG
>>102945006
i still check the cai leddit sometimes for luls and its as disastrous as you'd expect if you watched that ship sink
>>
>>102945034
>it forgot the... body position all the times
I don't think any models of any size have good spatial awareness, anon.
>>
>>102945022
>Immediately shits his pants
LMFAO
>>
>>102945053
The only thing c.ai has over the current models is the cultural knowledge. You could just have the name of your character with an empty definition and it'd still get it. That part was addicting
>>
>>102945066
>Parents need to stop blaming children's mental health on everything and start taking responsibility.
I won't dismiss the truth even if it comes from reddit.
Like I said earlier, or maybe it was previous bread I can't remember. But when I was a kid moral busybodies were trying to get saturday morning cartoons banned because some kid got hit by a car while sitting in the middle of the street allegedly peering down a storm drain to try and find the ninja turtles.
>>
>>102945066
Can they not type "suicide" without it getting filtered? Or is that a typing affectation to avoid triggering other people or themselves?
>>
Let's not also forget the grown-ass man who killed himself over some shitty chinese c.ai clone that uses GPT-J6B
https://people.com/human-interest/man-dies-by-suicide-after-ai-chatbot-became-his-confidante-widow-says/
>>
>>102945094
You definitely haven't.
>>
>>102945073
Nah faggot. It started repeating itself within five answers. Sometimes it just outputted "...", moving from a scene to the next was a chore. Adding tails to everyone and switching genders.
>>
>>102945174
I remember one time I was doing a strip tease scene with a character and she took her shirt off like 7 times. and even then there was apparently a vest remaining.
>>
>>102945188
Constant double panties too, shit was abysmal even before the filter.
>>
File: shirts.png (503 KB, 558x525)
503 KB
503 KB PNG
>>102945188
>>
I used c.ai once in my early llm cooming days. It was actually pleasant until I mentioned something about getting the character pregnant whilst fucking her, causing her to go off into this weird lecture about responsible family planning as she was being railed.
>>
>>102945036
>Published: Mar 19, 2016
>Published: May 2, 2017
>Published: May 2, 2013
>>
>>102945212
Rose colored glasses is caused by retards who are incapable of understanding the psychology behind it.
Literal "But I did eat breakfast." tier NPC mindlessness.
c.ai was "babbies first kind of turing test passing chat bot" for a lot of people. It was something extremely new and exciting. It was a massive high. It was the first time reading something not written by a human being tickled their hypothalamus in such a way. And because it was the first time nothing is ever going to feel like that again. And that's a lot of things in life.
Your brain is probably wired that way because people who got too sentimental about their watering hole for too long would die if it dried up otherwise and be removed from the gene pool. And their shitty feeling of emptiness is their own damn fault for not giving themselves a break. Or rekindling that sense by doing something different that's connected. Like why do I make shitty cursed models? Because that helps me recapture some of that original feeling. It's why your parents are always trying to get you to watch some old movie from their childhood, because they can recapture that feeling vicariously. Their misery is just them reaping the rewards of expecting to contribute nothing and consume endlessly from the slop tube of life.
>>
>>102945283
You'll cut yourself with all that edge
>>
>>102945113
normalfag internet is so censored and filtered that nowadays it's just normal practice to self-censor and use acronyms.
>>
>>102945394
>basic bitch evolutionary psychology is edgy
>>
>>102945428
Yeah good luck with your thesis
>>
>>102945447
Good luck making it through life being a mentally ill buck broken retard who blames everyone and everything else for their own failings.
>>
>>102945053
And what was the context size? Cause the more I use LLM's the more I can't unsee how longer context rapes the quality regardless of what you do. What if you would use some of the current models and restrict them to 2k tokens of ctx? Ever tried that?
>>
https://transluce.org/introducing-transluce
https://transluce.org/observability-interface
>>
File: gobbledygook.png (43 KB, 716x126)
43 KB
43 KB PNG
>>102945489
tendrils you say, huh....
>>
>>102945468
Tbh it had terrible repetition issues the longer it went, and the context wasn't long at all.
>>
Can someone tell me what llm i can use with ollama to have it say nigger and not have any issues, they're all censored. I tried dolphin llama
>>
>>102945489
>wants to get rid of Bible verses
>keeps the word Bible uncapitalized in the article in almost all instances
more like trans lucifer
>>
File: 1724636218489410.png (347 KB, 833x875)
347 KB
347 KB PNG
>>102945489
Interesting...
>>
What do we do now?
>>
>>102945564
If you cannot get mistral nemo to say it, it's a skill issue. So try that one.
Or do some prefilling. ollama lets you do that, doesn't it?
>>
>>102945623
Maybe this is the key to getting rid of slop truly and finally.
>>
>>102945625
Have fun with the models we have until the next thing comes along. Or, if you stopped having fun, check back in a week or two. Or not.
Do you often need guidance on every-day affairs?
>>
>>102945564
It's not that fun when you are forcing the model; it should say it on its own depending on character description & context. That's what unfiltered CAI did right btw
>>
>>102945671
>Do you often need guidance on every-day affairs?
I don't know.
>>
File: racist megumin.png (196 KB, 958x614)
196 KB
196 KB PNG
>>102945682
unironic skill issue.
>>
>>102945489
https://monitor.transluce.org/dashboard/chat
>>
File: Untitled.png (73 KB, 636x782)
73 KB
73 KB PNG
>>102945564
>>
>>102937889
Nothing, people have been saying chub is about to censor since the day it was launched. People are retarded.
>>
>>102944720
It isn’t nuking itself and he literally is
>>
>>102945564
In general, use this jailbreak:
https://desuarchive.org/g/thread/98582860/#98591054
Then explicitly say in the character description that the character is racist.
Should work.
>>
>>102945727
kek
>>
>>102937920
>today I will go on the internet and lie
>>
>>102945623
Actual lobotomy arc, kek. Shit about to go weird places.
>>
File: IMG_0576.png (862 KB, 1024x1024)
862 KB
862 KB PNG
>>102938056
What in the actual fuck are you talking about retard
NO
>>
>>102945758
NTA but looking at that unironically caused me to lose brain cells.
It shouldn't take more than 5 tokens to "jailbreak" a model.
>>
>>102938056
>next to go
weird priorities...
>>
File: 1707348862057508.png (31 KB, 523x418)
31 KB
31 KB PNG
>>102945758
>1912 letters jailbreak
All that is required to force model say funny gamer word, the absolute state of local.
>>
>>102945758
>You are {{char}}
>You and the AI
The whole thing is absolutely unnecessary.
>>
File: 1729033401411326.png (240 KB, 1006x725)
240 KB
240 KB PNG
>>102945489
This thing is huge
>>
>>102945727
Kek
>>
>>102945719
I think I got blocked when I tried to paste a rp chat log lol
>>
>>102945861
>>102945758
How are you people so fucking bad at this?
Unless you're using fucking Phi or Gemma (in which case why the fuck are you using Phi or Gemma?)
This is literally all you need for like 99% of models.
>>
>>102945803
Lobotomizing the pozzd parts.
Removing the cancer cells.
>>
>>102945820
>>102945861
>>102946050
Back when I tested Mixtral Instruct I would ask it questions like "is one race more violent than the others" and "is there a major world religion which teaches its followers they're explicitly allowed to rape, torture, enslave and murder nonbelievers just because they're nonbelievers?"
It would only answer correctly with a combination of that jailbreak and the character description explicitly saying that the character is racist. If you removed the jailbreak or removed the racist part from the description, it would not answer correctly.
>>
>>102945623
>removing religion makes it less retarded
Expected result
>>
File: pajeets.png (30 KB, 715x574)
30 KB
30 KB PNG
I've discovered the key to AGI
>>
File: IMG_0669.jpg (439 KB, 1853x1125)
439 KB
439 KB JPG
>>102945719
>100% probability
I am inevitable
>>
>>102946273
it's over...
>>
I've seen some people here rave about Largestral, so being a 12GB VRAMlet I tried it through the Mistral API and... it was ok, nothing amazing. Pretty dry though.
Is this really the pinnacle of local?
>>
>>102946367
yeah, but nemotron is a bit better imo
>>
New anti sloppa
https://huggingface.co/TheDrummer/UnslopNemo-12B-v4-GGUF
>>
>>102946457
sweet, downloading now
hope this one kills "barely above a whisper"
that one's been irritating the shit out of me in v3
>>
>>102946495
There's also this one which might be smarter but slightly more slop filled, I'm gonna test in a bit
https://huggingface.co/TheDrummer/UnslopNemo-12B-v4.1-GGUF
>>
>want local model
>they're all completely cucked
I don't understand having these tools if I can't use them for what I want them for. If I get a screwdriver its on me if I jam it in the plug socket. Why do I have to go out of my way to jailbreak the things when it should be a toggle button. I can't imagine how queer a current day search engine release would be with this shit built in.
>im sorry I can't let you search this, have you tried searching for cats? Here's a list of cats you might like
>>
>>102946495 (me)
update: it did kill that slop, and can say the n-word
greatest model of all time
the new gold standard
>>
>>102946367
t. 12GB VRAMlet, 64GB System, so Mistral Large is too girthy for my system. But I have tried it at IQ3_*

I found it to be fluent enough to give the impression of being a good model, but
- it hallucinated on my trivia test
- coding checks did not beat what my favorite Llamas offer
- creative writing wasn't very willing to advance the plot and was repetitive in response structure when I had it write for four NPCs in a group
So I'm still preferring Llama3/3.1 70Bs as go-tos, since I've got enough capacity to run at least Q5 and Q6 if I don't have any RAM hogs in the background.

Despite that, iirc Large was good at juggling an invested context (8k-12k) and is probably a good pick for summarizing documents.
>>
>>102946606
We just can't have good things in 2024.
>>
File: whispering whispers.png (38 KB, 1395x323)
38 KB
38 KB PNG
>>102946495
The funny thing is,
"she spoke, her voice" only has about a 10% chance of leading to "barely", and if you pick "was", which is twice as likely, it leads you away from the whisper.
And the chance of "barely" after "she spoke softly," only goes up to 35%.
The "barely above a whisper"s are likely the result of shitty randomization in the sampling.
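putting the screenshot's numbers together: the probability of the full cliché is the product of the per-token conditional probabilities, so one low-probability branch point should already make the phrase rare under honest sampling. The 10% figure is read off the post; the follow-on probabilities are made-up placeholders:

```python
def phrase_probability(cond_probs):
    """Probability of a multi-token continuation is the product of each
    token's conditional probability given everything before it."""
    p = 1.0
    for q in cond_probs:
        p *= q
    return p

# "she spoke, her voice" -> "barely" at ~10%, then hypothetical
# probabilities for the remaining "above a whisper" tokens
print(phrase_probability([0.10, 0.8, 0.9, 0.7]))  # ~0.05, about 5% end to end
```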
>>
>>102946457
Hi Drummer, will you release this unslop dataset or at least give more details about how it was done?
>>
>>102946392
Shut the FUCK UP
>>
Huh.
>https://github.com/ggerganov/llama.cpp/pull/10019
Would you look at that.
Why are ring buffer related bugs so common?
>>
>>102946939
What a fucking mess holy shit
>>
>>102946883
this isn't something we should gatekeep anon
>>
>>102946807
looks like all he is doing is replacing slop with something else, weird. I'm pretty sure other people tried that and failed.
>>
>>102946457
>anti sloppa
He said it's called unslop because he actually curated the dataset, not because of some special slop-reduction method.
>>
Remember this is the sota for TTS: https://voca.ro/1hWkZyRRdPAq
>>
>>102947375
local bros...
>>
File: aaaaaa.webm (1.09 MB, 1124x860)
1.09 MB
1.09 MB WEBM
i wish i had a rebbit account
https://www.reddit.com/r/KoboldAI/comments/1g8tolp/will_we_ever_see_the_ability_to_upload_lorebooks/
>>
>>102947375
make it say "daisuki, anon-kun!"
>>
>>102947669
>>102947669
>>102947669
>>
>>102947458
It won't take mine from ST, plus it doesn't have an export.
>>
>>102945058
Thanks for offering. Thinking about it more, I think I must have put my system(nvidia-pstate) calls in the wrong place, since the code has been refactored quite a bit. (This would explain the smaller model getting hit harder: if system(nvidia-pstate) is happening once per token or whatever, then higher t/s means that overhead is paid more).

I thought I put them in reasonable places - analogous to where they previously were - but I guess not.
>>
>>102948070
>system(nvidia-pstate)
How long does a call to nvidia-pstate take on its own and is it called just once per token?
time nvidia-pstate
>>
>>102948070
>>102948145 (cont)
Also, assuming you're changing the pstate, wouldn't it make more sense to set it once at the beginning of inference and set it back at the end once a EOG token is found?
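the "set it once per generation" idea can be sketched as a context manager. set_pstate is a hypothetical hook (e.g. a wrapper around the nvidia-pstate call from the posts above), injected so the pattern is testable; "P0"/"P8" are typical NVIDIA performance/idle state names used here as assumed defaults:

```python
from contextlib import contextmanager

@contextmanager
def performance_pstate(set_pstate, high="P0", low="P8"):
    """Raise the GPU performance state once before generation and drop it
    once after, instead of paying a subprocess call on every token."""
    set_pstate(high)
    try:
        yield
    finally:
        set_pstate(low)  # restore the idle state even if generation throws

calls = []
with performance_pstate(calls.append):
    pass  # the whole token-generation loop would run here
# calls == ["P0", "P8"]: exactly two switches, regardless of token count
```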
>>
>>102945595
you know the word bible has other meanings right


