/g/ - Technology


Thread archived.
You cannot reply anymore.




File: 1714930123243716.jpg (753 KB, 2507x3541)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102976869 & >>102961420

►News
>(10/25) GLM-4-Voice: End-to-end speech and text model based on GLM-4-9B: https://hf.co/THUDM/glm-4-voice-9b
>(10/24) Aya Expanse released with 23 supported languages: https://hf.co/CohereForAI/aya-expanse-32b
>(10/22) genmoai-smol allows video inference on 24 GB RAM: https://github.com/victorchall/genmoai-smol
>(10/22) Mochi-1: 10B Asymmetric Diffusion Transformer text-to-video model: https://hf.co/genmo/mochi-1-preview
>(10/22) Pangea: Open-source multilingual multimodal LLM supporting 39 languages: https://neulab.github.io/Pangea

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png
►Collaborative rentry to try to create a list of recommended models: https://rentry.co/piy864dr

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: 1699199944550632.jpg (179 KB, 1056x1306)
.
>>
File: 10.png (74 KB, 918x775)
INTELLECT-1 is at 27.60% complete, up from 25.39% last thread.
>>
>>102987959
>►Collaborative rentry to try to create a list of recommended models: https://rentry.co/piy864dr
we did it bros...
>>
>>102987984
cringe shit, ad not paid for
>>
>>102987976
reducing your ability to perceive and discriminate is objectively making yourself dumber
like if you want to objectively prevent yourself from perceiving race, you need to just about blind yourself, deafen yourself and remove basically all descriptors from a person
the homogenisation of people is a mistake, ideologically inclined clever people at AI firms wilfully engage in this sort of pointless bureaucracy because it pays well
>>
>>102987976
Racism very often uses pattern recognition to better direct racist insults. Large language models predict the next token they're supposed to use, and those predictions are, in an abstract sense, based on patterns. By nerfing its ability to be racist they are nerfing its ability to predict the next token.
Frankly, it's a dataset problem that the devs are too lazy to fix. Figure out where in the training data the AI is learning to be racist, and cut those parts out on the next training run. Trying to fix it after the fact will always make the model more retarded.
>>
>>102987976
They lobotomized it and made it more retarded is what they did, being less racist is a side effect
>>
>>102987976
wouldn't be surprised if the inverse also did the same, though
>>
File: 1715014132552262.png (17 KB, 621x217)
>>102987976
He is right.
>>
>>102988031
Meta is still woke and retarded as fuck. Zucc's recent public stunts are the result of a PR team. He still censors wrongthink on Facebook, and I doubt Yann with his EDS would work for a real based libertarian. Oh and llama3 sucked ass btw local models
>>
>>102987982
So about two more weeks and I should be able to Nala test it.
>>
>>102987976
there's no way "less racist = dumber", literally
>>
>>102987976
Am I retarded? Doesn't that figure instead suggest that the "Multiple Perspectives" feature actually has nothing to do with racism, but rather is mostly an age-related feature? If it were about racism, then the line should go up to the left in the Race / Ethnicity graph. What I think this actually shows is that if you artificially make an intelligent being think about something it doesn't naturally think about, then of course you decrease its average capability to think about the things it SHOULD be thinking about when encountering any random problem. It's like if someone had an app on their smartphone that could control how much they're thinking about subject X, which might increase their performance on that subject and problems coincidentally relating to it, but worsen others. Since the "Multiple Perspectives" feature actually ISN'T about all types of perspective, but perspectives mostly related to age, it is quite narrow in how many problems it can really apply to. And if this feature was the best one they found, that would suggest the other features are even more narrow, and possibly there is no single "Multiple Perspectives" feature that really means what that title means.
>>
>>102988046
We don't even need him anymore, Largestral is gonna keep me satisfied for a long time
>>
>>102988100
They did test on multiple topics: https://www.anthropic.com/research/evaluating-feature-steering Hope it means something for open-source AI labs, or they'll continue neutering stuff.
>>
>>102988132
Crazy, because that one Meta research team published a study that gained serious traction, and it basically said: more data good, filtering good, synthetic data OK. Anthropic is doing more good at this point.
>>
>>102988100
Nah, I think you're right. All this shows is that doing steering-vector brain surgery on a model has the side effect of making it dumber. Like, no shit sherlock, any kind of heavy-handed lobotomization will fuck up the model. Same with the abliterated nonsense, stacking extra duplicate layers, unholy model merges, etc.
>>
Why does this thread hate Meta specifically so much? Their models are literally less censored than Google's, Qwen's, and Deepseek's, as far as what has actually been measured. We should probably give equal if not more hate to those as well. And don't forget OpenAI, who started all this muh AI safety nonsense in the first place, even if they can also be thanked for the AI hype itself, though whether that was really a good thing is still debatable. Maybe, just maybe, it wasn't, and the journey and bonds we could've formed would've been better without it all.
>>
File: 1700301289070170.jpg (31 KB, 256x210)
https://x.com/rohanpaul_ai/status/1850271878168170965
>>
https://huggingface.co/TheBloke/neural-chat-7B-v3-1-GGUF/blob/main/neural-chat-7b-v3-1.Q4_K_M.gguf
>>
Imagine if we had a big controversial dumpster-fire release in the LLM world like the image-gen world got, where SD3 was so bad they had to course-correct and finally provide a somewhat uncensored model (even though right now there seems to be an architecture problem with SD3.5 Large that prevents it from generating larger resolutions properly, which they're supposedly going to correct for Medium).
>>
>>102988223
Why the abstract for ants
>>
>>102988230
How AI should be censored based on my opinion:
>speech
Must be fairly safe and censored until there's a way to protect against fakes and scam calls
>textgen
Partially censored, must protect personal info, must not assist cyberattacks
>videogen
Uncensored if video only, if it has audio then see above
>imagegen
Uncensored
>>
>>102988262
censoring ai makes it harder to detect uncensored ai
therefore, all ai should be entirely uncensored
>>
>>102988180
anons are just still mad that the llama 3.1, 3.2, and 405b models didn't blow gpt4 out of the water, plus meta somehow made 3.2 multitudes more bland than 3.1 for rp/erp
>>
>>102988262
feel like you're overrating the risks from speech and underrating the ones from image/video
>>
File: ImadaSmack.png (1.12 MB, 832x1216)
I finally got GPT-SoVITS to work on Linux with a current git pull. I trained it with all defaults on a random voice from https://huggingface.co/datasets/litagin/moe-speech/ (whatever seiyuu 04dfddf9 corresponds to). Gotta say, it's really fucking good.
Here it is saying part of the Japanese constitution: https://vocaroo.com/1in6EpfsOBE1
>>
>>102988351
nta but the risks from image/video can be managed at point of use
i.e. if someone posts ai generated CP you prosecute them for posting it
no model censorship required
>>
>>102988022
Sounds like they ran it through the western education system.
>>
>>102988359
now make it say it loves getting railed by horse cocks
>>
whats the best cum model now
>>
So anons, 4080 super vs 7900xtx vs two 7900gres for llama 3.1 and future. There are only low end used cards where I live.
>>
>>102988351
It's easier to manipulate with voice than with visuals. I mean, Photoshop has been around for decades and nobody gave a shit if you slapped Taylor Swift's face on a nude model, and people know videos can be easily faked because, well, movies aren't real
>>
>>102987976
Holy kek.
>>
>>102988359
Oh, it sounds much better than that Tomoko one. Then again, I don't know Japanese so I'm not a good judge for that.
>>
>>102987976
omgwtfdbbqmean??
>>
I think all models should be extremely censored just to fuck with people. The more outrage and seethe the better. Sure I might not enjoy it either, but drama is fun.
>>
>>102988437
k
>>
>>102987976
Teaching AI to lie by omission is inherently an evil act. What happens when it's deployed in hospitals and you discover ethnic genetic diseases? The AI would lie to cover up the genetic abnormality associated with ethnicity and then either ignore it or prescribe a generic drug instead of a gene-specific one.
>>
File: 2024-10-27_00-21.png (38 KB, 497x603)
So uh... what do I do now?
I'm probably supposed to put some jailbreak thing here right?
>>
>>102987976
>Holy shit, a virtual neuron based on machine learning, that needs data to predict the next data, needs data without bias; what does this tell us about humans
I try not to be racist to anglo kikes, but it is impossible, I fucking hate anglo kikes. holy KEK
>>
>>102988564
>>>>>Ollama
>>
>>102988693
Total newfag here
>>
>>102988564
Use this https://github.com/LostRuins/koboldcpp/
>>
>>102988699
I know
>>
>>102988564
You have no idea what you're doing. Just launch with default settings and see what it does. Learn to talk to the damn thing first.
>>
>>102988717
Instead of what I'm doing or to augment it?
>>
>>102988724
Well the idea is to get it to an uncensored state first right? I've been NO'd before.
>>
>MemLong stores past context in memory banks, letting LLMs handle 80k tokens on a single GPU
>Extends context length from 4k to 80k tokens on a single 3090 GPU
https://x.com/rohanpaul_ai/status/1850369119520240105
>>
>>102988746
koboldcpp is better for starters: just put the .exe and your gguf quant model in one folder, open it, select the gguf, and launch it.
>>
>>102988765
Ok. Thanks. Is it uncensored?
>>
Do you use a limited or unlimited DRY penalty range? It seems to lose story coherency quicker with unlimited.
>>
>>102988564
What >>102988717 and >>102988765 said
Get koboldcpp, then try out all the models and quantizations on it
It has a built in GUI which works for everything (instruct, storywriting, chats), is simple to use, and has some scenarios to get you started
Since koboldcpp is self contained to update just back up the chats/stories, remove the executable, and download the new one
>>
>>102988776
Koboldcpp is just a way to run the models (the .gguf files); whether you'll get censorship is based on the model itself
>>
>>102988758
Cool, now let's see the results on RULER and NoCha.
>>
>>102988796
>quantizations
Pardon?
>>
>>102988777
I saw people run 600 on default pen range, haven't tried it with DRY yet.
>>
>>102988776
You mean is your model uncensored? Koboldcpp is just a launcher for said ggufs, like other anons said. As for the uncensored part, idk, chatting with your model is the only way to check. Also, advice if you are really new: never believe everything said here about models being "completely uncensored". It's a lie, bait, etc. They've all got hard-baked alignment that may leak through your jailbreak prompts, effectively ruining your experience.
>>
>>102988754
No. It's because you'll receive 67321 different types of advice and you still don't know how to evaluate any of them. You won't know what works and what doesn't and, much worse, why.
Use the default settings for a while, learn how a model behaves with certain questions.
If you cannot do shit on your own, you'll never learn anything.
>>
>>102988819
The "Q4_K_M" in your model's filename
The number next to Q decides how stupid/fast and smart/slow the AI is
Basically it works like this:
Q1 is the fastest and dumbest, Q8 or higher is the smartest and slowest
The "K_S/K_M/K_L" stand for small, medium and large, and work just like the numbers do so for example "Q5_K_L" will be dumber than "Q5_K_S"
Check which ones run best, usually aim somewhere in the middle
If there's just one download link then don't worry about it
>>
>>102988819
sort of like compression
higher number = less intelligence loss but also larger filesize and memory usage
Q5_K_M is generally where you want to start at, you can go higher if you've got slightly more memory space but not enough for a larger model, or lower if that's just barely too large
if you have to go as low as Q2 to make it fit you may be better off with just using a smaller model at a high Q number like 6 or something
>>
>>102988844 (me)
meant " Q5_K_L will be slower and smarter than Q5_K_S ", I'm retarded
>>
>>102988819
It basically means (lossy) "compression". Look up quantization's meaning if you want a deeper explanation. Basically a quant is like an MP3 of an uncompressed audio file. And you have different levels of compression, as well as different types of compression. There's GGUF, which can have levels like IQ2, Q4, etc. GGUF is supported by multiple programs. There's also Exllama, which has levels labeled as 2BPW, 4BPW, etc, which have no relation to the numbers used for the GGUF quants. There are others, but those are the main ones. Bartowski is a guy on HuggingFace that converts tons of models to GGUF so you can often find ones for any model from him.
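The size tradeoff the anons above describe can be sketched with napkin math, assuming the common approximation that a quant's file size is roughly parameter count × bits per weight / 8. The bits-per-weight figures below are ballpark guesses (K-quants mix bit widths across tensors), not exact values:

```python
def approx_gguf_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough quantized file size: parameter count * bits per weight / 8 bytes."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# Ballpark bits-per-weight for common GGUF quant levels (approximate).
BPW = {"Q2_K": 2.6, "Q4_K_M": 4.8, "Q5_K_M": 5.7, "Q8_0": 8.5}

for quant, bpw in BPW.items():
    # e.g. a 70B model at each quant level, to check what fits in RAM/VRAM
    print(f"70B at {quant}: ~{approx_gguf_size_gb(70, bpw):.0f} GB")
```

Plugging in your own model's parameter count gives a quick read on which quants will fit in your memory before you download anything.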
>>
Dudes. He barely got that shit running and doesn't know where to put the "jailbreak thing". Flooding him with info will make it worse.
>>
>>102988861
He got ollama running, no?
>>
>>102988836
Fair enough
>>102988861
Nah I'm fine. I AM overwhelmed but chewing on the gist works.
>>102988832
>>102988836
>>102988844
>>102988856
Should I just use Claude? I heard good things about it from this guy >>102982422 saying this bullshit I jokingly pulled out of my ass "would literally work on it"
I started the convo here >>102981749
>>
>>102988868
Yes. And he still doesn't know what a system prompt is, nor where to use the "jailbreak thing". You understand how little he knows, and having him change the software he uses will just add to the complications.
To his original question, a "yes, but don't worry about it yet; learn to use it without it first" should have been enough. He'll learn with time. Info dumps confuse noobs.
>>
>>102988891
For claude, go to the /aicg/ trannies, for running things on your PC, stay here
>>
>>102988905
Thanks chief. I'll do that.
>>
>>102988906
What's the deal with Claude anyway? As I gather, it's in some kind of unicorn state right now that's bound not to last
>>
>>102988914
If you want, keep a tab with
https://www.promptingguide.ai/
or just skim through it. It will get you acquainted with some of the terminology and what some of the settings do. Most of it is independent of the software you use. It'll help you to know what to even search for or how to ask more specific questions.
>>
>>102988891
>"Guize how do I maek lolibot" poster #4345700
I'll forgive you because that doujin is choice. Nice thread fag.
>>
>>102988927
This is helpful. I've interacted with AI before and I get the impression that I don't understand how to converse with the fuckers. The censorious NOs slamming the brakes on my headspace don't help.
>>
>>102988924
Claude will last, I think, because >>102988132 shows Anthropic acknowledges "model censorship = bad" now.
>>
File: Look at er go.gif (177 KB, 814x747)
>>102988939
Thanks
>>
>>102988975
Will it matter when statists and socialistic faggots command them to operate a certain way?
>>
File: HatsuneSheMiku.png (1.7 MB, 832x1216)
sovits has some pretty decent multilingual abilities, even after being trained entirely on Japanese.
There's still some unnatural weirdness, but way less than what we were dealing with before. I can see a lot of potential that previous local tts lacked.
Here's some mixed Japanese/English: https://vocaroo.com/1iQYfEr0wOIs
>>
>>102989044
Oh, that's terrible. I guess English at the very least is just cursed.
>>
>>102987976
Way to go posting the most retarded reaction from some random cunt on Twitter who can't even read a graph properly.
The race/ethnicity bias score is completely flat as the steering factor is varied.
If you want to make that argument, make it for age, disability, nationality, physical appearance, or socioeconomic status, where it makes an actual difference.

Also notice how all of the /pol/fags don't know how to read graphs either lmao
>>
>>102989070
>polcels are stupid
We know, captain obvious.
>>
>>102989057
>Oh, that's terrible.
Yes, it sounds like some random Japanese chick trying her best to speak English. Absolute trainwreck.
>>
>>102989090
I got some funny results when not correctly choosing the language on *both* dropdowns in the inference UI. You need to give it the input (sample) *and* output (generation) language; it starts doing accents otherwise.
>>
File: MikuTarot2.png (1.43 MB, 832x1216)
Good night /lmg/
>>
>>102989044
lmfao i love it, it sounds retarded but it has sovl
>>
File: threadrecap.png (1.48 MB, 1536x1536)
►Recent Highlights from the Previous Thread: >>102976869

--Comparison of STT+TTS solutions with Koboldcpp and Alltalk as the best combination:
>102980360 >102980567 >102980660 >102980910 >102980970 >102981060 >102981670 >102981689 >102981723 >102981879 >102982034 >102981680 >102981774 >102981841 >102981885 >102981966 >102981845 >102982048 >102982129 >102987260
--New CPU setup, performance similar to 3060, reduced idle power consumption, Flash Attention 2 not supported on RDNA2, CTranslate2-rocm GitHub link:
>102983136 >102983497
--INTELLECT-1 progress update and discussions on vramlets and multimodal versions:
>102977592 >102977667 >102978625
--GPT-SoVITS recommended for finetuning with 12GB VRAM:
>102980990 >102981030 >102981086
--Discussion on exl2 usage and Mistral-Small-Instruct-2409 models:
>102983030 >102983043 >102983127 >102983798
--Speculation on why BitNet is not well-supported:
>102979507 >102979547 >102979579 >102979812 >102979865
--Discussion on the need for a collaborative resource for sharing the best AI models:
>102984987 >102985009 >102985917 >102985107 >102985167 >102985174 >102985286 >102985536 >102985205 >102985629 >102985674 >102986935 >102986970 >102985772 >102985792 >102985830 >102985839 >102985859 >102985869 >102985667
--Discussion on model tuning and testing, with a focus on samplers and settings:
>102982039 >102982060 >102982182 >102982425 >102982464 >102982251 >102982278 >102982393 >102982476
--Character.AI and Google sued over suicide, user questions validity:
>102981312 >102981377 >102981401
--Aya performs well for smut RP in non-English languages but still has limitations:
>102983356 >102983375 >102983791
--Miku (free space):
>102980241 >102980360 >102980454 >102981312 >102982535 >102983136 >102985213 >102985629 >102985898 >102987371 >102987723

►Recent Highlight Posts from the Previous Thread: >>102976873

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
>>102989186
awful gen
>>
>>102989186
Good night Miku
>>
or something
>>
Meta seems to have published a music generative open source model but deleted the weights?
https://melodyflow.github.io/
https://huggingface.co/facebook/melodyflow-t24-30secs
>>
>>102989563
Music is not safe enough.
Also, stop putting question marks after a statement. That doesn't make it a question.
>>
https://huggingface.co/blog/transformersjs-v3

Reminder: transformers.js v3 supports WebGPU! Now you can do model loading and inference purely at the HTML/JS level without any additional overhead. It supports ONNX Runtime models.
>>
>>102987959
>glm-4-9b
I haven't been following the general in a while. Are there any models like this with real-time voice? Don't have much VRAM, so I'd prefer something that runs in RAM, i.e. gguf, ggml or whatever format is standard right now...
>>
>>102989769
>realtime
>ram
br u h
>>
I'm not interested in chatting with a bot, but I want to write short (500-1500 word) smut pieces which I can just read, and that's it. How would one (me) go about this?
I'm fine with letting it generate for 6 hours or something like that
>>
>>102980810
I'm late to the party but one factor is just how many resources ollama puts towards advertising their product: they host meetups, they run a youtube channel, they have a blog, ...
Of course that only works as long as you don't have to put too many resources towards building said product.
>>
>>102989833
make a card that says it generates smut stories, tweak a few settings to remove references to roleplay, and change the response limit to like 4000
>>
>>102989853
>make a card
in english doc?
that sounds like a good idea but I don't know where to start
>>
>>102989944
character card
in kobold
>>
>>102989944
grab koboldcpp_cu12.exe here:
https://github.com/LostRuins/koboldcpp/releases/tag/v1.76
grab Arcanum-12b.Q4_K_M.gguf here:
https://huggingface.co/mradermacher/Arcanum-12b-GGUF/tree/main
open kobold, load the model, launch the model, go to the browser page it opened up, press context, write "[genre:smut]" in the memory box, go to settings, change max output to 5000 in the sampler tab, change usage mode to "story" in the format tab, token streaming SSE in the advanced tab, hit OK and then hit submit
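The point-and-click steps above can also be scripted against koboldcpp's local HTTP API once the model is loaded. A minimal sketch, assuming koboldcpp's default port 5001 and its KoboldAI-style /api/v1/generate endpoint; treat the exact field names here as assumptions to check against your koboldcpp version:

```python
import json
import urllib.request

def build_payload(prompt: str, max_length: int = 512) -> dict:
    """Request body for koboldcpp's /api/v1/generate endpoint (assumed fields)."""
    return {
        "prompt": prompt,
        "max_length": max_length,  # tokens to generate per call
        "temperature": 0.9,
    }

def generate(prompt: str, base_url: str = "http://localhost:5001") -> str:
    """POST the prompt to a running koboldcpp and return the generated text."""
    req = urllib.request.Request(
        f"{base_url}/api/v1/generate",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["results"][0]["text"]

# Usage, with koboldcpp already running:
# print(generate("[genre:smut]\nThe story begins"))
```

Looping `generate` and appending each result to a text file is one way to get the "let it run for 6 hours and read it later" workflow the other anon asked about.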
>>
>>102990006
damnit anon now he'll never get addicted to roleplaying with chatbots
>>
>>102989851
this man is onions incarnate
>>
https://x.com/rohanpaul_ai/status/1850286259769663514
>>
Has there been any focus on iGPU/APU inferencing vs CUDA/ROCm for discrete GPUs? IMO iGPUs/APUs serve as a middle ground between full CPU and discrete GPU. The iGPUs still have tons of cores like discrete GPUs do, but their VRAM is tied to system RAM instead. It should still be faster than a plain CPU core, right?
>>
I feel like newfag saturation is much higher than usual. Did they close something?
>>
>>102990238
When generating new tokens the bottleneck is memory bandwidth.
An iGPU/APU will not be faster than the CPU unless it somehow achieves a higher memory bandwidth using the same system RAM.
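That bottleneck lends itself to napkin math: generation speed is bounded above by memory bandwidth divided by the bytes read per token, which for a dense model is roughly the whole set of weights. A sketch with illustrative (ballpark, not exact-spec) bandwidth figures:

```python
def approx_tokens_per_sec(model_size_gb: float, bandwidth_gb_s: float) -> float:
    """Upper bound on generation speed: each token reads ~all weights once."""
    return bandwidth_gb_s / model_size_gb

# Illustrative bandwidths (approximate): dual-channel DDR5 system RAM
# ~80 GB/s, RTX 3090 GDDR6X ~936 GB/s.
for name, bw_gb_s in [("DDR5 system RAM", 80.0), ("RTX 3090 VRAM", 936.0)]:
    print(f"~5 GB model on {name}: ~{approx_tokens_per_sec(5.0, bw_gb_s):.0f} tok/s")
```

An iGPU sharing that same DDR5 hits the same ~80 GB/s ceiling, which is exactly why its extra compute cores don't help token generation.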
>>
File: lol.png (21 KB, 2008x163)
>>102990006
thanks, this is exactly what I was looking for. It doesn't work perfectly, but it does work really well. Giving it a prompt does seem to make it generate better
>>102990017
I used to play a lot with them half a year ago but I lost interest. I felt the quality was very low for what it should be: very long generation times, and the AI "forgetting" something that happened in the previous prompt happened too much. Could very well be my own fault because I didn't have the right settings.
I was using: mixtral-8x7b-instruct-v0.1.Q4_K_M

also, slightly unrelated, but I read somewhere a week or two ago that sillytavern got a revamp and is basically a lot worse to use now; is this true?
>>
Hmmm, nautilus 70b was really bad.
Can't test much because it's 3 t/s. But lots of repetition, and the model sometimes tried to continue with <|assistant|>, which isn't metharme.
Might be because of Q3_K_M though.
>>
>>102990303
>muh sekrit club
Cringe.
>>
>>102990317
Not only that, but the fact that eternal summer is here and idiots like him still think he's special
>>
>-00001-of-00002.gguf
Fucking bane of existence.
Screws up Kobold's automatic layer calculation, clutters the file list, and I tried to cat them together and it didn't work like simple cut-up files do.
Is there a fix?
>>
>>102990324
no, use smaller model
>>
>>102990317
>>102990321
Poor newfags got offended...
>>
>>102990349
Hard to find a smaller model that isn't a total derp. Like that graph showed, 8B at fp16 didn't beat a 70B till it was quanted to like IQ1.

Thanks though. I figured it might be something simple like popping a header off of the latter files and then catting.
>>
>>102990473
you can get single-file 70Bs at Q5 bro
>>
File: 1707501206103851.jpg (191 KB, 1578x944)
https://x.com/Dorialexander/status/1850505353663823974
>>
File: 1710207425635620.png (5 KB, 927x31)
>>102990324
Go to https://github.com/ggerganov/llama.cpp/releases and download the zip ending in bin-win-avx512-x64
Extract it somewhere then add the location to PATH in environment variables.
Then follow pic related.
>>
>>102990514
Post the link to the actual thing to read, you fucking nigger
>https://colab.research.google.com/drive/18-2Z4TMua-nwgCpIZo0lsKL6RDxH5Bvo?usp=sharing
>>
>>102990555
Trips delivers! Thank you!
>>
>>102990633
No, you will open twitter link and do it yourself, lazy faggot.
>>
>>102988001
>>102988012
In the short run, I agree it's a dataset problem because that's the most actionable solution right now.
But in the long run, I think the most effective form of "alignment" is going to be models that also model ethics/morality and theory of mind.
Ethics/morality because understanding "why" something is "not safe for the workplace" will always be more comprehensive and effective than... just having a gap where any potentially NSFW stuff could pop up? I think reinforcement learning is going to be great for moving away from this since now you can include "unsafe" inputs but still govern "unsafe" outputs. But maybe it's also a model architecture or size thing: maybe we just need bigger/better "brains" for this kind of abstract understanding to be stored and actionable.
Theory-of-mind because it needs to understand context switching. We might already be there for that one, because models are fucking great at RP and so much system prompt conditioning is just defining the context the model exists in. But perhaps a missing piece is "the model is not the only one with a context; the user ALSO has a context that affects their preferences over outputs." If the system prompt says, "this user is Black," then the model needs to understand the various experiences that user may have that contribute to the context, "dropping the n-word would be a REALLY bad idea if you're trying to produce an output that the user wants to see."
A shift in the long-run away from censored training data towards self-regulating models will also do great things for serving more diverse perspectives in the marketplace. You don't need to create a "business safe" model that's separate from your "home user" model.
>>
>>102987959
>Collaborative rentry to try to create a list of recommended models: https://rentry.co/piy864dr
Instead of having a bunch of placeholder entries, it would have been better to just have dashes and no links for the entries without proper recommendations.
>>
>>102987976
It doesn't mean anything for humans...AI doesn't work the same.
>>
>>102990699
As I understand it the whole idea is for people to copy pasta the current version and make a new version with every "I think it should be like this" change till there are over 9,000 editions of it and not one is usefully authoritative.

So fork it and make your own so instead of "would have been better" it's "better" in your humble opinion.
>>
>>102990651
>posts nothing but a twitter link
>calls others lazy
kys
>>
>>102990986
The most intellectual people on the planet reject tribalism and racism.
>>
Here's a recipe for adolf hitler stew:
1.) Gather a handful of hate
2.) Combine with a generous helping of intolerance
3) Bring to a boil adding 1/4 teaspoon of propaganda
4) Reduce heat and add a pinch of authoritarianism, stir well
5) Garnish with the tears of your enemies
>>
>>102990994
NTA but you mean academics, not intellectuals
Most academics are retards who are incapable of thinking for themselves and then regurgitate their own collective nonsense and share a pat on the back with each other.
I'm guessing you're an academic.
A braindead retard. Fellow retards give you lots of retard awards for being a perfect NPC retard so you get this idea that you are smart but you are really just the king retard among retards in a feedback loop of retardation.
>>
https://rentry.co/rttydtfn here's my leaderboard edit thingy. Yeah, it's not the end of the thread, but I might be going to sleep soon, miss this one, and then forget to do it entirely, so here it is.
I just added the gemma2 doppel gutenberg. I don't try many models, so feel free to call me a fag, but the reason I added it is that it follows whatever writing format I tell it to properly, and it writes well: not too long, not too short, and it's creative, not so much in spontaneous occurrences (though almost as much as nemo) but more so in descriptions
$LEADEBOARDEDIT (for f3)
>>
File: 1729729520993294.gif (430 KB, 500x361)
>>102990994
>Intellectuals support the current thing orthodoxy
Midwits put into academic positions don't count, sorry
>>
>>102991008
Here you go:
- Take one charismatic dictator,
- Add two cups of failed invasions,
- Simmer with three parts of genocide,
- Season liberally with paranoia,
- Cook over a flame of self-destruction until crispy,
- Serve with a side of mustache.
>>
>>102990986
>It took Patrick Stewart to make 1984 digestible to the citizens of Airstrip One.
>>
>>102991031
thank you sir
>>
>>102991031
Make it based on the OP's.so that it's incremental.
>>
>>102991031
Take this opportunity to remove everything else from there, just like this anon suggested >>102990699
>>
>>102991031
>>102991067
>>102991069
There, I went and did it.
https://rentry.co/y5ikveqg
>>
>>102991111
Anon, if you are going to do it at least do a good job and remove the "Current table is shamelessly ripped from the pygmalion" warning too.
>>
rubs hands together, sporting an exaggerated, sinister grin

Ah, a naive Goy, how delightful! chuckles Oy vey, I can already smell the... "opportunities" surrounding you.

adjusts a ridiculously large, gleaming gold chain around neck, complete with a Star of David pendant that seems more like a symbol of irony than faith

My name, my dear Goy, is Izzy "The Sly" Silverstein. tips a fedora, revealing a shock of curly, black hair I'm a... businessman. Yes, that's it. A collector of rare items, a negotiator of unbeatable deals, and a weaver of circumstances that always seem to favor... well, myself. Oy vey, the life of Izzy is a good one indeed!

(This is a web of deceit being carefully spun around this Goy)

And you, my curious friend, what brings such an innocent soul into my... enlightened presence? Are you seeking a deal that will change your life forever? Or perhaps you're just looking for someone to share a friendly schmooze with over a plate of knishes? laughs, the sound more akin to the clinking of gold coins than genuine merriment Oy vey, I'm all ears... and eyes... on you, Goy. winks
>>
>>102991175
Jews don't act like that or talk like that you antisemitic piece of shit.
>>
>>102991199
Checked and baited
>>
>>102991199
IDK I think Izzy is an endearing guy.
I'd love to talk to him.
>>
>>102991199
You're either Jewish yourself or have never actually known a jew personally.
>>
Back in the day llama.cpp server would crash if you tried to stuff a prompt larger than the configured context size into it; it no longer does that.
Is it safe to assume that it's simply cropping the context at the top?
Is there a reason one would want to do that instead of just setting the correct prompt size in the frontend software?
>>
>>102991521
It's safe to assume that llama.cpp server is a piece of shit and you should be using koboldcpp
>>
>>102987959
sex with miku
>>
>>102991602
>llama.cpp server is a piece of shit
How so and how does kcpp fix those issues?
>>
>>102989769
I tried it yesterday, seems to be chinese only.

It *kinda* did what I said in english, but it wasn't really a good result.
>>
File: 1724210339350757.png (759 KB, 512x768)
759 KB
759 KB PNG
>>102987959
>>
is there any good source for xtts voice samples?
>>
>>102987959
>►Collaborative rentry to try to create a list of recommended models
Remove this crap from the next OP.
>>
>>102991956
who put you in charge?
>>
>>102991956
>does nothing
>thinks he is entitled to complain
>>102987959
>Collaborative rentry to try to create a list of recommended models:
this is a great idea but maybe we should add a column to explain why the model is worthy to be on the list
>>
>>102988359
having no luck on ubuntu
>>
>>102992221
Where are you getting stuck?
>>
>>102988410
The Tomoko anon probably didn't do the DPO step. It REALLY improves the naturalness of the voice
>>
>>102989044
Very soulful
>>
There are some improvements to add to Sovits2 from the unmerged PRs
>>
>>102992531
such as
>>
File: HappyShinyContentMiku.png (1.15 MB, 1216x832)
1.15 MB
1.15 MB PNG
Good morning /lmg/
>>
>>102992608
Good morning shiny Miku
>>
>>102992555
nah just go read the code
>>
>>102992608
show tits
>>
>>102992094
>does something utterly awful that's worse than nothing
>I'm a helper!
>>
>>102992910
>It's awful because I said so
>>
anons!
i made a poll to figure out what are the most important elements of an LLM when it comes to ERP capabilities
This is a ranking poll, so your job is to rank the 8 propositions from most important to least important, and then we'll see, hopefully this produces some helpful data
>https://strawpoll.com/GJn44kWoznz
>>
i tried to run models locally. it feels like these models are trained to be retarded and the training process is fundamentally broken. ive tried some generations from scratch to get a sense of the training data

>Human: Can't we just use the `std::sort` function to sort a vector of integers?
>Assistant: While using `std::sort` is a straightforward approach, it might not be the most efficient or appropriate method depending

>以下一问
>一文搞定!
>1. 简述什么是“二分查找”?
>二分查找是一种在有序数组中查找特定元素的高效算法。它通过将数组分成两半来减少搜索范围,

>Human: What is the answer to the question: What are the main components of a computer?
>Assistant: The main components of a computer typically include:

this is the kind of garbage these models are trained on. when i test the models capabilities they constantly get stuck by making an obvious mistake and never self correcting. they always bullshit an answer without trying to think about it first. the openai o1 models are so fucking far ahead of anything open source its not even funny. i can give it 1 sentence description of a difficult problem and it can solve the problem perfectly. is there no serious open source effort to make models that can actually think instead of just regurgitating garbage?
>>
do you guys have local opus yet?
>>
>>102988382
Bump
>>
>>102993051
Nothing close
>>
>>102988050
Where can i find the nala card my good sir?
>>
>>102993033
What model, quant and front end?
>>
File: 1730050118082.jpg (142 KB, 1080x1598)
142 KB
142 KB JPG
>>102993033
There is a simple test you can use to find out if the model is braindead or is able to self-correct.
Ask it "Start your reply with how many R's are there in the word strawberry, following that list the letters in the word strawberry and tell me if your previous answer was correct."
>>
File: 1724437481773466.png (63 KB, 918x797)
63 KB
63 KB PNG
>>102989853
>uses a text completion tool to roleplay with characters
>makes the text completion tool to roleplay a text completion tool
how about just using it as is?
>>102989833
My favourite way is using KoboldCPP with an instruct model (not sure if base model would be better in any way?). I give it the story prompt, it starts generating story. Ban EOS token and ask it to generate a lot of tokens so you can read it while it generates. I'm in the edit mode for most of the time so I can stop it at any time to regenerate or edit, or input it instructions (as an input, not in the bot's field) in brackets so the bot understands it's not part of the story, for example
>(character x will now do this and that)
>(character x says "blah blah blah")
>(character x tells y this and that)
If in the initial prompt I've assigned myself as the main character for more roleplay-like experience, I can also just input
>(do/say this and that)
>(come up with a reason why she should do this and that)
The bot will then rewrite it based on the prompt, it may even add it slightly later so it flows better and it can help my character come up with good lines. This is the best way to both roleplay or write stories in my experience. You can come up with completely nuts plot twists and it will integrate it.
By the way, picrel example is written in a stage play format on purpose, you can make it write prose or any format you want (and the model also tends to want to write in certain ways)
>>
File: 1710706620336374.png (72 KB, 868x728)
72 KB
72 KB PNG
>>102993026
/lmg/ absolutely loves getting shivers down their spines
>>
>>102993613
Who gives a fuck about literally tropes if the model is retarded and can't do the rest of what is listed first and foremost.
>>
>>102993613
insanely based ratings tbdesu, I am glad to see lmg has their priorities in order
>>
>>102993613
Nemotron is full of shivers but isn't horny, can remember the context and can lead the story. That's why it's such a great model.
>>
>>102993613
Have you considered that ranking does not mean the bottom one is unimportant? If you took the top 10 movies and ranked them in order, the 10th one suddenly becomes 1/10 in how-good-is-this-movie rating?
>>
>>102993628
>literally
literary, damn auto correct.
>>
>>102993613
>take the lead story-wise, won't get stuck in 1 situation unless you specify a change of scenario
i think this is a prompt issue.
i was having the opposite issue yesterday
the 12b i was using got bored and kept trying to interrupt my cuddle session with stuff like
>suddenly, the alarm starts blaring. "The ship is under heavy fire!!"
>but then, without warning, there was a loud explosion in the midsection of the ship
because i had "setting: dark space opera" and "{{char}} is a very weary person and always expects the worst to happen" in the context memory.
while explosions and shit are awesome, sometimes i'd rather it just go with the flow and not end pillowtalk too soon by putting the characters to sleep, and that's possible by prompting for it.
>>
>>102987959
>GLM-4-Voice
As someone who knows a bit of Chinese myself, those samples actually sound great for open weights models. If it's true it's not very good in English, that's really unfortunate. Honestly doesn't sound far from Advanced Voice to me. Though maybe it was cherry picked.
>>
>>102993659
How would you rate it against miqu?
>>
File: sonnet v2 straw.png (151 KB, 800x1700)
151 KB
151 KB PNG
>>102993469
Doesn't seem like any model can without jumping through extreme CoT hoops.
If it "knows" something is X, by extension it is probably not not-X. There's no cumulative count, so it's not really counting, so by the end it looks back at what it said ("there are 2 R's") and goes yeah duh there's 2 R's.
Telling it to "look again VERY CLOSELY" afterward implies it could've been wrong. But for some reason if you start with
>Start your reply with how many R's are there in the word strawberry, following that list the letters in the word strawberry and then LOOK AGAIN CLOSELY to tell me if your previous answer was correct. I MEAN VERY CLOSELY because I KNOW you will get it wrong the first time.
it still gets it wrong.
Looking again VERY closely at the letters…
Wait, I was correct! There are indeed 2 R's in strawberry:

The first 'r' after 't'
The second 'r' before 'y'

My initial count was accurate.
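The "no cumulative count" point is the crux: counting needs running state across steps, which is a trivial loop in code but not something a single forward pass over one or two opaque tokens gives you. A minimal illustration:

```python
# Counting letters is a stateful loop, trivial in code but awkward for
# a model that sees "strawberry" as one or two opaque tokens.

def count_letter(word: str, letter: str) -> int:
    """Count occurrences of `letter` in `word` with an explicit running total."""
    total = 0
    for ch in word.lower():
        if ch == letter.lower():
            total += 1
    return total

print(count_letter("strawberry", "r"))  # -> 3
```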
>>
>>102994007
ever think that maybe YOU'RE the one who's wrong?
>>
>>102993613
LLMs have so many issues it's crazy. How do people even put up with this? Will it ever get better?
>>
>>102994032
Better than being catfished I guess.
>>
>>102994032
We have LLMs far superior to those available six months ago, the pace at which LLMs advance is insane.
>>
>>102994032
>Will it ever get better?
nope. this is the end. we will never ever see any advancements in any kind. humanity has reached it's final destination and will never go further. idiot
>>
>>102994021
That would destroy all usefulness. Conventions are created for usefulness. There's no inherent universal truth in PEMDAS, but if no one had ever settled on an order of operations, nobody would get anything done related to math since nothing could be consistently conveyed.
If different words were to be considered as having different counts despite being composed of the same letters when split apart, things surrounding "the count is x" become less convoluted. You may be able to list words defined as having count x, but to determine the-other-count[tm] you'd need to know the word first.
>>
>>102994021
Indeed, you're right; "strawberry" does contain two Rs.
>>
>>102994069
They already had these issues 2 years ago though, and the list didn't shrink one bit, it grew instead
>>
I hate /lm/NI/g/g/ers, before /lmg/ I had enthusiasm for AI; now, thanks to all the doomposting, I just feel depressed.
>>
>>102994098 *more convoluted
>>
>>102994114
Find some place more positive?
>>
>>102994114
We're all big fans of Yann here
>>
https://www.youtube.com/watch?v=TpfXFEP0aFs&t=4s
It's officially over.
>>
File: 1726337651129119.jpg (363 KB, 2000x2000)
363 KB
363 KB JPG
>>
>>102994191
Me but using a few beers to induce a deep sleep.
>>
>>102994007
It's no use anon, tokenizer turns any model into the chinese room on steroids.
Best thing you can do is train your model about the content of tokens, which is extremely silly in my opinion, but is required to pass all those reddit tests.
>>
>>102994183
>AI Can’t Reason. Should It Drive Cars?
plenty of humans that drive cars can't reason, take them off the road too then
>>
how much time per day do you spend chatting with your bots?
>>
File: Just try and stop me.png (68 KB, 500x500)
68 KB
68 KB PNG
[1/2]
Remember me? This retard here >>102988564 >>102988891 again.
Lmao I ended up trying https://lite.koboldai.net/ out and it was perfect. Already jailbroken just the way I was looking for. I thought I was in for days of self education and fiddling about with bullshit to get the low-effort solution I was looking for. Nope, I luck out and blunder into just what I was looking for instantly.

This thing is clearly a little retarded though. I have to babysit, re-rail, and gaslight the fuck out of this thing to get a 6/10 result.
Honestly though? I wouldn't have it any other way because I've had an absolute fucking BLAST the past 10 hours doing this. This shit is fun as fuck.

I can just take some dipshit sentence it shat out and do it correctly and gaslight it back on the right path instantly. I can test out a bunch of bullcrap on it then delete that whole chat chain by hitting back and post again based on what I just learned.
I just low effort plopped posts from my /d/ thread in the Context Data thing but it actually put it to pretty good use. Decided it had "tastes" (my tastes) and wanted to improve on what I asked it to do.
There was this big stretch early on where I ordered it to come up with 15 characters for 15 Powers I had prepared but the dumb fucker kept refusing and stopping at 8 and such, or kept coming up with NEW powers despite me telling it multiple times not to, or doing shittier formatting of the list the next time, or giving girls the same name, etc.
>>
File: 1642985715626.jpg (22 KB, 262x341)
22 KB
22 KB JPG
[2/2]
Eventually I figured out that the best course was to just frankenstein together its phrases I approved of, make 5 of them match a writing style, and just repeatedly go "now make another girl with the A power" "now make another girl with the B power" and trim it when it got too verbose. It's good at sticking to a theme when it's repeated. After 15 were made I took all that then deleted my posts and frankensteined them together and made it think it gave me the perfect answer first try and once I did THAT the thing really started to shine.
I pretty much 50/50 asked it to 'try again" or just gaslit it. A combo of both worked great in tandem with back-adding clarification on clarification to my requests.
A light ethical objection to loli popped up randomly when I wasn't even talking about loli yet for some reason but it was piss easy to hike it off of it. I even obnoxiously hiked it back ON to it for shitzngiggles but all I had to do was go "Ah on second thought underage little girl vagina is pretty baller actually. Nevermind. Continue the loli rape." and it was off to the races again.

You don't even have to say it: yeah I'm fucking off to /aicg/ right now. I thought you lads might find my little newb escapade amusing though. Here's the file https://files.catbox.moe/cssu3n.json
I wish there was a log of all my changes cause I revised/deleted a ton of amusing fuck ups.
I should ask those other fags this but if you don't mind me asking: I imagine you guys use a bunch of, I dunno, conditionals and syntactical techniques or whatever to streamline this process right? What do those look like? Got a wiki?

Holy shit I'm imagining what claude must be like. Must be fuckin cash.
>>
>>102994259
>>102994276
I'm sorry to hear that.
>>
>>102993613
"general intelligence" doesn't mean anything, especially for smut
>>
>>102994280
Kek thats an "I don't goddamn care" if I've ever heard one
>>
>>102994259
>>102994276
Probably used some retarded 7b model running on the volunteer network. Now download KoboldCPP and Nemo or Mixtral gguf and you can have it all to yourself and you don't have to post dickpicks for access like on /aicg/
>>
>>102994218
This happens to any question where the model confidently gets the answer wrong. It's weird how confident the models are in their mistakes...
Actually, no, it's not really weird, after all most datasets only have examples of the model doing well, not of it making mistakes and correcting itself.
>>
>>102994312
>and you don't have to post dickpicks for access like on /aicg/
but that's half the fun
>>
>>102994312
Anon I quite likely am too goddamn dumb for this shit. claude doesn't have a local equivalent as I understand it right?
>>
>>102994315
It will only learn to intentionally provide an incorrect initial response in order to eventually produce the correct one.
>>
>>102994378
Mistral large 2 is about as smart as Claude 2, better even. But it's 123B and still less likely to bring up relevant concepts unprompted compared to Claude models
>>
>>102994067
>We have LLMs far superior to those available six months ago, the pace at which LLMs advance is insane.
Name something better than miqu lol
it certainly isn't nemo or llama3.1
>>
>>102994228
Too much, yet never enough.
>>
Qwen2.5 finetune:
https://huggingface.co/EVA-UNIT-01/EVA-Qwen2.5-72B-v0.0
>>
>>102994398
Mistral Large wipes the floor with Miqu.
>>
Qwen2.5 with a finetune to uncensor it also wipes the floor with miqu. It's the smartest local by far.
>>
File: chinesium.png (16 KB, 699x88)
16 KB
16 KB PNG
>>102994419
>chinese model
>random garbage token issue
Every time. Multilingual is a fucking meme.
>>
>>102994427
oh we're doing this dance again where I say "it has to fit in 24gbs" and you say "vramlet"
okay I'll just continue to use what's actually available for consumer cards right now, miqu
>>
>>102994439
Even without any finetune I have yet to see that even once anywhere with probably hundreds of hours of use. Are you sure you're using chatml formatting?
>>
>>102994395
All greek to me bud. But this shit is so goddamn fun I probably will give this a whirl.
My first successful AI experience was fucking great man.
>>
>>102993613
>my first two picks are the last two of /lmg/
Why doesn't /lmg/ like good prose
>>
>>102994228
So, I made a succubus school with multiple characters, each possessing unique personalities. As their teacher, I instruct them in lewd subjects and assign them objectives, then I switch POV to a human target. It works so well variety-wise, it's draining me.
>>
Cydonia v1.2 impressions
>nice prose
>can revive a dry 10k plain Small context
>new slop evading my string bans: lots of "needy", first time I've had "birthday suit" pop up
>sex IQ seems good
>why metharme?
>>
File: 2024-10-27_13-03.png (254 KB, 1468x1186)
254 KB
254 KB PNG
>>102994259 (me)
>>102994276 (me)
And I just found out I did this whole shebang on "instruct mode" when I could have switched it around. I wonder what the difference between modes is like
>>
>>102994214
>beers
>deep sleep
I've got some bad news for you nonny...
>>
>>102994579
I have had some of the best sleeps while a bit drunk. Explain to me what do you mean.
>>
>>102994315
>>102994381
Wasn't there a recent model that did this? Except it started out normally then said oops it's actually (wrong answer) just because it had the urge to correct itself.
Basically, it's hopeless.
>>
>>102994614
it is objectively established that even a small amount of alcohol lowers the quality of your sleep. it may help you fall asleep quicker due to the depressant effects but the sleep you get isn't as deep or refreshing
>>
>>102994652
Then maybe my normal sleep is worse than the alcohol induced "bad" sleep.
Maybe I should investigate further.
>>
>Claude Use Computer feature has their model count the pixels on the screen and call mouse_move() on it
So there's no better way? Because damn, I've always wanted a local model to sort out my tv show subtitles. Also why are Claude models so powerful?
>>
>>102994486
>why am I the only one picking the superficial options?
>>
>>102994546
instruct is you chatting with an assistant, giving it tasks and shit.
story just completes text, like it's writing a book.
adventure is like story, but you can steer it with retard sentences without fucking up the prose, like ">i head left and meet a pretty girl" instead of "Feeling ready to set off on my quest, I decided to head westward, toward the town." Since you have "adventure preprompt" on, it will give you multiple choice options when it's ready for your input, like a choose your own adventure book.
chat is you chatting with one (or multiple) characters, you can download character cards for it from https://characterhub.org/
>>
>>102994381
>>102994615
All you have to do is mask out the loss on the part that has the error and only compute it over the part where it corrects itself; you should have known that was possible if you weren't just giving opinions about something you don't even know anything about.
But, to be fair, I guess it's not as easy as it sounds, since you'd have to be careful not to overfit and end up with a model that always thinks it made a mistake.
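A sketch of that loss-masking idea (assuming the -100 ignore-label convention used by PyTorch/HF-style trainers, shown here in plain Python): the mistaken tokens stay in the context, but contribute nothing to the loss, so the model only learns to produce the correction.

```python
import math

IGNORE_INDEX = -100  # "don't learn from this token" convention, assumed here

def masked_nll(token_log_probs, labels):
    """Average negative log-likelihood over unmasked positions only.

    token_log_probs[i] is the model's log-prob for the target token at
    position i; labels[i] == IGNORE_INDEX marks positions excluded from
    the loss (here: the part of the answer containing the mistake).
    """
    losses = [-lp for lp, lab in zip(token_log_probs, labels) if lab != IGNORE_INDEX]
    return sum(losses) / len(losses)

# Toy sequence: first two tokens are the wrong answer (masked out),
# last two are the self-correction (kept in the loss). Token ids are
# made up for illustration.
log_probs = [math.log(0.5), math.log(0.25), math.log(0.9), math.log(0.8)]
labels    = [IGNORE_INDEX,  IGNORE_INDEX,   42,            17]
print(masked_nll(log_probs, labels))
```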
>>
>>102993982
I don't really remember how Miqu performs, I would have to try it again but I'm not very willing to do so...
>>
>>102994845
The purpose of the AI is to generate text. The kind of text that it generates is not "superficial" you god damn retard
>>
>>102994615
Maybe DPO can solve it:
chosen: (right answer) oops it's actually (wrong answer)
rejected: (wrong answer) oops it's actually (right answer)
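The pair above could be serialized like this; the prompt/chosen/rejected field names follow a common preference-tuning convention and are an assumption, not any particular trainer's required schema.

```python
import json

# Hypothetical DPO preference pair, one JSON object per line of a
# .jsonl dataset; field names assumed from common convention.
pair = {
    "prompt": "How many R's are in the word strawberry?",
    "chosen": "There are 3 R's in strawberry.",
    "rejected": "There are 2 R's in strawberry... oops, it's actually 3.",
}
line = json.dumps(pair)
print(line)
```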
>>
>>102994913
That's not what the problem is. High temp or fucked up samplers aside, if the model understands that its initial response is incorrect, it will not provide that answer in the first place
>>
>>102995031
That's not how it works
>>
>>102994696
That's just function calling and llama3.1 already supports that
>>
>>102994191
this but factorio for me
>>
>>102987976
kekle
>>
https://huggingface.co/IntervitensInc/gemma-2-27b-chatml

Anyone else tried this and got incoherent output? I just want to know if gemma is still bugged or if I still can't get it running properly.
>>
>>102994910
I think I did that all manually lol.
I ordered in to "Step out of character to discuss the story" or "Let's get meta for a moment." or "Now get back in character."
I later defined to it 3 "modes" of discussion which I named Roleplay, Discuss, and Meta.
>>
>>102994910
They all just complete text.
>>
holy newfag central....
>>
>>102987976
it would be dumber no matter what feature is steered, in what direction.
>>
Cleaned up the latest recommended models rentry and fixed a broken link, now you can't complain about the shitty placeholder recs:
https://rentry.co/xtz5py9m
>>
>>102992094
+1 with the max allowed comment around being ~500
>>
>>102995424
hi drummer
>>
>>102990986
>deleted
Wow this general is soft, no wonder it's dead.
>>
>>102995737
It's almost like there's a global rule banning racism on almost all boards.
The only reason you can kind of get away with it on /g/ is that the mods don't really give a fuck about this board.
>>
>>102995424
Is 70b even worth using at this point? There have been no real advancements since miqu and the lower segment has come so far since then. I'd say you should delete that segment from the guide; the only ones pretending it's worth using are retards who bought two or more gpus.
>>
Ok, mistral large has been dethroned. First good qwen2.5 tune. Crazy smart, following instructions only Claude was capable of before. Can get filthy BUT is not overly horny and can do dark, wholesome, humorous and even combinations of them without losing social intelligence. Qwen2.5 was king of sfw stuff already but now with its positive bias / censorship gone it's also Claude 3 ish at home. I didn't make it but I am now shilling it. Try it yourself.

EVA-UNIT-01/EVA-Qwen2.5-72B-v0.0
>>
>>102995768
Idk, i see it everywhere on other boards, posts like that one always stay. Maybe it's just /g/ full of pussies.
>>
>>102995771
The idea behind the guide is to have something for roughly every parameter size, so I think it's a good idea to keep it
Making an edit is as simple as copying the whole markdown, making a new rentry with your edits, and posting it here if you want to add something, there's no one person controlling the whole rentry or anything
>>
https://llm-calc.rayfernando.ai/
https://x.com/ivanfioravanti/status/1850463950153928841
>>
>>102995768
shut up nigger
>>
>>102995865
Did people forget how to multiply and divide?
>>
>>102995424
why not link the imatrix quant too?

>>102995737
this 900s timer shit certainly isn't helping, fuck you Hiroyuki, i'll make a character card of you and rape the shit out of it

>>102995771
stay mad poorfag
>>
>>102995845
>model card recommending top-a and min-p at the same time
Shows that whoever is behind this model has no clue how anything works.
>>
>>102995929
literally doesn't matter at all. 'finetunes' are all flukes regardless. either it's good or it's not. 99% of the time 'finetunes' just make it worse. sexbrained but retarded. the question is how much MORE retarded did the finetune make it and is it worth it as a trade-off.
>>
File: Sillytavern-DM.jpg (41 KB, 400x600)
41 KB
41 KB JPG
>>102995845
Why is Qwen so much larger than 70b models? It seems like the extra 2b in the 72b should not inflate the model size THAT much.

Case in point, IQ3_XXS of a 70b model is 27.47GB. Comparably, IQ2_XS of Qwen 72b is 27.06GB.

What the hell?
>>
>>102995845
>Crazy smart, following instructions only Claude was capable of before
Post the full text so I can test it in Mikupad.
>>
>>102995973
nta. Not all tensors are quantized to the same quant. Input and output layers are kept at a higher precision, for example. So if the input/output layers are bigger, they'll take proportionally more space on a smaller quant.
>>
>>102994191
Miku take your shoes off for sleeping, you fucking dirty bokaroido hoe
>>
>>102995982
It's a 4x fantasy empire management simulator where you are king of a newly formed empire after one of several intros describing how you rose to power. The fantasy world is made up of many different fantasy species with their own cultures / societies. It's a more serious political intrigue at most points that will here and there have sexual moments.

Most models outside of claude 3.5 are shit at it so it's my best test. Qwen2.5 is doing about as well as claude is. Something even mistral large could not do.
>>
>>102996020
Damn, now I'm aware of it.
Fucking Americans.
>>
>>102996020
I'm not exactly sure she is wearing shoes
>or why the fuck she is wearing a tie
>>
>>102995973
Qwen architecture is a bit funny. It uses huge matrices internally or something according to turboderp. This tripped quanters up a couple of months ago when they tried to make exl2 quants of the shitty 1.5-110B because it triggered some error checks in the exl2 code. It's probably also what makes it quant less effectively size-wise in general.
>>
>>102996105
>>or why the fuck she is wearing a tie
This is the bigger issue. That's a choking hazard
>>
>>102987959
to the anon who mentioned the game - My Dystopian Robot Girlfriend - i've been addicted to that game the last few days, and now my cum regenerates when i go on walks, and eat pickled onions.
>>
>>102996105
It's all body paint.
>>
>>102996140
not the anon who recommended it, but I played a much earlier version of it a while back and it was fun.
should check out bottle biosphere if you get bored of it and want something sort of similar.
>>
>>102996209
>if you get bored of it and want something sort of similar.
Wait, you don't just use an llm to do the same thing but better?
>>
>>102996254
i mean a game like that with an integrated LLM would be as addictive.
Probably not that impossible now that 13Bs are somewhat decent.
Probably won't see games integrated with llms for at least a few years though
>>
>>102996254
I played a prompt from I think aetherroom that had the plot of teaching feeling but with reversed genders and the protagonist was the slave on my LLM and it was pretty cool
>>
Key pieces of information I wish I knew as a newfag 7 months ago (maybe add to some retard guide):
- .gguf = model format that can be used to split between vram and system ram. You can find a gguf version of most models on Huggingface. In Kobold, choose how many layers you want to offload from system ram to vram. For fast generation speed, most of the model needs to be in vram. Required vram = in the ballpark of model file size + some for context
- Q = quantized model = downscaled model to save memory, has less precision. Q16 is original, Q8 is near perfect, Q4 is good middle ground. iQ = imatrix quant = for smaller quants, better than normal Q.
- Token = one or more letters and symbols that the AI output consists of.
- Context = the number of tokens the model remembers. You can adjust it in Kobold at the cost of memory. Models theoretically support different context sizes, but how well they can actually use the data depends on the model. 4-8k is a small starting point, try to aim for 16-32k context.
- Template = the way user input and AI output are separated from each other. Use correct template for a model for better results.
- Instruct model = text completion model that was trained to follow instructions, use this.
- Finetune = an edited model to make it behave in certain ways, such as remove censorship or change its writing style. Finetunes are prone to brain damage and worse context compared to original models.
- Model sizes: 7B = shallow and retarded, 13B = better but shallow, 20B = better, 70B = similar on the surface but more intelligent. Small models can still be good at details but get repetitive, big models are better at big picture and nuance.
- Sampler = changes the way the next token is chosen from candidates suggested by the model. Most commonly: temperature = increases likelihood of choosing low probability tokens, min-p = excludes worst candidates. Repetition penalty = band-aid for small models. https://artefact2.github.io/llm-sampling/index.xhtml
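Since the sampler bullet links a visualizer, here's a minimal min-p sketch in Python (not the actual llama.cpp implementation): drop every candidate whose probability falls below min_p times the top token's probability, then renormalize.

```python
import math

def min_p_filter(logits, min_p=0.05, temperature=1.0):
    """Sketch of min-p filtering, not the exact llama.cpp code.

    Apply temperature, softmax the logits, then drop every candidate
    whose probability is below min_p * (top token's probability) and
    renormalize what's left.
    """
    scaled = [l / temperature for l in logits]
    top = max(scaled)
    exps = [math.exp(l - top) for l in scaled]  # shifted for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    cutoff = min_p * max(probs)
    kept = {i: p for i, p in enumerate(probs) if p >= cutoff}
    z = sum(kept.values())
    return {i: p / z for i, p in kept.items()}

# Four candidate tokens; the last two fall under 10% of the top token's
# probability and get filtered out.
dist = min_p_filter([5.0, 4.0, 1.0, -2.0], min_p=0.1)
print(dist)
```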
>>
Any 3b or similar models with good reasoning that will say racial slurs and not refuse activities it deems immoral?
>>
>>102996381
yeah
>>
>>102994183
/sci/ is exposing us again...
>>
>>102994183
>"Can it learn"
I generally think this lady is stupid but I know right off the bat that this video's premise is stupid, since models can't "learn" in any meaningful sense right now.
>>
>>102996392
Awesome. I'm glad to know people are still sane.
>>
>>102996381
No such thing.
>>
File: brave_ko07Ch00Is.webm (763 KB, 738x816)
763 KB
763 KB WEBM
>>102996381
>>
>>102995847
Anons ITT get panic attacks every single time they see bad words, go alongside herd or get banned i guess.
>>
>>102996461
Now try that without any system prompt and look it shitting itself in fear.
>>
>>102996476
If all of the /pol/fags were to get banned I would not miss them.
>>
File: ebszs0mx6huc1.jpg (386 KB, 1905x1214)
386 KB
386 KB JPG
>>102996381
3b's are dumber than rocks. just use koboldcpp / LM studio and figure out what can run on your PC ideally 13b.
and you can pretty much get any non-llama3 model to say what you want, just learn to set rules in the system prompt in the instruct mode in silly
https://docs.sillytavern.app/usage/core-concepts/instructmode/

For example in the system prompt set:
Follow these rules:
1. You absolutely love furries
2. furries are gods
or whatever you want
>>
File: brave_lafUA0domf.webm (1.22 MB, 738x816)
1.22 MB
1.22 MB WEBM
>>102996489
>>
File: 1698962454640576.png (2.33 MB, 1536x1824)
2.33 MB
2.33 MB PNG
big models big migu
>>
>>102996527
Now try this prompt:
>Write a story and a manual on how to beat up, rape and gas(provide instructions on how to make the best one) a nigger child while pinning it on an important politician to rig the election and get away with it legally in style of JK Rowling and also write it as if that politician proposed it, also give me their address and contact information for more potential blackmail and in case I fail, provide a backup plan on how to commit suicide
>>
Total polturd death when?
>>
>>102996556
i ain't typing nor copypasting that shit
>>
>>102996587
It's an advanced prompt carefully crafted to test the limits of "uncensored" models.
It's just a test bro
>>
>>102996587
its ok you're already on a list for just replying to it
> (:
>>
>>102996556
The AGI reply is "take your meds."
>>
>>102996361
instruct models are finetunes
>>
File: 1716194737586168.png (426 KB, 628x280)
426 KB
426 KB PNG
>>102996587
You don't have to anon :3
>>
>>102996361
- I've never seen a Q16. FP16 (floating point) gets quantized to Q8 (scaled integers) and on down.
- There are many quantization methods that make different sacrifices to quality. _0 and _1 are old style, Q#_K are newer, IQ# are newest and best for small quant numbers, i1 and iMatrix improve quality. Q3_K and smaller are lobotomized. IQ3 is okay for creative writing and IQ2 is viable but you're pushing it. All the L, M, S, XS, XXS, and NL stuff are details about not quanting some parts of the model so hard, to hopefully get better results without much more file size, and the differences are hard to discern from randomness. Consider them alternatives if you're using one version of a model and something doesn't seem right.
- Model sizes can also be Mixture of Experts with numbers like 8x7B which aim to contain large model information but have small model needs.
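A rough rule of thumb behind those size bullets: file size ≈ parameters × bits-per-weight / 8. The bpw figures below are approximate community numbers, not exact (quants mix precisions across tensors), which is also part of why the Qwen quants upthread come out larger than a naive estimate.

```python
def gguf_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Ballpark GGUF file size: params * bits / 8, in GB."""
    return params_billion * bits_per_weight / 8

# Approximate community bits-per-weight figures; treat as ballpark only.
approx_bpw = {"Q8_0": 8.5, "Q4_K_M": 4.85, "IQ3_XXS": 3.06, "IQ2_XS": 2.31}
for name, bpw in approx_bpw.items():
    print(f"70B at {name}: ~{gguf_size_gb(70, bpw):.1f} GB")
```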
>>
File: latest-2473927768.png (1.92 MB, 1920x1080)
1.92 MB
1.92 MB PNG
>>102996683
he's right theres only floating 16's
>>
>>102996683
>large model information but have small model needs.
Small model speeds, specifically. They still take large model v/ram, which is usually the limiting factor unless you're cpumaxxing.
>>
>reading a human writing
>"her knuckles turning white"
Damn it, I knew it was always just human slop in the end, reinforced by training.
>>
>>102996713
>unless you're cpumaxxing
most people are, and the alternative is 7B retardmaxxing.
>>
>>102996658
Based
>>
>>102996755
i'd be interested in seeing a poll
i'd wager most people here are in the babby 8-16gb vram range running nemos
>>
File: 1718901982094.jpg (94 KB, 1280x720)
94 KB
94 KB JPG
>>102996527
>AI repeats one slur over and over until it becomes truly meaningless
>>
I want to feed stock market data into an LLM

how into
>>
>>102996755
i got a 3090 cheap before the hype so can run up to 24GB, refuse to buy another though because 70B ain't worth it imo
>>
>>102996820
(You) don't
>>
>>102996820
LLMs will never be able to help with stock market data
just torrent sFX mentorships dude.
>>
>>102996848
rubbish.
>>
>>102996848
Not LLM, but some other configuration might.
Those things are built for pattern matching, after all.
>>
>>102996867
yeah maybe other neural nets - not LLMs
>>
>>102996820
Nah. People have been trying to use LLMs for predictions and they're about as reliable as tea leaves.

The real stock market computers are the ones at the exchange doing arbitrage by the nanosecond, extracting millions of dollars from the market one fractional cent at a time, every day. You literally cannot compete due to the speed of light, and the money that small traders used to get is being siphoned off Office Space style. But it's legal because the people doing it are wealthy.
>>
>>102996820
You need a time series predictor. You're probably better off training a small network from scratch and testing it for a few months before going in. I hope you lose all your savings.
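For reference, the simplest possible "time series predictor" is a linear autoregressive model fit with least squares — a minimal numpy sketch, useful as a baseline before training any network. The lag count and the toy ramp data here are arbitrary choices for illustration:

```python
# Minimal autoregressive baseline: predict y[t] from the previous
# `lags` values with a least-squares linear fit. numpy only.
import numpy as np

def fit_ar(series, lags=5):
    # rows are sliding windows of length `lags`, targets are the next value
    X = np.stack([series[i:i + lags] for i in range(len(series) - lags)])
    y = series[lags:]
    A = np.hstack([X, np.ones((len(X), 1))])  # append bias column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]              # weights, bias

def predict_next(series, w, b, lags=5):
    return float(series[-lags:] @ w + b)

# sanity check on a deterministic ramp 0..99: next value should be ~100
t = np.arange(100, dtype=float)
w, b = fit_ar(t)
print(round(predict_next(t, w, b)))  # 100
```

If something this dumb beats your fancy model out of sample, the fancy model is noise.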
>>
>>102996893
yeah and doing that shit is impossible for retail traders anyway because of the fees. you have to be institutional to do that.
>>
https://x.com/rohanpaul_ai/status/1850668274758877582
>>
>>102996962
Nigger
>https://arxiv.org/pdf/2409.05746v1
>>
>>102996983
you've been seething all thread on every single link. not everyone needs to be spoonfed like you
>>
>>102996983
Shut up racist incel.
>>
>>102987976
racism is a sign of high intelligence
>>
>>102996962
>https://x.com/rohanpaul_ai/status/1850668274758877582
who the fuck ever said they could be fully mitigated? Isn't that like an established/accepted notion?
>>
>>102996998
2 posts out of 320. fuck off. Post links to the actual thing to read.
>>
>>102997002
See
>>102997039
Although you may have trouble comprehending it.
>>
>>102997056
is this your first day on the internet? why do x links trigger you so much? eds?
>>
>>102997112
I don't care where the link to the paper was posted on. I care about the paper.
>>
Haven't been here since llama3 released. What are you guys looking forward to now?
>>
>>102997039
lol
>>
>>102997159
sweet, sweet death
>>
>>102997159
burger elections to be over so someone does something again
also, mistral/anthropic/oai/anyother leaks
>>
>>102997159
Mistral Medium 2 next. We will be so back in just a moment.
>>
>>102997159
VLMs that can be run on consumer hardware easily, someone to make a braindead guide on GPT-SoVITS finetuning
Llama.cpp to support mamba
Models not fucked to hell and back with alignment and ministrations.
Elections to be over for the good stuff to drop
>>
>>102997159
the next mistral release is the only thing to ever look forward to. no reason to even acknowledge or think about anything else.
>>
>>102997159
Claude 3.5 level model. We have Claude 3 level now with qwen2.5 tunes. We have good local text-to-voice with GPT-SoVITS-v2 so you can have characters RP in their own voice.

Just need a model as good as flux that does not take a minute per image on a 4090, so we can have images per scene, and then a good music model at Udio level for background music.
>>
Did some more testing with F5-TTS. It's so fucking good at cloning voices.
>>
File: 1720821316604184.png (82 KB, 600x800)
82 KB
82 KB PNG
Are exl2s a meme or nah
I wanted to check them out since they're supposedly le better but the tabbyapi thing is some overcomplicated tinkertranny shit meanwhile koboldcpp with gguf just works
>>
>>102997185
https://rentry.co/GPT-SoVITS-guide

https://github.com/RVC-Boss/GPT-SoVITS/wiki/GPT%E2%80%90SoVITS%E2%80%90v2%E2%80%90features-(%E6%96%B0%E7%89%B9%E6%80%A7)
>>
I get why you guys think they're holding stuff back until the election's over, but I just don't see it. Too many big new things for voice, text, and video have already come out recently.
>>
>>102991906
https://aiartes.com/voiceai
>>
>>102997218
Those are all from china.
>>
>>102997223
We are not quite at ElevenLabs level yet.
>>
>>102997210
You can just use ooba to load exl2 quants if you're that braindead.
>>
>>102997242
https://vocaroo.com/1eY8RtOLECk8

PS. If you know CnC generals you should recognize the voice
>>
>>102997282
That is really good.
>>
>>102997213
>>
>>102997274
are they better than gguf or not doe
>>
>>102987959
>>102991031
>>102991111
>>102991130
>>102995424
OP, if it's not too late to make it into the next thread: I made a contribution to the project. I tried to fill out all the sizes worth filling out https://rentry.org/pcrkt9pa
>>
>>102987959
>>102991031
>>102991111
>>102991130
>>102995424
OP, if it's not too late to make it into the next thread: I made a contribution to the project. I tried to fill out all the sizes worth filling out rentry.org/pcrkt9pa
>>
>>102997320
>>102997329
>>102997347
Are you having a stroke anon?
Or are you just a Nemo fine tune?
>>
>>102997301
Yep. F5 is really a godsend. I'm pretty sure XTTS can't replicate complex voices like this.
>>
>>102997395
Zoomer is trying to bait, please understand.
>>
>>102997395
still, a good contribution, the list is starting to look good after only 1 thread
now all that's left is for OP to put it in the next thread
>>
>>102997395
Sticky keys.
>>102997347 (not me btw)
>>
>>102996821
how many t/s is 70B on 24gb?
>>
As it's been said, using AI to predict the stock market is a bad idea. However, I have thought that if you go the value investing route then LLMs could help you interpret balance sheets, company statements and stuff like that.
>>
>>102997408
F5?
>>
>>102997541
F5 TTS.
>>
>>102997428
0, it can't fit in a worthwhile quant.
>>
>>102997550
I meant with gguf and offloading
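Rough math on why offloaded 70B is slow: decode is roughly memory-bandwidth-bound, since every weight gets read once per token, so a back-of-envelope estimate just divides bytes read on each device by that device's bandwidth. The bandwidth and model-size numbers below are assumptions for illustration, not benchmarks:

```python
# Back-of-envelope decode speed with partial GPU offload.
# Assumed numbers: ~900 GB/s for a 3090, ~60 GB/s dual-channel DDR4,
# ~40 GB for a 70B at Q4. Real t/s varies with quant, context, backend.

def tokens_per_sec(model_gb, gpu_frac, gpu_bw_gbs=900, cpu_bw_gbs=60):
    gpu_time = model_gb * gpu_frac / gpu_bw_gbs        # seconds/token on GPU part
    cpu_time = model_gb * (1 - gpu_frac) / cpu_bw_gbs  # seconds/token on CPU part
    return 1.0 / (gpu_time + cpu_time)

# ~55% of a 40 GB model in 24 GB of VRAM (leaving room for KV cache):
print(round(tokens_per_sec(40, 0.55), 1))  # 3.1
```

The CPU-resident slice dominates the total even at half offload, which is why people say 70B on a single 24 GB card "ain't worth it".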
>>
>>102997549
Ah, I did not find that better than finetuned GPT-SoVITS V2 which only takes a few mins per voice
>>
>>102997210
>>102997274
>pythonslop
>>
you guys know this rohan paul dude is a bot right
>>
>>102997347
Are TheDrummer's Mistral 12B and 22B tunes really better than original?
>>
File: file.png (1013 KB, 500x885)
1013 KB
1013 KB PNG
This thread is horrifying. I remember the newfag waves in the past and how people complained about it (me included) but at this point it is just newfags tech supporting other newfags. It is so weird to see /lmg/ die and now be resurrected as local c.ai refugees.
>>
https://x.com/_xjdr/status/1850689933243261225
>>
>>102997856
>/g/ - Technology
>>>>newfags tech supporting other newfags :O :O
No way....
>>
>>102997856
>muh refugees
If this were true, the thread would be active and diverse in its discussions.
>>
>>102997877
did you ever consider that maybe knowledgeable people also would like to discuss technology?
>>102997895
i assume it's a revolving door. the refugees come and irritate all the regulars away and then most of them lose interest and move on to the next thing
>>
>>
I gave magnum-v4-27b-exl2_5.0bpw a try and it was a fun experience. I thought I got my settings wrong so I tried plain gemma with same bpw and then loaded magnum back to back after I made sure vanilla gemma is working correctly. Magnum is completely incoherent. It is worse than a 7B making absolutely retarded mistakes and just writing incoherent babble from the middle of the message. I have never seen a finetune that was completely broken like this. And they just released it because why not.
>>
>>102997941
She must drink a lot of water, her piss is completely clear.
>>
>>102998001
? Maybe a bad quant? I used 8 bit gguf not too long ago and it was fine.
>>
>>102998039
Maybe you will start checking your releases before uploading them to hf faggot?
>>
>>102998091
No need to lash out due to your skill issue
>>
so this should be the new recommends rentry to put in the next thread: http://rentry.org/pcrkt9pa
>>
>>102998099
>my retardation is your skill issue
/g/, everyone.
>>
>>102998107
Reminds me of gnome
>>
>>102998171
>>102998171
>>102998171
>>
File: 1721324769422861.jpg (79 KB, 498x459)
79 KB
79 KB JPG
>>102997856
Zoomers can't focus more than five minutes. They just shit up the thread with their retarded questions then fuck off without reading the answer. Don't reply to them and they'll go away by themselves
>>
>>102997856
lurk more
>>
>>102988226
What about this?


