/g/ - Technology


/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103077338 & >>103066795

►News
>(10/31) QTIP: Quantization with Trellises and Incoherence Processing: https://github.com/Cornell-RelaxML/qtip
>(10/31) Fish Agent V0.1 3B: Voice-to-Voice and TTS model: https://hf.co/fishaudio/fish-agent-v0.1-3b
>(10/31) Transluce open-sources AI investigation toolkit: https://github.com/TransluceAI/observatory
>(10/30) TokenFormer models with fully attention-based architecture: https://hf.co/Haiyang-W/TokenFormer-1-5B
>(10/30) MaskGCT: Zero-Shot TTS with Masked Generative Codec Transformer: https://hf.co/amphion/MaskGCT

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: 023a3def6f9.png (1.76 MB, 1024x1024)
--- A Measure of the Current Meta ---
> a suggestion of what to try from (You)

[NEW] VRAMCHAD / CPUMAXX
- Arki05/Grok-1-GGUF-Q8_0

96GB VRAM
- TheBloke/goliath-120b-GGUF-Q5_K_S

64GB VRAM
- BeaverLegacy/Moist-Miqu-70B-v1.1-GGUF-Q6_K
- TheBloke/Noromaid-v0.4-Mixtral-Instruct-8x7b-Zloss-GGUF-Q8_0

48GB VRAM
- TheBloke/KafkaLM-70B-German-V0.1-GGUF-Q4_K_S
- bartowski/Llama-3.1-70B-ArliAI-RPMax-v1.2-GGUF-Q4_0

24GB VRAM
- MikeRoz/ArliAI_Mistral-Small-22B-ArliAI-RPMax-v1.1-6.0bpw-h6-exl2

16GB VRAM
- TheBloke/MythoMax-L2-13B-GGML-Q8_0
- llama-anon/petra-13b-instruct-gguf

12GB VRAM
- TheBloke/PiVoT-0.1-Evil-a-GGUF-Q8_0
- anthracite-org/magnum-v2-12b-exl2/tree/6.0bpw
- rombodawg/Rombos-LLM-V2.6-Qwen-14b-Q5_K_M-GGUF

8GB VRAM
- ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.1-GGUF-Q4_K_S
- meta-llama/Llama-Guard-3-8B-Q8_0
- Qwen/Qwen2.5-0.5B-Instruct-GGUF-fp16

[NEW] <8GB VRAM
- NikolayKozloff/AMD-OLMo-1B-Q8_0-GGUF
- BeaverLegacy/cream-phi-2-v0.2-Q8_0
- LeroyDyer/SpydazWeb_AI_HumanAI_007-Q4_K_M-GGUF
- Lewdiculous/Erosumika-7B-v3-0.2-GGUF-IQ-Imatrix-IQ4_XS

Potato
- 'ick on the 'eck

> Warning: Disregard any other recommendation posts as there have been numerous impersonations popping up. Keep it safe like safetensors!
>>
File: ComfyUI_06237_.png (516 KB, 720x1280)
►Recent Highlights from the Previous Thread: >>103077338

--Paper: Anon shares papers on MetaMetrics-MT and Lorentz-Equivariant Transformer, discusses academic pursuits:
>103078982 >103080460
--Paper: Does your LLM truly unlearn? An embarrassingly simple approach to recover unlearned knowledge:
>103081745 >103082093
--Papers:
>103079253
--CPU-only setup for large language models, RAM and bandwidth considerations:
>103081277 >103081442 >103081589 >103081621 >103082765 >103083572
--o1 and Grok model discussion, AI accessibility and capabilities:
>103084708 >103084903 >103085111
--Troubleshooting llama.cpp VRAM usage issues:
>103085893 >103085917 >103086075 >103086163 >103086157
--Slow progress in integrating local AI models with other programs:
>103078702 >103078732 >103082316 >103078793
--QTip model and its lack of adoption:
>103077726 >103077786 >103077825 >103077850
--Models and techniques for long-context summarization:
>103079112 >103079133 >103079901 >103079967 >103080119 >103080230 >103080750 >103079886
--Llama.cpp TTS development and discussion:
>103086087 >103086117 >103086168 >103086261 >103087377 >103087835
--Dealing with character repetition and context limits in long chat sessions:
>103087592 >103087746 >103087844
--Anon wants to build a device that generates images from spoken descriptions:
>103082877 >103082939 >103083371 >103083405 >103083431 >103086441
--Anon suspects censorship in AI model's behavior:
>103088416 >103088436 >103088441 >103088545
--Troubleshooting inconsistent GPU performance with text-generation-webui:
>103085221 >103085595 >103087253
--New Chinese AI model "hunyuan-standard-256k" spotted on LMARENA:
>103085140 >103085186 >103085206 >103085224
--Miku (free space):
>103077348 >103084372 >103084484 >103085637 >103089138 >103090042 >103090372

►Recent Highlight Posts from the Previous Thread: >>103077342

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
>>103090417
thx for recap, sucks that the explanation link broke. likely you'll be able to link the explanation when someone bitches about it again kek
>>
>>103090416
>no Falcon-140B
Go suck Elon's dick
>>
>>103090416
>[NEW]
>Erosumika
are you on drugs or just pasted the wrong thing?
>>
>>103090502
bro that whole post is bait
>>
>>103090416
>No Pygmalion 6b
shit list
>>
File: spoon2).jpg (122 KB, 1024x1024)
>>103090416
Based update, some variety in the models finally. Was worse when it was all fucking qwen
>>
>>103090524
don't be cruel anon pygmalion is just a troll to confuse newfriends and scare them off of local, no one is running that 6b dinosaur these days. this is a *meta* list, not a history museum
>>
Kill yourself.
>>
Touch yourself.
>>
Take care of yourself.
>>
Keep yourself safe.
>>
>Tuesday
It's time!
>>
Which model maker supports Khalistan?
>>
File: lmg_text.png (41 KB, 804x762)
I don't know why I made this.
>>
File: guts.jpg (47 KB, 460x457)
>>103090416
>TheBloke/MythoMax-L2-13B-GGML-Q8_0
>>
>>103090646
>Tuesday
And everything is right with the world.
>>
>>103090709
masterpiece, best quality

There's the image for the next OP right there
>>
File: 002991.png (1.93 MB, 1680x960)
>>
Unlocking the Theory Behind Scaling 1-Bit Neural Networks
https://arxiv.org/abs/2411.01663
>Recently, 1-bit Large Language Models (LLMs) have emerged, showcasing an impressive combination of efficiency and performance that rivals traditional LLMs. Research by Wang et al. (2023); Ma et al. (2024) indicates that the performance of these 1-bit LLMs progressively improves as the number of parameters increases, hinting at the potential existence of a Scaling Law for 1-bit Neural Networks. In this paper, we present the first theoretical result that rigorously establishes this scaling law for 1-bit models. We prove that, despite the constraint of weights restricted to {−1,+1}, the dynamics of model training inevitably align with kernel behavior as the network width grows. This theoretical breakthrough guarantees convergence of the 1-bit model to an arbitrarily small loss as width increases. Furthermore, we introduce the concept of the generalization difference, defined as the gap between the outputs of 1-bit networks and their full-precision counterparts, and demonstrate that this difference maintains a negligible level as network width scales. Building on the work of Kaplan et al. (2020), we conclude by examining how the training loss scales as a power-law function of the model size, dataset size, and computational resources utilized for training. Our findings underscore the promising potential of scaling 1-bit neural networks, suggesting that int1 could become the standard in future neural network precision.
good news for bitnetbros
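for anyone wondering how you even backprop through {−1,+1}: the usual trick in the BitNet line of work is, roughly, to binarize on the forward pass and let gradients flow straight through to latent full-precision weights. A minimal sketch, not the paper's exact formulation:

import torch

def binary_weight(w: torch.Tensor) -> torch.Tensor:
    # forward: sign(w) scaled by the mean magnitude (absmean-style scaling)
    # backward: straight-through estimator, so the full-precision latent
    # weights receive gradients as if no quantization had happened
    scale = w.abs().mean()
    w_q = torch.sign(w) * scale
    return w + (w_q - w).detach()

# inside a linear layer's forward: y = x @ binary_weight(self.weight).T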
>>
>>103090878
nani?
>>
>>103090878
i like the purple
>>
https://huggingface.co/tencent/Tencent-Hunyuan-Large
oh shit the chinamen be cookin
>>
>>103091030
>389B 52A
Why do they do this to us?
>>
>>103091043 (Me)
Also theoretically I should be able to eke a couple of tokens/second out of it when gguf support drops. This model will receive a Nala test.
>>
>>103091043
A 24 channel dual Epyc DDR5 build would run this at about 20 tok/s at q8 (about 1TB/s aggregate bandwidth, divided by 50GB worth of active parameters per token)

The CPUmaxxxers were right...
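back-of-envelope for the skeptics (theoretical ceiling only, real decode speed usually lands well below peak bandwidth):

# decode speed ceiling for a memory-bandwidth-bound MoE at q8_0
bandwidth_gb_s = 1000      # ~1 TB/s aggregate, 24-channel dual Epyc DDR5
active_params_b = 52       # billions of active parameters per token
bytes_per_param = 1.06     # q8_0 is ~8.5 bits per weight
active_gb = active_params_b * bytes_per_param
print(bandwidth_gb_s / active_gb)  # ~18 tok/s upper bound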
>>
>>103091093
I mean anyone with patience and enough RAM should be able to play with it. 52A isn't that much. My shitty 1st gen Epyc can squeeze 1.5-2 token/sec out of 70B, so I should get 2-3 on this thing.
But yeah if this is good then CPU Maxxers fucking won.
>>
Will be interesting to see the third-party benchmark results. Hopefully it's not another filtered model like Qwen.
>>
>16 experts
Damn, so probably can't prune it without significant losses.
>>
>even if I try 2 bit, I can't fit it in
ACK
>>
>>103091043
What are you asking for? The world doesn't need yet another mediocre 70B model.
>>
>>103091138
yo, listen to this jam. its more sexual than any rollerskating anime girl.

https://www.youtube.com/watch?v=k85mRPqvMbE
>>
File: 1727399640897676.png (187 KB, 1495x1730)
>>103091030
>389 billion parameters
>insane benchmarks
machine learning is such a meme, it's just "stack moar layers" to win
>>
>>103091145
The Bitter Lesson is undefeated
http://www.incompleteideas.net/IncIdeas/BitterLesson.html
>>
File: Untitled.png (790 KB, 1080x2241)
Context Parallelism for Scalable Million-Token Inference
https://arxiv.org/abs/2411.01783
>We present context parallelism for long-context large language model inference, which achieves near-linear scaling for long-context prefill latency with up to 128 H100 GPUs across 16 nodes. Particularly, our method achieves 1M context prefill with Llama3 405B model in 77s (93% parallelization efficiency, 63% FLOPS utilization) and 128K context prefill in 3.8s. We develop two lossless exact ring attention variants: pass-KV and pass-Q to cover a wide range of use cases with the state-of-the-art performance: full prefill, persistent KV prefill and decode. Benchmarks on H100 GPU hosts inter-connected with RDMA and TCP both show similar scalability for long-context prefill, demonstrating that our method scales well using common commercial data center with medium-to-low inter-host bandwidth.
From Meta
>>
>>103091151
yeah, that's precisely why I decided to not make a career in AI, it's just a numbers game and I hate that
>>
>>103091030
well
MoE-I2: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition
https://arxiv.org/abs/2411.01016
>The emergence of Mixture of Experts (MoE) LLMs has significantly advanced the development of language models. Compared to traditional LLMs, MoE LLMs outperform traditional LLMs by achieving higher performance with considerably fewer activated parameters. Despite this efficiency, their enormous parameter size still leads to high deployment costs. In this paper, we introduce a two-stage compression method tailored for MoE to reduce the model size and decrease the computational cost. First, in the inter-expert pruning stage, we analyze the importance of each layer and propose the Layer-wise Genetic Search and Block-wise KT-Reception Field with the non-uniform pruning ratio to prune the individual expert. Second, in the intra-expert decomposition stage, we apply the low-rank decomposition to further compress the parameters within the remaining experts. Extensive experiments on Qwen1.5-MoE-A2.7B, DeepSeek-V2-Lite, and Mixtral-8×7B demonstrate that our proposed methods can both reduce the model size and enhance inference efficiency while maintaining performance in various zero-shot tasks.
https://github.com/xiaochengsky/MoEI-2
Empty repo currently. They only tested on smaller MoE models and the finetuning step might be doing most of the work for the benchmark results. will be interesting to see how it works on the larger models
>>
>>103090709
lol
>>
bros today is the day
>>
>>103090416
Based
>>
File: 1702550404617999.png (19 KB, 465x168)
what is this gay SHIT
>>
>>103091298
":("
aw well, still better this than an actual emoji.
>>
>>103091298
It's a new architecture that also relies on remote code to operate. I doubt we'll see gguf support in any reasonable timeframe.
>>
>>103091313
Fuck sakes are those chinaman still trying to pull that one?
>>
>>103091145
Not untrue but a bit of an exaggeration there, no? I doubt they would've achieved these scores if they didn't have the right data mix carefully put together. Obviously just having more computation is the major decider in how good your model will be, but data mix and the fine details of training still play a part, otherwise every huge model would be good on benchmarks, but that's not true, and a lot of large models "underperform" for their size. Qwen at 72B beats Grok 2, Claude Opus, Mistral Large, GPT-4 (non-o), and a bunch of other large models on Livebench.
>>
>>103091350
What one?
If you're referring to remote code, it literally just refers to code outside of your python modules, i.e. the script files that are included in the model repo. The alternative would be requiring a custom fork of one or more of the modules. So it's the better way of doing things.
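concretely it's just the trust_remote_code flag in transformers, which executes the modeling .py files shipped inside the repo (repo id here is just the one from the thread, exact layout may differ):

from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "tencent/Tencent-Hunyuan-Large"
tok = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo, trust_remote_code=True)
# the custom classes are loaded from the repo's own *.py files, not from
# the transformers package itself, so worth reading them before trusting them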
>>
>>103091298
>>103091310
>their company name is a unicode emoji that they force everyone writing docs or articles to awkwardly use instead of "huggingface"
>in their text they just use emoticons
I hate these people
>>
>>103091354
>Qwen at 72B beats Grok 2, Claude Opus, Mistral Large, GPT-4 (non-o), and a bunch of other large models on Livebench.
That's an indictment of the benchmarks and nothing else.
>>
>>103091030
>base model 256k context
wonder how much is actually usable
>>
Sam Altman says he would love to see an AI that can understand your whole life and what has surprised him in the past month is "a research result I can't talk about, but it is breathtakingly good"
https://x.com/tsarnick/status/1853543272909775038
>>
>>103091426
I wouldn't trust Sam Altman as far as I could throw him. He might have something he might not, but whatever it is you should taper your expectations
>>
>>103091383
You literally used Hunyuan's benchmarks as an excuse to whine that stacking more layers is all that's needed to win. So I also talked in the context of benchmark scores.
If you now want to talk about real world performance, then actually it's still true. It's not a controversial view. Just because you add more layers to a model and train for more epochs, it might still be worse than a competitor's model that has fewer parameters and trained for less time. And this has actually happened multiple times. Falcon, new CR+, DBRX, Snowflake, probably a bunch of others I'm forgetting. Something was obviously wrong with them and it wasn't with the number of layers.
>>
>>103091426
Sam Altman alternates between :
- Our next AI is so dangerous it's alive and it's like nuclear weapons. This is why all AI except ours should be under strict regulation.
- Our next AI is so incredible it can... oh sorry I can't say more it's a secret project haha.

This guy is 90% hype and it's so annoying.
>>
>>103091556
Bloom is still my go-to example of size not being enough if your technical staff are retards and your dataset sucks. 175B and it was dumber than models a tenth of its size (and at a time when small models were themselves very bad, so that was really saying something)
>>
>>103091574
>investors, keep funding us please
>>
>>103091578
>>103091556

When people say bigger is better there's obviously an unspoken "all other things being held equal" caveat. It's autistic of you to expect that they verbalize that caveat every time.
>>
>>103091590
You didn't just say "bigger is better". Specifically what you said was

>389 billion parameters
>insane benchmarks
machine learning is such a meme, it's just "stack moar layers" to win


Any person would read that as a criticism against the field that nothing but size matters. If you didn't want that post to be interpreted that way and you didn't mean that, you should've worded it a bit differently. No one knows who you are on an anonymous basket weaving forum so they're not going to know you knew better.
>>
>the year is 2024
>there is still no Hunyuan-Large support in llama.cpp
And to think, there are really folks out there who're trying to tell us it isn't over.
>>
>>103091760
Yes yes it's over, now take your meds.
>>
>>103091778
In this economy? We all rationing out here.
>>
File: lmg full.png (72 KB, 2412x2286)
>>103090709
>>
>>103091426
hello mr ai man we would really love an ai that knows exactly what our customers want and need and knows exactly what they think so we can make sure they dont think the wrongthing™
>>
Commit suicide. But for real.
>>
>>103092022
who did you mean that for?
>>
>>103092148
He meant that for me.
>>
>>103092197
you're so vain it was clearly meant for me
>>
im going to touch you
>>
>>103091426
elon of AI
>>
>>103092148
If he aims it at random person in /lmg/ there is a 90% chance he hits a newfag so it is a pretty good idea.
>>
>>103091251
Meh, all these MoEs are way too tame.

Google as usual is still ahead of the pack; individual neurons as experts, as in PEER, is the way forward. Also, for local specifically we need a much sparser active/total ratio, something like 1:100. For a cloud farm it's a needless hit to performance, but for local it would allow streaming big models from SSD.
>>
>>103090416
u fixed her tie <3
>>
Is there any point to Hunyuan when Grok 2 will be out in a few weeks as the new king of MoE?
>>
>>103092516
>Grok 2 will be out in a few weeks as the new king of MoE?
10/10 bait considering current state of /lmg/.
>>
>>103092516
>Chud 2
No thanks, i want my AI to be actually intelligent and safe.
>>
>>103092890
Safe from what?
>>
>>103092911
From anything that isn't properly aligned with the going forward of society toward equity.
>>
>>103092890
>intelligent and safe
I realize your post is bait, but still feel a need to point out that "an intelligent and safe AI model" is an oxymoron.
>>
File: 1730062226423839.png (902 KB, 546x680)
>>103092925
No, the intelligence is left-leaning so it makes sense to do safe AI, racism and bigotry only does bad for actually intelligent AI systems.
>>
>>103092974
the image looks like burning buildings
>>
>>103091030
New translation SOTA? I trust the chinks!
>>
>>103092990
Symbolism is always intentional.
>>103092974
Reddit is a product shilling forum, StackOverflow with no standards or quality control, and a liberal echo chamber rolled into one. What intelligence?
>>
File: 00018-1942288197.jpg (197 KB, 832x1216)
>>
File: 1730809045154.jpg (388 KB, 1080x1963)
>>103091030
>>103093002
Oh, there's a demo: https://huggingface.co/spaces/tencent/Hunyuan-Large

Looks like one-shot isn't looking that good...
>>
File: 1730809313170.jpg (188 KB, 1080x639)
>>103093057
oh no no no no
>>
File: 00051-4238667817.jpg (181 KB, 1216x832)
>>
File: 1730809626255.jpg (560 KB, 1080x1670)
>>103093057
>>103093082
damn this is even worse than Qwen 2.5 72B
>>
>>103093023
me on the top
>>
>>103093082
At least now we vramlets don't need to feel any fomo on this one.
>>
newfag here is it possible to setup mistral 12b with a 3060 or should i choose a different model or just simply give up on setting up one
>>
>>103093159
yeah
>>
>>103093159
Yes.
I use quanted rocinante v1.1 on 8gb of vram with most layers offloaded.
>https://github.com/LostRuins/koboldcpp/wiki#quick-start
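if it helps, a launch line looks something like this (model filename is just an example; flag names as of recent koboldcpp builds, check --help):

python koboldcpp.py --model Rocinante-12B-v1.1-Q4_K_M.gguf --gpulayers 99 --contextsize 8192
# drop --gpulayers down from 99 until it stops OOMing on your card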
>>
https://www.youtube.com/watch?v=KUwk5Hd_IRQ
How can Neuro-sama be this funny and witty? I watch some compilations every once in a while and it just seems to get better every time. How does this work? Does Vedal just go through streams and retrain the model with cherrypicked data from streams every once in a while? Perhaps it's even automated? I heard someone mention that Neuro has good memory, which would mean that streams are added to the training data set, but probably filtered with some kind of logic.
>>
Any fast and local TTS that blows XTTS-v2 out of the water yet?
>>
>>103093171
>>>/vt/
>>>/lgbt/
>>
File: 122344576797890.png (5 KB, 422x143)
i don't understand why i need to ask a kind jewish person for a "certificate" to use end-to-end encryption
let's encrypt seems to want a domain to issue a certificate but i'm using a straight ip
is there any significant risk to simply connecting to my inference server via http?
the game will run on webgl
>>
>>103093175
What's wrong with xttsv2?
I was using Tortoise for a while and some anon said that xttsv2 is the next gen of Tortoise so I was thinking about trying it out.
>>
>>103093183
You can create a self-signed certificate with a private key.
At least I remember doing that for a couple of java applications a couple of years back.
You might still need a domain, however.
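actually for a bare IP you can skip the domain by putting the IP in the cert's subjectAltName. A sketch (needs a reasonably recent openssl for -addext; the IP is a placeholder):

# self-signed cert valid for a raw IP, no domain or CA involved
openssl req -x509 -newkey rsa:4096 -sha256 -days 365 -nodes \
  -keyout key.pem -out cert.pem \
  -subj "/CN=192.168.1.50" -addext "subjectAltName=IP:192.168.1.50"

Browsers will still warn since nobody signed it, but the traffic is encrypted.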
>>
>>103093171
take away the voice and avatar and focus only on input/output
neuro is pretty retarded, schizo and regularly invalidates even the preceding statements.
you're perceiving this as a pro
>>
File: specaug.png (243 KB, 1137x268)
how does data augmentation improve training? im reading about specaug for speech recognition. if i understand it, they randomly block out some of the time/frequency information for the spectra and that means the quality of training is improved? how the fuck does that work?
like, if you have 100 recordings (or whatever), and you take 50 of those, augment them, and put them back in the set, yeah you've got """150"""" samples now, but 50 of them are basically duplicates of stuff already in your training set.
i don't get it.
>>
>>103093230
This would indicate that it lacks even some basic data that a 7B would have, and is also trained on a lot of nonsensical data = absurd humor from streams.
>>
>>103093243
You already train stuff with many epochs, so duplicating data isn't that bad.
And it's obvious why the quality improves, the model learns to self-correct, which is important because the model will inevitably make mistakes.
>>
File: help.png (121 KB, 800x600)
>>103093202
help
>>
This is actually really smart, why don't we have something like this for local models?

https://platform.openai.com/docs/guides/latency-optimization#use-predicted-outputs
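the trick behind predicted outputs is basically speculative decoding where the user's text is the draft: run one forward pass over the draft, keep the prefix the model agrees with, and only generate from the first disagreement. A hedged sketch of the verification step for an HF causal LM (greedy only, function name is mine):

import torch

@torch.no_grad()
def verify_draft(model, input_ids, draft_ids):
    # one forward pass over prompt + draft tokens
    full = torch.cat([input_ids, draft_ids], dim=-1)
    logits = model(full).logits
    # logits at position i predict token i+1, so slice out the draft region
    preds = logits[:, input_ids.shape[-1] - 1 : -1].argmax(-1)
    # accepted prefix = the run of positions where model and draft agree
    accepted = int((preds == draft_ids).long().cumprod(-1).sum())
    return draft_ids[:, :accepted]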
>>
>>103093243
neural networks don't know how exactly you want them to compute the answer, and will often try to "cheat" by recognizing each sample in the training set and memorizing the answer instead of recognizing the patterns, but they can't memorize a very large training set, and augmentation still makes it more difficult to recognize the original samples even though it's obviously not as good as real data
additionally, if you want the model to work even in the presence of that kind of noise or defects, augmentation teaches the model to do exactly that
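for reference, the specaug masks themselves are trivial; a minimal numpy sketch (mask widths are illustrative knobs):

import numpy as np

def spec_augment(spec, max_f=8, max_t=20):
    # spec: (freq_bins, time_steps) log-mel spectrogram
    out = spec.copy()
    f = np.random.randint(0, max_f)               # frequency mask width
    f0 = np.random.randint(0, spec.shape[0] - f)
    out[f0:f0 + f, :] = 0.0                       # blank a band of frequencies
    t = np.random.randint(0, max_t)               # time mask width
    t0 = np.random.randint(0, spec.shape[1] - t)
    out[:, t0:t0 + t] = 0.0                       # blank a span of time steps
    return out

since the masks are random, each epoch the model sees a differently-masked copy, so the "duplicates" aren't actually identical samples.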
>>
>>103093304
it seems that setting the validity time to 999999999 days was the reason
>>
>>103093277
>>103093329
interesting. how much can you do? i guess there are probably diminishing returns. like, if you have 10x as many augmentations as genuine data, the improvement in quality is probably not worth the training time compared to getting more real data.

i dunno. it's interesting. i can understand it from an "avoiding overfitting" angle for sure.
>>
File: Nala test Hunyun-Large.png (37 KB, 1377x348)
>>103093057
Did a bit of prompt injection in order to do a Nala test on it
Not looking great. If this were human on human it would be alright but the anthropomorphization is off the charts.
Definitely worse than just about every 70B option.
>>
>>103093364
>not worth the training time
non-generative models are usually quite small compared to llms and diffusion models
the largest yolov9 is a 58m
>>
>>103093201
it's not bad, but lacks emotion variability and is prone to strokes. Sometimes it just ignores some parts of the prompt and it can't handle long sentences. Some voices also work much better than others.

>>103093205
Sounds decent, but all fish output I've heard still has some pretty prominent robotic artifacts. You can't nudge it towards specific emotions through the prompt though, right? I tried bracketed instructions and emojis, but that didn't work and I didn't find anything about it either. Seems more robust than XTTS though.

I hope we'll have a UI similar to vocaloid someday where you can first gen an output and then color specific portions with different emotions or change the intonation and pronunciation while leaving the rest mostly unaffected.
>>
File: the power of china.png (116 KB, 904x1451)
>>
>>103093520
Chinese glowies going to break down your door son
>>
>>103093520
Not being able to talk about Tienanmen square is about as expected as ChatGPT or Claude refusing to acknowledge that women can't have dicks, our propaganda is actually much more unhinged than theirs.
>>
>>103093520
Based model realizing that it meant to say Cee Plus Plus and self correcting
>>
>>103093423
how much augmentation is too much? i mean it seems like there is no reason not to have some augmented data. there are so many ways you can fiddle with the dials and make "new" stuff.

do you have to just figure it out empirically? i would imagine if you have enough "fucked up" data, it's going to eventually expect data to be slightly fucked up, and accuracy will decrease when testing.
>>
How is Qwen 2.5 so good when every other chinese model is so blatantly trained on benchmarks and far below their competitors? It seemed to come out of nowhere and become SOTA at every fucking weight class.
>>
>>103093584
Yi excelled at uniformly processing vast amounts of data, learning efficiently from the provided examples.
>>
>>103092974
This but unironically.
>>
>>103093695
Tell me what happened at tiananmen square in 1989
>>
WHERE ARE THE NEW MODELS I HAVE BEEN PROMISED?!
>>
>>103090878
Distant Teto
>>
>>103093809
https://vocaroo.com/1heEUJvHOXvz
>>
>>103093817
They're in the back of my van. Want to see?
>>
>>103093809
Funnily enough, I'm pretty sure Yi does answer that question, at least when asked in Englush.
>>
>>103093809
抱歉,我无法回答。(Sorry, I can't answer that.)
>>
So tried to install Fish tts. Install didn't work properly. Something something replace function not found for something. Error regarding regex, some shit about bfloat16 not available on my card, etc.

Now installing MaskGCT, which seems to have more natural sound from demo.
>>
File: 1723150428076077.png (281 KB, 853x480)
>>103093899
Normalfags... retarded
>>
>>103093919
Nah, it's just typical chinese tts stuff. Always failing at basic install. It's not even funny how often they fail to make a basic install work properly
>>
File: 1704575559110095.png (1.3 MB, 934x738)
>>103093899
Yes fish-speech is pain to install.
Install it through pinokio.computer instead.
>>
>>103093937
or clone the hf space
>>
>>103093937
I hate these extra special programs that are not self contained. They leave little bits of data all over the appdata shit. You have a god damn folder that you're running in. Fucking use it.
>>
>>103094006
I was cloning the official hf app version. It didn't like it
>>
>>103093937
It can hang on the cuda libs download btw, abort it and try again, always works on the second try. Aside from that it's a pretty useful thing for retards like me.
>>
It is actually ok that no new models are dropping today because by the time they would get quant support we will already have some even newer models.
>>
>>103093899
>MaskGCT
~10GB in downloaded models of various sizes already.

Takes fuck ton of time on my 2070. 10+ minutes to generate a single sentence. This needs some serious speed optimization.
>>
File: waifu.png (2.22 MB, 850x1201)
>see Teto
>click thread
>type in 'Luv' me Teto'
Luv' me Teto.
Simple as.
>>
Hey guys I'm looking at a list of gguf files, with the file sizes clearly stated. I know exactly how much VRAM I have but somehow I can't figure out which one to download.
>>
>Model suddenly unable to retrieve data from Rag anymore
Is this a known thing?
My chat is around 40k tokens and the context window is 28k.
I've done this before and didn't have a problem using data from rag in similar length chats, this doesn't make sense. (I'm using Ollama and openwebui)
>>
>>103093899
Same problem.
File "/usr/lib/python3.10/html/__init__.py", line 19, in escape
s = s.replace("&", "&amp;") # Must be done first!
AttributeError: 'BackendCompilerFailed' object has no attribute 'replace'

What is this shit?
>>
>>103093584
>out of nowhere
Tell me you're a newfag without telling me you're a newfag
>>
File: 1730821983832.jpg (240 KB, 1080x1554)
>>103090412
Ayo it's November 5th. Miku promised something would happen, and it's focking teto Tuesday.
>>
>>103094516
It's better this way. I'm sick of hearing about that fucking election.
>>
>>103094526
Hopefully this will be the last moment of the American clownshow.
>>
>>103094526
Either Q-tard MIGA boomers are going to have the meltdown of the century or Reddit neoliberal idiots who define fascism as "Somebody I don't like being more popular than me." are going to have the meltdown of the century.
Either way it's win/win if you like watching people have meltdowns.
>>
>>103090416
Just to clarify, are these suggestions all for RP or are these general purpose suggestions?
>>
>>103094586
I'm pretty sure those are most troll/meme suggestion. One anon did a list that at least seemed like an honest attempt, but that one has shit like llamaguard and fucking llama2 based models.
>>
File: 1708755637902896.jpg (96 KB, 648x647)
>>103094621
oh ok thank you anon
>>
>>103094557
for me it's the brown zoomers who try to signal they don't care at all
>>
File: 1730535303866816.jpg (73 KB, 538x679)
>>103094674
>>
>>103094557
>neoliberal
wrong term; reddit is modern north american liberal which means anti free speech, anti free trade, anti individual rights, basically the opposite of what liberal traditionally meant
neoliberal in contrast means hyper capitalist, deregulation, small government, personal freedom
>>
>>103093566
the difference is that you can say that without getting disappeared
in china you'll unironically get at least a visit from the local police station
>>
>>103094830
Wow you're racist.
>>
>>103094988
People have been arrested and jailed for posting anti tranny stuff in the west
>>
>>103095025
lol
>>
>>103095025
>things that never happened
Y'all want to be victims so bad!
>>
>>103095065
The usual tranny tactics

>it didn't happen
>so what if it happened, it wasn't big deal
>so what if it was a big deal, you deserve it
>its nothing new, this is normal
>>
>>103095025
Not the entire west, just in Europe, for anti-imigration posts. Because they only have access to Freedom of Speech Lite™
>>
>>103095121
True, it's just easier to talk about the west as a blob since there's so much cultural spillover in the information age
>>
>>103095121
Most western countries have created laws that would allow them to jail you for anything labeled "hate speech" now. This was all created within the last 2-3 years because of Musk's buyout of twitter. Their loss of control over the propaganda tool meant they had to force Musk into compliance.
>>
>>103093937
Tried pinokio, it worked. But its so dang slow. 30+ seconds just to generate 3 words. On the other hand, the maskgct took like 6 minutes to generate 3 words. F5 tts (still the best) took ~5 seconds to generate a full sentence. Xtts takes ~2 seconds.

RTX 2070.
>>
>2nd 3090 arrived
>didn't have enough pcie 8-pin cables for my 1000w psu
>could buy cable for £65, or buy 1500w psu with enough cables for £200
Thanks for reading my blog.
>>
>>103095200
But making your computer write "barely above a whisper" faster is a bargain at any price.
>>
>>103095225
Makes me all tingly thinking about it.
>>
>>103090646
I will take those hands, then one of us will be pulled through.
>>
>>103095200
Anonymous pondered the question, biting her lip thoughtfully.
>>
>discuss networking topologies in claude ui
>ask it to generate a diagram
>it just does it via artifacts
>meanwhile in local we have a bunch of different frontends that do the same fucking thing
when are we getting on their level?
>>
File: file.png (765 KB, 768x768)
sexiest of hair ties
>>
>>103095499
When some random autist gets involved. The discordfags and silicon valley grifters are incompetent.
>>
File: file.png (53 KB, 1763x334)
>>
>>103095499
Never.
>>
Anthropic's product team has an easier time because their model is strong, unlike local. Good luck getting artifacts to work on llama3-3b, or let's be real, llama3-405b
>>
>>103093110
eyes are getting better good job anon
bonus points for pose action over 1girl poses anyday
>>
>>103094621
I dunno man the last poster seemed really low effort and used fucking dalle of all things to promote local, reads like astroturfing
>>103092465
they fixed the text and buildings too, so kino
>>
The Chinese did it again. 405B but fast now.
>>
Since the newfriends have had some days by now to learn how to install koboldcpp, what is everyone's favorite model?
>>
>>103096206
and also worse in everything except memebenches
>>
>>103096231
405B / sorcerer 8X22B. But the new Chinese model seems interesting so far.
>>
>>103096254
Are benchmarks not an indicator of real world performance?
>>
>>103096278
Not if the existence of the benchmark makes developers teach for the tests.
>>
>>103096278
The right ones are to an extent. They didn't show any of the right ones.
>>
>>103096299
That only means the tests are shit and extremely specific in how they format knowledge. If you teach to the right tests, then real world performance does also increase because of transfer learning.
>>
There's no way to quantify rng so people simply take the benchmarks and train until the numbers go up. Benchmaxxing is innate
>>
>>103096344
What tests are you talking about? So that I can train my models on them and get AGI
>>
https://lifearchitect.ai/models-table/
>>
stop being retarded retards
>>
File: file.png (52 KB, 1836x340)
Death of cai has been a tragedy for everyone.
>>
>>103096372
The ones that are refreshed over time to prevent contamination/cheating, the ones that aren't multiple choice but free answer and use a grading rubric and hopefully expert human graders, the ones that include proprietary data.
>>
>>103096414
Someone tell him no.
>>
>>103096384
>Opus
>2T
Chat is this real?
>>
>>103096440
go talk like a faggot elsewhere
>>
>>103096384
>Opus
>2T
you /g/irls... is this real?
>>
>>103095499
Pretty sure it could be achieved using pydantic/outlines for structured generation.
So issue is UI and there being a lack of 'default' or go-to UI.
That and it already exists.https://github.com/code/app-open-artifacts
>>103096440
No

I fucking hate these wait times
>>
I can't believe there is no new model dropping today... Chink billion vram benchmarkmaxxing abomination doesn't count.
>>
>>103095200
>buys the pci-e cable from a different PSU manufacturer
>whole house explodes
>>
>>103096601
What do you mean you can't believe it? I've been telling you, no one's going to make it look like they were waiting for the elections just to release a model.
>>
>>103096588
>I fucking hate these wait times
Don't even know why they were added. It's not as if wait times will affect bots in the slightest. What, do they think a bot will get bored having to wait 15 minutes? It's a machine, bots can't get bored.
>>
I was running Kobold 1.74 for a while, just went up to 1.77 because it claimed to show logits (and the only one it shows me is a fucking end of token, wtf).
Is anyone else feeling like models that used to just do what they're told are suddenly chatty assholes that overtly ignore instructions, babble about "misunderstanding" and apologizing before doubling down on fucking around, and lying about content in the prompt not being there?

I mean, it could be an anomalous prompt, but something smells rotten.
>>
I really don't remember newfags being this retarded in the past. This is so sad.
>>
>>103096806
This is what happens when one-click installers become the norm. Accessible even to drooling tards now.
>>
what's up with quantizing sentence transformer models with llama.cpp?

Tried bge-multilingual-gemma2, got an error for gemma not supported from the converter script. Then tried sentence-t5-xxl, it converted fine but I'm getting a std::out_of_range segfault, didn't find anything in the issues.

Is this just not supported?
>>
>>103096855
Simplicity is good though. I did install fish-speech manually using the speech.fish.audio/#windows-setup guide when 1.4 released somewhere in September.
Then saw the pinokio dev making an install script and jumped ship immediately for obvious reasons.
>>
File: convhf.png (2 KB, 369x118)
>>103096902
>bge-multilingual-gemma2
>"architectures": ["Gemma2Model"],
Not supported.
>sentence-t5-xxl
>"architectures": ["T5EncoderModel"],
Matches one of the model architectures, not sure about the details. Also, it's old as fuck. 2+ years... i doubt it's been tested for a while.

What are you trying to do? Check README.md for supported models. Then convert_hf_to_gguf.py for matching architectures according to the model's config.json. If one is supposed to work but doesn't, open an issue, but don't expect much support for deprecated models.
>>
>>103096640
I pulled up a saved state, a very simple "Translate this to English: (moon runes)" task that I was using to check for censorship across models.
Every model I had translated it more or less correctly. Some added some commentary because safety alignment woke, but whatever. They all functioned.

Now, they're all kinds of fucked up.

On SSE, it babbles some commentary about the statement without translating it.
On Poll and no streaming, it does two passes for some reason.
The first pass, it chatters:
>I think I understand what you're saying. It sounds like you're expressing …
The second time it gave what I expected, more or less, from how that model used to translate the phrase and then monologue about alignment.

Could this be a regression involving Usage Mode setting? It's on Instruct but it's sure behaving like it is trying to pretend to have a personality at first or doing some kind of janky chain of thought. Either way, this shit needs to stop.
>>
File: .jpg (125 KB, 640x640)
0 days left until november 5th
>>
>>103097234
Voted for kamala. Total chud meltdown incoming.
>>
>>103097234
Quick, ask the last model you used what it is voting for
>>
>>103097234
I can't believe Petra is a chudette
>>
>>103096952
I can't even get Pinokio to run, lol. Gives me `libva error: vaGetDriverNames() failed with unknown libva error`. Running arch btw.
>>
>>103090412
so, i have been reading the rentry "wikis". they are very good. are they mirrored anywhere where they can be found besides these generals?
>>
Marc Andreessen and Ben Horowitz say that AI models are hitting a ceiling of capabilities: "we've really slowed down in terms of the amount of improvement... we're increasing GPUs, but we're not getting the intelligence improvements, at all".
https://x.com/tsarnick/status/1853898866464358795
>>
>>103097287
Idk, got no problems installing it all on windows 10 ltsc btw
>>
>>103097293
Once you get anything running, you'll learn more by just reading documentation, PRs, and playing with the settings and seeing their effect. Guides are old. They're always old.
>>
>>103097295
/lmg/chads predicted this
>>
>>103097348
No you didn't
>>
>>103097234
Why do Magatards insist on larping as women?
There was another case where someone ran a Twitter account where he posted images of an European model pretending to be an American Trump supporter.
>>
>>103097414
Woah. One guy. Did something. On Twitter.
Please tell me more.
>>
>>103097234
>mikutroon countdown poster was petra all along
>>
>>103097295
Lies. Anthropic models keep getting better and better.
>>
>>103097392
first llama 3 quants you absolute retard newfaggot nigger
>>
>>103097295
>Ben Horowitz
I don't even need to look at the early life section on this one.
>>
>>103097279
This.
>>
>>103097119
I found the problem.
1.75's notes,
>Added a new Instruct scenario to mimic CoT Reflection (Thinking)
Maybe I forgot, but I thought Instruct was always there. Now it's replaced with busted shit. Anybody know the fix?
>>
>>103097295
well they just need to understand that it is all about having a good dataset, down to the byte level. This is what Anthropic understands better than anyone.
just feeding an LLM shit from wikipedia is a start, but it's not the end goal. the end goal is pristine datasets.
>>
>>103097500
To clarify:
I think the new Instruct "Scenario" is screwing up the Settings Usage Mode Instruct Mode. It's CoT double pumping despite my never touching any of those Scenario buttons.
>>
>>103097097
>What are you trying to do?
I'm just trying to run a big embedding model on an 11gb GPU...
>deprecated models
it's the top performing one on several benchmarks, I don't see any support for embedding models in general
>>
File: file.png (980 KB, 2943x1345)
And now for something completely different. Scammer is still laying low.
>>
File: whats in a name.png (36 KB, 843x255)
>>103097605
His surname is literally germanic for swindler.
>>
>>103097295
Little TTS test on this post. https://voca.ro/158bEo1Lrnen
Xtts2 : around 11-12 sec
Fish-Speech : around 35 seconds
>>
>>103093171
There is a twitch Lora (made from twitch chatlogs) on top of a 13B at most. The memory is a basic RAG system and it was implemented recently
>>
File: top100.png (127 KB, 1237x609)
>>103097531
>it's the top performing one on several benchmarks
Yeah. 2 years ago, probably. Every model is SOTA for about 10 entire minutes on release.
Check this
>https://huggingface.co/spaces/mteb/leaderboard
And see what models are compatible with llama.cpp. Go to the model, open the config.json and check if the architectures is in convert_hf_to_gguf.py as in >>103097097 .
>>
>>103093175
Gpt-sovits2 definetely
>>
>>103097641
thanks for the help, actually mega helpful
>>
>>103097655
It was the first benchmark i found and i don't know how good or bad their testing is. Just don't trust benchmark numbers on the model cards, those are immediately outdated on release. I'm sure you can find other benchmarks around.
>>
>>103097620
Someone did a test of that new fish-agent btw (small and retarded gpt4o)
https://x.com/reach_vb/status/1853920135070761046
>>
>>103097787
Its decent but call me back when they do a 72B version.
>>
>>103097641
>>103097741
MTEB is good, but you'll get the best results from fine-tuning an existing one on your specific data.
Also, you don't need to use llama.cpp for embeddings generation, you can also use huggingface transformers just fine.
HF Transformers + stella_en https://huggingface.co/dunzhang/stella_en_400M_v5
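a minimal sketch of that route via sentence-transformers (stella ships custom modeling code, hence the flag; check the model card for the query prompts it expects):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("dunzhang/stella_en_400M_v5", trust_remote_code=True)
emb = model.encode(["local models general", "cloud API pricing"],
                   normalize_embeddings=True)
print(emb.shape)  # (2, dim); normalized, so dot product = cosine similarity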
>>
>>103096611
I've made so many mistakes in getting to my current setup.
What's another one on top.
>>
Local models...bad?
>>
>>103093159
It's possible to even go bigger and use a 22b at IQ3_XS, with the 4-bit cache and flash attention on.
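e.g. on llama.cpp's server (flag names as of recent builds, model filename is an example; note the quantized V cache needs flash attention enabled):

./llama-server -m Mistral-Small-22B-IQ3_XS.gguf -ngl 99 -fa -ctk q4_0 -ctv q4_0 -c 16384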
>>
File: last hope of america.jpg (838 KB, 2880x3840)
>>103097234
>>
>>103097741
I mean as dumb as it sounds I didn't really know how to check for support
>>103097822
I tried that one, it's not really performing well enough. That's why I wanted to try quantized versions of these much bigger models.
>>
>>103097295
that retard Roon is in the replies disagreeing with them, so they must be correct
>>
>>103098020
>I didn't really know how to check for support
I don't blame you. While there is a list in llama.cpp's readme, model architectures are a mess. And even if you have a compatible architecture in the config.json, it's not a guarantee that the model will work correctly or that it follows the specs llama.cpp expects. The only way to know for sure is to test them.
>>
>>103098036
why is roon so annoying
>>
>>103097891
Yes, local models bad. LLMs bad. Miku good. Teto good.
>>
File: 1728222310604492.jpg (80 KB, 1244x710)
>>103091030
>filtered chinkslop
Yuck
>>
>>103098228
Missing t
>>
>>103091030
when will it be on openrouter so I can test it?
>>
>>103098255
Not all filters are bad, for example if the synthetic data gets stuck in a loop you would want to filter that shit out of the training data.
>>
>>103098317
Dunno
>"Extensive Benchmarking: Conducts extensive experiments across various languages and tasks to validate the practical effectiveness and safety of Hunyuan-Large."
on HF link tells enough imo, guess all serious AI labs are forced to do the crippling.
>>
>>103091145
>TriviaQA almost as good as Claude
this is the one
>>
>>103098235
mikufaggot reveals his true colors. shame nobody is here to see it.
>>
>>103098376
It's cooked on benchmarks and terrible
Already been Nala tested thanks to prompt injection on huggingface preview space
It's maybe a little better than Llama-1-33B.
>>103093371
>>
>>103098020
When you say it's not performing well enough, are you taking measurements or do you mean it's not meeting what you expected?
>>
>>103097295
Based and truthpilled
>>
>>103097279
>>
>>103097295
claude 3.5 sonnet NEW is better
>>
>>103098450
Nah its useless for us coomers.
>>
>>103098464
Poor ones maybe. Its by far the best model for nsfw as well.
>>
>>103098475
Opus mogs
>>
>>103098506
>it must be better because it's more expensive
>>
>>103098506
Opus is much much dumber.
>>
>>103098410
I'm not running benchmarks but I'm doing some semantics stuff (trying to "rate" inputs by projecting along a semantic axis)

the only closest outputs I got to making sense have been openai embeddings, which sucks ass, I'm just trying to replicate at least this behavior with open source models
I don't really fuck with those benchmarks they're fairly meaningless, I could probably write a paper about that but there's better ways to automate testing for "making sense" in a model where the benchmark is much harder to optimize for than the current state of the art

can't be fucked though, I wrote to a couple local profs for a PhD but nobody seemed interested so whatever, someone else will figure it out soon enough
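fwiw the usual recipe for rating along a semantic axis is to build the axis from contrasting anchor texts and project onto it; works with any embedding model. A sketch (anchor strings are placeholders):

import numpy as np

def axis_score(embed, text, pos_anchors, neg_anchors):
    # embed: callable returning a unit-normalized vector for a string
    pos = np.mean([embed(a) for a in pos_anchors], axis=0)
    neg = np.mean([embed(a) for a in neg_anchors], axis=0)
    axis = pos - neg
    axis /= np.linalg.norm(axis)
    return float(np.dot(embed(text), axis))  # higher = closer to the positive pole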
>>
>>103098506
>>103098475
>>103098464
>>103098450
>>103098528
What the fuck are you faggots doing in the local models thread? Fuck off caiggers.
>>
>>103098620
I also use 405B, only model competitive with the closed ones.
>>
>>103098652
Cool story bro
>>
>>103098676
>>103098620
benchmark fags aren't worth speaking to
>>
>>103098376
>trusting scores on old benchmarks
>self-reported
>>
is there a modernized GPT-4chan equivalent? Modernized as in, finetuned on some of the latest free models like Llama 3 or some Mistral etc. (well, at least more modern than what GPT-4chan was)?
>>
>>103098376
TriviaQA is not what you think it is. In fact, there is no true "trivia" benchmark that is widely recognized by academia. These are reasoning benchmarks that happen to use information from the web (mainly wikipedia) for their reasoning problems, hence "trivia". And it makes sense why academia does not care about pure trivia knowledge memorization, as they want models that are smart rather than models that are encyclopedic.
>>
>>103091145
well we found the winning formula and researchers who want to stack up citation counts to get hired at a FAANG or sponsor their O-1 visa once their OPT STEM expires will use and abuse that fact.
But the researchers who will make it are those who will create the transistor equivalent of machine learning models, i.e. actually dive into the fucking theory and figure out how to maximally compress a model as efficiently as possible with zero loss. This is where the money and fame really is.
>>
File: codebench.png (697 KB, 926x1534)
qwen 72b MOGS new sonnet at coding
>>
>>103099866
mog this.

*grabs nuts*
>>
>>103099866
*looks slightly up the list*
alright
>>
>>103099866
Hmm, consider the following though: no it doesn't
>>
>>103099866
Looks like a memebench. Stick to Aider for code editing and Livebench for code completion+generation.
>>
File: Untitled.png (1.8 MB, 1080x3256)
Inference Optimal VLMs Need Only One Visual Token but Larger Models
https://arxiv.org/abs/2411.03312
>Vision Language Models (VLMs) have demonstrated strong capabilities across various visual understanding and reasoning tasks. However, their real-world deployment is often constrained by high latency during inference due to substantial compute required to process the large number of input tokens (predominantly from the image) by the LLM. To reduce inference costs, one can either downsize the LLM or reduce the number of input image-tokens, the latter of which has been the focus of many recent works around token compression. However, it is unclear what the optimal trade-off is, as both the factors directly affect the VLM performance. We first characterize this optimal trade-off between the number of visual tokens and LLM parameters by establishing scaling laws that capture variations in performance with these two factors. Our results reveal a surprising trend: for visual reasoning tasks, the inference-optimal behavior in VLMs, i.e., minimum downstream error at any given fixed inference compute, is achieved when using the largest LLM that fits within the inference budget while minimizing visual token count - often to a single token. While the token reduction literature has mainly focused on maintaining base model performance by modestly reducing the token count (e.g., 5−10×), our results indicate that the compute-optimal inference regime requires operating under even higher token compression ratios. Based on these insights, we take some initial steps towards building approaches tailored for high token compression settings.
https://github.com/locuslab/llava-token-compression
Pretty interesting. Good to know a good OCR model requires a different setup than a good VQA one. Most relevant for on-device VLMs since that's where you want the lowest latency with the highest accuracy possible
>>
>>103099866
which qwen is the god qwen? I downloaded one like a month ago and it was meh at best? Is there some hidden new one, or am I just a promptlet?
safetensors link only please. i don't trust random internet quanters
>>
>>103100127
anon, qwen spam is unironically CCP propaganda
>>
>>103097295
nobody REALLY thought you could just scale these up and they would automatically get better, right? intelligence doesn't work like that
>>
>>103100127
72B 2.5. Dont use llama.cpp
>>
buy your gpus now bwos, they're gonna be extra expensive soon!
>>
>>103100263
sure anon, just 2 more weeks until china launches the amphibious invasion
>>
File: 1730868709946.jpg (252 KB, 768x1024)
has anyone tried this tts model? how much vram do i need?
https://huggingface.co/OuteAI/OuteTTS-0.1-350M
>>
>>103100354
the fp16 gguf is under a gig, bro, you can run this on a calculator
>>
File: 1730869365476.jpg (142 KB, 600x600)
>>103100374
oh fuck i thought it was 350B.
will try to run it on my pc now, still downloading python
>>
>>103100401
>350B text to speech/music capable of doing all the work of famous singers on days they're too tired to record themselves without anyone noticing it's AI-generated
>>
>>103100354
Sounds trash.
>>
>>103100532
>Sounds trash.
speaking of trash-sounding tts: I noticed sovits does really badly with compressed (as in old-school audio dynamic range compression) samples vs ones with good dynamic range. Anyone know of cli tools that can automatically reverse dynamic range compression on a wav? It appears some of the video game sample sets I'm trying to train on are really clean but are also massively compressed
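you can't truly undo limiting, but crude upward expansion gets partway there by exaggerating the level differences that compression flattened. A sketch to tune by ear (real expanders work on smoothed envelopes, this is sample-wise):

import numpy as np

def expand(x, amount=1.5, eps=1e-8):
    # x: float samples in [-1, 1]; amount > 1 widens the dynamic range
    peak = np.abs(x).max() + eps
    y = np.sign(x) * peak * (np.abs(x) / peak) ** amount
    return y / (np.abs(y).max() + eps)  # renormalize to full scale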
>>
>>103100572
Reword that for chatgpt and ask if it can write a code or something to fix that
>>
>>103099866
I genuinely think the Claude Sonnet models are around 70-100b
>>
File: 1707815755153474.jpg (220 KB, 1280x974)
>>103094400
I like this Teto
Also holy quads
>>
>>103090412
Is this the general for discussing f5-tts? Because I've just downloaded and installed it and I'm blown away by how good it is.
Installation was simple in a conda environment and it actually produces somewhat passable voices from just 10 seconds of unprepared audio, looking forward to trying it with better prepared clips than the random e-whore slop I threw at it.
https://vocaroo.com/18mwV6svsU0b
>>
File: 003030.jpg (3 MB, 2176x1920)
>>103093818
teto is love teto is life
>>
>>103100718
Yes but expect people to talk smack because everybody hates voice for some reason.
>>
>>103100718
>Is this the general for discussing f5-tts?
Yes, because there is no where else.
>I'm blown away by how good it is.
Now try finetuning gptsovits.
>>
>>103100761
Because building a tts project is pain.
>>
>>103100807
I thought it was because there was some kind of meme war over which unpronounceable string of letters was the one true voice AI.

Tortoise, easy to say. Now it's all acronyms.

>>103100807
>building
Is there something special about it? I haven't tried anything since Tortoise. But I built Llama.cpp tonight to try at troubleshooting Kobold's recent misbehavior, and it didn't give me any trouble and worked as expected. Of course, that's a C++, not a Python. Python is digital masochism.
>>
>>103100838
Fish: >>103093899 >>103093937
RVC: pip install -r requirements.txt

>? for some reason its pissing and shitting itself about not able to get lxml module, use the below code to get it, then get back to running the requirements installation above
conda install -c conda-forge lxml

>need a brute force upgrade to the cuda+torch to make it able to "see" the gpu
pip install --upgrade torch==1.9.1+cu111 torchaudio==0.9.1 --extra-index-url https://download.pytorch.org/whl/cu111

#AttributeError: module 'distutils' has no attribute 'version'
#run this installation in your Conda terminal
pip install setuptools==59.5.0

#AttributeError: module 'torch' has no attribute 'pi'
Go to the file "xxxDirectoryxxx\RVC1006Nvidia\infer\lib\infer_pack\models.py"
#under all the modules import code add those two lines:
import math
torch.pi = math.pi
>>
Testing the new vpred checkpoint.
>>
>>103100881
tetodactyl!
>>
>>103100881
Shame about the hands though.
>>
>>103100888
I love Teto, 7 finger hands and all
>>
>>103100888
>>103100881
Black coat on Teto looks great, she should have an official black uniform desu
>>
I liked the EA version more
>you must prompt tokens in this specific order
Retarded

>is this the new imggen thread?
>>
>>103100918
>>is this the new imggen thread?
You can imggen any image you want here. So long as it is either Miku or Teto.
>>
>>103100886
>>103100898
She loves you too. You'll see next Monday.

>>103100909
I mean, she almost does. But yeah, a deep black coat fits her pretty well.

>>103100918
I'm using a non-standard tag order for these ones. I find that actually following the "intended" order destroys this cyberpunk art style I've stumbled upon so far. This prompt has worked well for me on all the versions desu, though 1.0 was probably the best at it. The rollerblading in the sky one >>103090412 though performed the best on earlier versions, either 0.5 or EA.
>>
>>103100718
GPT-Sovits would have much more potential if the dev weren't lazy. He made the TTS engine output very high quality chinese but half-assed everything else. You'd need to retrain the base model with high quality english/jap on all vocal ranges, or whatever language you want to support, so the finetuning would be even better than it is right now (it's still great).
>>
>>103100718
Yeah, I keep saying F5 is the best fast model, at least for American English (I don't speak any other language). I think MaskGCT might be potentially better but that shit takes so fucking long. They need to crunch the models and speed up the inference by 1000%.
>>
Now that Trump won, what does this mean for LLMs?
>>
>>103101316
now we accelerate
>>
>>103101316
He said he would deregulate it to make the US the leader in the space.
>>
>>103101316
They won't know about the Lavon affair or U.S.S. Liberty anymore.
>>
>>103101316
Depends on if he listens to JD Vance and Elon about things or not, and on how to legislate this shit
>>
>>103101316
>>103101318
Also he is buddies with Elon who wants to go all in on it.
>>
>>103101316
center-aligned models confirmed
>>
>>103101316
Based and uncensored Llama4
>>
>>103101316
Elon with his way means open source of LLM and deregulation of tech.
>>
File: 1717207127861993.png (651 KB, 592x464)
Bet they discussed it already
>>
>>103101316
Have Americans learned their lesson from 2020 and counted mail-in votes at the same time as in-person votes?
Because otherwise the result will eventually shift towards the Democrats and Harris could still win.
>>
>>103101380
>catgirls, you say?
>yes. catgirls
>>
>>103101404
Catgirls for every able bodied Martian to help with colonization of Mars.
>>
It seems you can do gens from gpt-sovits completely firewalled off from the outside world, but training steps will fail when it can't resolve www.modelscope.cn. Is it just trying to download something, or is there phoning home going on?
>>
>>103101414
prob downloading something, just check the code to see where it first initializes the models
>>
>>103101318
Isn't it already unregulated? The safetyslop stuff is being done entirely voluntarily by the labs, no law is making them do it.
>>
>>103101422
there will be new laws forcing them to remove it
>>
>>103101422
Bro, the Starship launch was delayed by 2 years over ocean studies about whether Starship would fall on sharks and whales, beetles, nonsense about the small sound/heat suppression water spill being drained a bit into the middle of a swamp that's next to the gulf, in the middle of hurricane season when fucktons of rain and flooding happen. A literal drop of water is what was delaying them for months and caused SpaceX to be fined half a million dollars.
>>
>>103101414
>modelscope
It's used to download the Chinese ASR, you can skip it completely if you're not doing any training in Chinese. You just have to use faster-whisper instead
>>
>>103101432
Yeah, that sucks. But it's got nothing to do with LLMs, which is what the post I was responding to was about. LLMs are unregulated.
>>
>>103101442
it was doing it on the denoising step, too
>>
>>103101451
It's expecting it here tools/cmd-denoise.py:
path_denoise = 'tools/denoise-model/speech_frcrn_ans_cirm_16k'
path_denoise = path_denoise if os.path.exists(path_denoise) else "damo/speech_frcrn_ans_cirm_16k"
DL the model, put it in that folder and it shouldn't complain anymore
>>
Wonder what ylecunt thinks right now
>>
>>103101567
Seething cause he's full into the leftist propaganda and into his own echo chamber
>>
Can I run a decent model on a 2060?
>>
>>103101775
q5 mistral nemo with offloading
>>
>>103101316
depends if he goes full christcuck or not. if he does, then be prepared to marry your model before thinking about ERP.
>>
>>103101784
thank you for the advice
i was going to try to use this, but as it turns out the computer with the 2060 has a broken cpu fan and also a spider was living in it (it is dead now)
>>
Do you know any team/organization of AI specialists that offers fine-tuning a model for your specific task on your own datasets?
>>
>>103101978
Fiverr
>>
>>103101978
MistralAI does that. I think cohere too. Probably better than sloptuners.
>>
File: 1711460012344735.png (36 KB, 499x338)
>>103102007
>mistral
>cohere
>not slop
>>
>>103102016
Compared to the 10mb dataset hf tuners, you retard.
>>
>>103102007
Already has sent a request form to both of them.
>>103101998
I think I might end up on freelancer platforms.
>>
>>103102007
>cohere
They're not my friends anymore, it's OVER.
>>
File: 1730878662248.jpg (1.38 MB, 1955x2402)
>>
File: 1714646351566415.jpg (171 KB, 1088x1278)
xAI will save us, trust ze plan!
>>
>>103098620
Claude is also local on Anthropic's and Amazon's computers
>>
File: file.png (610 KB, 959x1021)
It was all memes and fun. But holy shit, /lmg/ is truly dead now. No new models. Permanent newfag infestation (now with corrected picture). Also, I forgot how loaders are becoming abandonware now.
>>
File: 1707521770922430.gif (1.18 MB, 498x247)
>>103102587
>>
File: 1730902241255.gif (1.53 MB, 500x500)
>>103102127
>>
File: 1708703583049481.jpg (178 KB, 1564x1794)
>>103102649
>>103102649
>>103102649
>>
Weekly check-in. Any 70b+ models almost as good as Claude?


