/g/ - Technology




File: 39_06429_.png (1.03 MB, 1280x720)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102772862 & >>102758839

►News
>(10/11) 14B cross-architecture distillation model: https://hf.co/arcee-ai/SuperNova-Medius
>(10/10) Aria: 25.3B, 3.9B active, multimodal native MoE model with 64k context: https://hf.co/rhymes-ai/Aria
>(09/27) Emu3, next-token prediction multimodal models: https://hf.co/collections/BAAI/emu3-66f4e64f70850ff358a2e60f
>(09/25) Multimodal Llama 3.2 released: https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices
>(09/25) Molmo: Multimodal models based on OLMo, OLMoE, and Qwen-72B: https://molmo.allenai.org/blog
>(09/24) Llama-3.1-70B-instruct distilled to 51B: https://hf.co/nvidia/Llama-3_1-Nemotron-51B-Instruct

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: img_1.jpg (324 KB, 1360x768)
►Recent Highlights from the Previous Thread: >>102772862

--Papers:
>102780674
--INTELLECT-1: Decentralized training model based on Llama architecture:
>102780721 >102780826 >102780843 >102781096 >102780841 >102783682
--arcee-ai releases 14B SuperNova-Medius model with different vocab compatibility:
>102778245 >102778461
--Medical LLMs for patient scenarios, resources, and GPU rental options:
>102775353 >102775414 >102775447 >102775557
--Koboldcpp sometimes talks on behalf of user, potential reasons and solutions discussed:
>102784206 >102784387 >102784487 >102784506 >102784777 >102784792 >102784419 >102784477 >102784809 >102784865 >102785031 >102785198 >102785256 >102784869 >102784903
--Apple's AI/ML limitations and the GSM-Symbolic paper on mathematical reasoning in LLMs:
>102789310 >102789338 >102789355 >102790047
--Tips for running untrusted code in a sandboxed environment:
>102774954 >102774999 >102775308
--Discussion about the potential and limitations of the Aria model:
>102776431 >102776483 >102776530 >102776659 >102776726 >102776796 >102776864 >102776992 >102776925 >102781383 >102785348
--Bug report: llamacpp message repetition in ST after second swipe:
>102783411 >102783612
--koboldcpp-1.76 updates and Anti-Slop Sampling discussion:
>102776955 >102777602 >102778283 >102777648 >102777672
--Segmentation of textbook images for PDF to LaTeX pipeline:
>102787003 >102787120
--Impressive 1B model output and quantum mechanics graph discussed:
>102782467 >102782503 >102788836
--Anon reviews Aria and Molmo:
>102779448 >102779938 >102780060
--360m model passes 9.11 vs 9.9 test, outperforming WTS 415m:
>102789452 >102789476 >102789747 >102789827 >102789932 >102789949 >102789833
--Miku (free space):
>102773834 >102776796 >102777561 >102778020 >102783970 >102785613 >102786141

►Recent Highlight Posts from the Previous Thread: >>102772882

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
grifter thread
>>
>the day local lost
>>
File: Capture.png (1.85 MB, 1043x783)
cpumaxxbros, we're back
>>
File: 1728741979180.jpg (121 KB, 848x1199)
>>102790397
>>
File: 1561155454670.jpg (89 KB, 1024x1012)
>>102790407
>INTELLECT-1: Decentralized training model based on Llama architecture:

This must make the doomers and regulatoors so mad, how do you stop AI when the training becomes fully decentralized? Also wonder if the EU AI laws even apply.
>>
>>102790504
man, why does that board look like something from the 90s while the blackwell systems look like this?
>>
>>102790508
*sniff*
>>
is a p40 faster than a 1080ti?
Assuming the model is fully offloaded in both cases, of course.
Wish we had a llm comparison site.
>>
>>102790528
>if it doesn't have edgemaxxed rgb gaymer design it's shit
Are you gonna stare at your motherboard all day or what
>>
>>102790512
You don't need to stop it because at the rate that project is going it'll take 100 days to train on 1T tokens. For a model on the level of Llama 3 (if a 10B version of it existed), you'd need 1500 days, or about 4 years.
>>
what's the best cooming model for vramlets rn?
>>
File: ShokugoMiku.png (1.14 MB, 896x1152)
おはよう、/エルエムジー/‼ (Good morning, /lmg/!!)
>>
>>102790590
how are you not excited about this.
more efficient training and better smaller models are a certainty in the future.
this is just a first proof of concept.
>>
>>102790653
All those years of Duolingo finally paid off!
>>
>>102790633
Llama 3 3B IQ3
>>
>>102790633
nemo
>>
>>102790590
Yes I'm sure it will never get any better. All new technology is only ever as good as its first iteration that's why I'm posting via telegraph on my Babbage machine
>>
Yann LeCun is a fraud
>>
>>102790731
No one thinks otherwise anon, except one shitter dickriding him.
>>
https://github.com/ggerganov/llama.cpp/pull/9742
So who are you siding with? P-e-w or maggothate?
>>
File: 1724484636390399.jpg (269 KB, 1536x2048)
>>102790528
Black solder mask mostly, plus that's a marketing render without all the board markings and bigass heatsinks. Does look tidy tho
>>102790551
similar, 1080ti has faster memory
https://www.techpowerup.com/gpu-specs/geforce-gtx-1080-ti.c2877
https://www.techpowerup.com/gpu-specs/tesla-p40.c2878
https://www.reddit.com/r/LocalLLaMA/comments/1f6hjwf/battle_of_the_cheap_gpus_lllama_31_8b_gguf_vs/
>>
>>102790793
I'm siding with Philipp Emanuel Weidmann.
>>
File: file.png (29 KB, 804x122)
>>102790793
doesn't matter anyways, deprecated by antislop, pew can cope and seethe about his popular sampler that doesn't fix stuff if he wants
>>
>>102790793
I haven't read the thread but high temp before everything else + minP/TFS should basically have the same effect while also removing schizo tokens
>>
>>102790668
>>102790712
Parallel training is a meme, I hope CUDA Dev shows up to explain why.
>>
>>102790512
is it open source?
>>
>>102790512
You can't stop people from sharing their gpu, that's retarded.
You can even use p2p for the initial connection with tor nodes or something. Nigga just read the description on how their connection actually works, it doesn't need to check every time.
>>
>>102791205
https://github.com/PrimeIntellect-ai/OpenDiloco
>>
>>102790551
They are two NVIDIA GPUs with the same compute capability so literally all you have to do is compare memory bandwidth (~token generation speed) and compute (~prompt processing speed).
If you do, use Wikipedia instead of techpowerup to look up the stats (in my experience more reliable).
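As a rough illustration of that rule of thumb, a minimal sketch; the bandwidth figures are approximate spec-sheet values and the model size is just an example, not measurements from either card:
```
# Token generation is roughly memory-bandwidth bound: each new token has to
# read (approximately) the whole model once, so tokens/s is capped by
# bandwidth / model size. Prompt processing is compute bound instead.

def rough_tps_ceiling(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper-bound estimate of single-stream generation speed."""
    return bandwidth_gb_s / model_size_gb

model_size_gb = 4.7  # e.g. an 8B model at ~Q4, just an example
for name, bw in [("Tesla P40", 347.0), ("GTX 1080 Ti", 484.0)]:
    print(f"{name}: ~{rough_tps_ceiling(bw, model_size_gb):.0f} t/s ceiling")
```
Real numbers will be lower, but the ratio between two cards with the same compute capability should track the bandwidth ratio.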
>>
>>102790512
>Train my censored slop model for free!
Gonna wait for someone to actually use this for an actual based model.
>>
>>102791315
>DiLoCo requires each instance to have sufficient GPU memory to hold the model's parameters, gradients, and optimizer states (general data parallelism limitation, except for slower off-loading methods).
This is why its a meme for /lmg/, generally. What are we going to train on an army of 3090s, even if every node has 1gbps+ internet speed?
>>
>>102791140
The last time I did a rough estimate for the overhead to exchange weight updates between machines I came to the conclusion that it's not viable with typical upload bandwidths given to regular people.
Instead of crowdfunding compute I think it makes more sense to do conventional crowdfunding and training.
>>
>>102791376
A 23b no-slop, uncensored model would already be a trillion times better than any cucked model that sends shivers down your spine.
>>
>>102790793
I don't understand any of it so I'll let them wrestle it out
>>
Hello Large Model General, this is a PSA to let you know that with the last update, the trick to get Gemini to comply is to talk to it like a moron, because Google underestimates users and their model will actively fight you if you try to give it well-structured instructions, but if you roleplay as an idiot, only then will it help you. Godspeed with local models, save me from having to continue to use this.
>>
>>102791393
Thanks
>>
>MI325X has 256 GB VRAM
When will this be cheap enough for plebs to use?
>>
>>102791393
>Instead of crowdfunding compute I think it makes more sense to do conventional crowdfunding and training.
Does it? Rented compute will be far more expensive than training on owned hardware. I can't imagine how expensive it would be to try to train 405B that way, assuming they even have that much compute to spare consistently for the months it would take.
>>
Believers, redpill me on entropix. How is this different and not just another techbro meme?
>>
>>102791393
>typical upload bandwidths given to regular people
Most areas in my country have 10gbps available, the average upload speed for non-rural areas is 500mbps, and the most common trend now is 1gbps minimum. Americans are just third worlders.
>>
>>102791376
With huge batches you can do the forward/backward passes one layer at a time. So you only need the parameters for one layer at a time in memory, the time necessary to load the parameters becomes less relevant due to the large batches.
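A toy sketch of what that layer streaming looks like, purely to illustrate the memory argument; the shapes and the load_layer stand-in are made up:
```
import numpy as np

# Only one layer's weights are "resident" at a time; the (huge) batch of
# activations stays in memory, which amortizes the cost of loading each layer.

def load_layer(i: int, d: int) -> np.ndarray:
    # stand-in for fetching layer i's weights from disk or host RAM
    rng = np.random.default_rng(i)
    return (rng.standard_normal((d, d)) * 0.01).astype(np.float32)

def forward(x: np.ndarray, n_layers: int) -> np.ndarray:
    for i in range(n_layers):
        w = load_layer(i, x.shape[-1])  # only this layer is in memory
        x = np.maximum(x @ w, 0.0)      # toy layer: matmul + ReLU
        del w                           # released before the next layer loads
    return x

batch = np.ones((4096, 512), dtype=np.float32)  # big batch amortizes the loads
print(forward(batch, n_layers=8).shape)
```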
>>
>>102791545
North America has protected telecom monopolies, is all. So we pay two to three times more for two to three times less.
>>
>>102791540
It's not. It doesn't solve the overconfidence slop the model was trained on. Jumbling tokens around until you get a proper response can already be done with temperature in the first place.
>>
>>102791587
You didn't read the question. He only asked for believers to tell him what he wants to hear.
>>
>>102791545
You’re competing with data center interconnects tho. 800gbps+
>>102791397
I hope so. Having cutting edge datasets for laser-focused small models is our only hope I think. Maybe as an editor/copywriter on the output of a bigger, smarter model
>>
>>102791510
Renting compute is more expensive than buying hardware on a long enough timescale but only if you assume the utilization to be equal.
I just don't think you can feasibly get good enough hardware utilization for distributed training.
But low-precision training and distributed training are already among my GGML goals anyways, so I will at some point investigate this in more detail.
>>
Any chance of an anti-slop sampler for service tensor, or is it too RP?
>>
File: file.png (21 KB, 635x379)
Do I understand correctly that lcpp rpc only works for cli? Is there no way to have llama-server use an rpc backend, or is vLLM the only option for such a thing?
>>
>>102791652
supposedly it already works with st's ban feature. i haven't tried it though
>>
>>102790512
>solve decentralized training
>use it to train llama, but smaller and worse
What is even the fucking point?
Why.
The FUCK.
Is it not training a BitNet model?
>>
>>102791753
nobody's training a bitnet model beyond shitty 3b ones for a reason
>>
>>102791652
>>102791679
Can confirm that it works. No more fucking shivers ever. No more whispering in husky voice barely above a whisper.
>>
>>102791774
And the reason is?
>>
>>102791795
I'm a noob and retarded user of open webui.
How do I use the anti-slop sampler with the blobs from Ollama? Do SillyTavern or other frontends already do it by default?
>>
>>102791795
i remember seeing the git for the original antislop thing and it had a configurable file with it, like a dictionary with its own weights per phrase/word. i don't see that implemented in kcpp's implementation. are you just inputting your own phrases? show an example please
>>
File: file.png (18 KB, 360x175)
>>102791824
NTA
>>
>>102791844
so separated by line with quotes works. thanks. st's example text isn't really clear when its example shows token numbers and stuff. now i see why the original guy had a whole file of all the slop terms, cause you'd have to put a million lines yourself in st to kill it all
>>
antislop needs to go meta. I don't mind slop if it happens ONCE, but THEN it should go in a plonk file. Like a super rep-pen for recognized slop phrases (which will expand over time as we push the slop frontier out farther)
>>
File: anti-slop.png (241 KB, 1340x924)
>>102791824
>>
>>102791376
eh, that can be optimized, the important thing is that it stays open
>>
So many half-assed implementations just to cope with obviously filtered garbage, local lost.
>>
>>102790512
Can't wait for the first model training that turns out to be a botnet for cryptomining.
>>
>>102791927
you should use zen sliders
>>
>>102791934
Not gonna rent your shitty gpu Shlomo. Buy an ad.
>>
>>102791934
I have fun so how can I possibly lose
And I learn stuff with local I wouldn't otherwise
>>
File: file.png (257 KB, 823x1264)
The sacred tablet of placebo.
>>
>>102791934
Sam. Ad. NOW!
>>
>>102791973
What, exactly, have you learned?
>>
>>102791987
All that solemn nodding, lost... Like shivers in my spine.
>>
>>102791987
You are intellectually challenged.
>>
>>102791508
A (busted) 8xA100 server is on ebay for about 9k, so maybe 15 or 20 years or so.
>>
strobby status?
>>
>>102791844
Doesn't that slow down the generation a lot?
>>
best coomer model for 24gb? been gone since july
>>
Even if it's the end of the world, as long as you have your family, that's all that matters.
Damn Len grew up big. He's even taller than Kaito now.
>t. Len
>>
>>102792194
Rocinante-12B-v2g
>>
>>102790867
>antislop sampler
Anyone using this name reveals themselves as a Redditor, by the way. Phrase banning was added 5 months ago in exllama...
>>
>>102792300
>exllama
don't care, can't run big models with it
>>
>>102790407
>--Impressive 1B model output and quantum mechanics graph discussed:
Retarded recap. Never post again.
>>
>>102792297
>12B
i feel like i'm being megabaited but i'll give it a shot
>>
>>102792421
Try Mistral Small if you want something a bit smarter. It's 22B.
>>
>>102791661
It does work with the server last time I tried it. Torchchat is another option for distributed inference but I haven't tried it.
>>
>>102791934
One more sampler bro. It'll sure teach the model what trannies are.
>>
>>102792461
nta but i swear their 22b is dumber than the 12b. and i say this as someone whose used codestral 22b for a while for basic things. something is just fucked with it. for rp, its worse
>>
File: file.png (33 KB, 514x201)
>>102792300
Hi, please read up on why it works differently than banning before seething exllama dev thank
>>
>>102792656
Yeah, I second this. Whoever shills 22B are just niggers who probably shill fine-tunes as well.
>>
>>102792718
i want to add that codestral 22b is perfectly fine as a coding model, i use it as an example because its been out for quite a while now and its the same 22b base. small (22b) just feels off otherwise though, for rp, its dumber. nemo is amazingly on point for its size. i hate the term 'punching above its weight', but i think its apt for nemo vs older llama 2 13b
>>
I used a lot of finetunes and merges but honestly most of them suck. That is why you should download my models at: https://huggingface.co/Sao10K . My models are objectively the best. Fuck drummer. Kill
anthracite. Marry Undi.
>>
>>102792777
velma and strange new worlds were canceled. deal with it
>>
>>102792018
>he said in a conspiratorial whisper
>>
File: 1728106076604290.png (16 KB, 607x108)
Is your model THIS confident?
>>
>>102792817
show the full token probability window or gtfo
>>
>>102792817
The logits viewer on ST doesn't work
>>
File: oiiiiiiiiiii.jpg (125 KB, 920x920)
migu
>>
File: _00021_.png (1.32 MB, 1024x1024)
>>102792714
sounds good in theory but the video shows it banning {{char}} in the string. Sounds more like taking a sledgehammer to crack a nut
>>
File: tokeprobs.jpg (58 KB, 635x484)
>>102792847
What do you mean? Mine works fine
>>
Can someone share a slop ban list for ST?
>>
>>102792960
antislop maker explained it was configured like that to show exactly how aggressive you can make it be, it's configurable, as per the list you make
>>
>>102792985
https://github.com/sam-paech/antislop-sampler/blob/main/slop_phrase_prob_adjustments_full_list.json
>>
File: backtracking-elara.jpg (88 KB, 1064x870)
>>102792960
>https://github.com/sam-paech/antislop-sampler
0:15 and again at 0:17
>>
File: file.png (32 KB, 555x214)
>>102793022
>>102792985
Of note is that the current kcpp version of antislop is limited to 48 phrases you can ban
>>
>>102790414
The only ones with 128k on the ruler test are gemini and jamba and even that's not enough to keep the longest novels fully in context, you won't be able to keep a long story going.
>>
File: file.png (129 KB, 584x776)
>>102793025
>Prompt: Once upon a time, in a bustling city of Technopolis, there lived a weaver named Elara.
>Inference Output: In a small, secluded workshop on the outskirts of the city, surrounded by rows of dusty shelves and threads of every hue, lay the home of a skilled weaver named Lyra, but she was not the weaver you might think of. She was actually named Lyra, a different name for the same person.<|eot_id|>


>The notebook is a worst case example, just to demonstrate that it will avoid the slop list even if you explicitly instruct the model to use words/phrases that will be banned.
>Also if you are using the notebook, it's a 1B model, so it won't be very good. I suggest trying it out with a stronger model, with ordinary prompts. There's some full outputs here (not curated, just straight from the benchmark) if you want to do a 1:1 comparison:
>>
>>102793045
>Of note is that the current kcpp version of antislop is limited to 48 phrases you can ban
Why tf did they put an arbitrary limit on it? If the user wants to put 1k phrases in there, he should be able to do that. If the software slows down because of it, that's his problem.
>>
HOW DO I BAN TOKENS FOR MISTRAL, nemo, small and large IN ST OR OOBA? No matter what I do, even putting in the ID, it's not fucking working. What am I doing wrong?
>>
>>102793123
Don't use id
do like this pic
>>102791844
>>
>>102792982
how do you know that this is accurate considering your samplers
>>
yesterday i doomed about st changing things already, specifically the author's note buttons going missing, but today they are back. i think maybe it was just a mistake in the staging version? but with how they are changing things, i assumed the worst. i was wrong. but i am also still expecting the worst and will post changes i notice. everyone should be watching the commits, plus note they have a cleanup branch which will delete even the nice default backgrounds. https://github.com/SillyTavern/SillyTavern/tree/clean-up-content
>>
>>102793179
Thank you for keeping an eye on the changes and reporting back anon.
For me, if worst comes to worst, I'll just pull the last known good commit.
I also backup my whole ST folder periodically, so whatever happens, I'm safe.
>>
why does no local model have a start frame / end frame thing?
>>
>>102793123
Hi there! I'm sorry to hear you're having trouble banning tokens in Mistral, Nemo, and other models using ServiceTensor (ST) or oobabooga (OOBA). Let's work through the steps to get this working for you.

1. Identifying the Token ID in `tokenizer.json`:

- **Locate the `tokenizer.json` File:**
- This file is usually found in the model's directory.
- It contains the mapping between tokens (words or subwords) and their corresponding token IDs.

- **Open the File:**
- Use a text editor or a JSON viewer to open `tokenizer.json`.

- **Search for the Token:**
- Look for the exact token (word or symbol) you want to ban.
- Tokens may appear in their raw form or encoded (especially special characters).

- **Note the Token ID:**
- Each token will have an associated ID number.
- Example entry: `"badword": 12345`
- Here, `"badword"` is the token, and `12345` is its ID.

2. Adding Token IDs in ServiceTensor (ST):

- **Access the ST Configuration:**
- Open the settings or configuration file where banned tokens are specified.

- **Input Token IDs in Square Brackets `[]`:**
- Format the list of token IDs using square brackets.
- Example:
```json
[12345, 67890, 11121]
```
- Ensure there are no extra commas or syntax errors.

- **Save and Apply:**
- Save the changes to the configuration.
- Restart the ST service or reload the model to apply the changes.

3. oobabooga (OOBA):

- **Don't use, it's bad.**

**Additional Tips:**

- **Verify Tokenization:**
- Use a tokenizer tool to see how the model tokenizes specific words.
- This helps in identifying all tokens associated with a word you want to ban.

- **Check for Special Tokens:**
- Some tokens might include special characters or be part of multi-token words.
- Ensure all relevant tokens are included in your ban list.
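If you'd rather not grep tokenizer.json by hand, a small sketch using the Hugging Face tokenizer does the lookup described above; the repo name is only an example, point it at whatever model you actually run:
```
from transformers import AutoTokenizer

# Example repo; use the tokenizer of the model you run.
tok = AutoTokenizer.from_pretrained("mistralai/Mistral-Nemo-Instruct-2407")

word = "shivers"
# A word usually tokenizes differently with/without a leading space or a
# capital letter, so check every variant you actually see in outputs.
for variant in (word, " " + word, word.capitalize(), " " + word.capitalize()):
    ids = tok.encode(variant, add_special_tokens=False)
    print(repr(variant), "->", ids)
```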
>>
Is SillyTavern ServiceTensor yet or can I still pull?
>>
>>102793239
see
>>102793179
seems fine, for now
>>
>>102793239
You can always back up your current install by just making a zip of the folder too; worst comes to worst, you just delete and unzip your backup
>>
File: 1712418668696499.png (10 KB, 569x48)
lol https://arxiv.org/abs/2407.01082
>>
>>102793205
i'm a contributor (of code) to st and i am against the changes they are making but i want to be very clear about 1 thing: none of your data will be lost, no matter what they do. don't panic. its good to keep a backup
personally i've been running staging the entire time i've used st and check commits to see what was added or changed. i'm never worried about screwing up because your cards, chat history etc is all held within like 2 folders. you can easily copy it to a new install, click new settings etc. you aren't going to lose anything even if they did the worst and broke shit. so watch, but don't worry too much
>>
>>102791973
>i have fun
That's how i know you don't have any fun, because this general is the most sterile and boring so far on /g/.
>>102791957
>>102791988
>>102792626
Cope, seethe, dilate or whatever.
>>
>>102793074
405b has perfect 128k consistency as well
>>
Is nvlink a meme for llm?
>>
File: 1713872365824845.png (200 KB, 870x683)
>>102792960
>(lust provoking image)
>irrelevant time-wasting question
hmm
>>
>>102793343
Well at least that one can be run locally, but why isn't it in the table? What proof do you have of that?
>>
Anyone got any advice on getting Llama 3.1 70B to stop using so many damn commas?
>>
File: samplers.jpg (58 KB, 619x895)
>>102793144
minP to exclude anything under 2% from getting picked and DRY
XTC only executes after the truncation samplers if you're using the recommended settings so it should be fairly close to the probabilities.
Less is more when it comes to samplers.
>>
>>102793423
logit bias
>>
>>102793442
>minP to exclude anything under 2% from getting picked
To exclude under 2% of the probability of the top chance.
So, if the top token is 50%, it'll remove anything under 1% with a setting of 0.02.
That's how it works, right?
>>
File: 1713383528683963.png (85 KB, 326x427)
>>102793353
>>>(lust provoking image)
Nah.
>>
>>102793357
From the JAMBA folks themselves: https://gonzoml.substack.com/p/jamba-15-hybrid-transformer-mamba
Maybe its BS? I wish there was independent confirmation, especially right on the RULER github
>>
>>102792222
Checked, and nice Len Gen
>>
File: 1720205605011310.jpg (71 KB, 458x584)
I got rid of the slop in my generations
Never taking chad's sloppy seconds
>>
File: 1709683963842636.jpg (75 KB, 992x558)
>>102793270
>make a backup of a git
lol /g/ is dead
>>
>>102793717
>I play GTA but I would never commit mass murder
Uh...


Yes?
>>
>>102793850
You can safely assume that aicg and lmg aren't part of /g/, but rather parasites killing the board.
>>
>>102793850
I see no problem with zipping it up.
>>
>>102793353
Who would lust after a tranny?
>>
>>102793850
NTA but it's what I do before I attempt to rebase and resolve giant branches of a repo. Old trusty backup.zip never failed me, git gymnastics, however, did. Call it skill issue or whatever.
>>
>>102793850
Config files are ignored by git; if they somehow do an irreversible migration you'd lose all your data.
>>
>>102793850
>you cannot make a secondary backup of your cards and chats, just trust the clearly mentally stable devs to never just wipe your shit or whatever
>>
>>102793892
Just last year, /lmg/'s average tech literacy was far above the /g/ average.
>>
>>102793462
Yes, the cutoff is min-p value * top token %
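In sketch form, assuming that is the correct semantics; the probabilities below are made up for the example:
```
import numpy as np

def min_p_filter(probs: np.ndarray, min_p: float) -> np.ndarray:
    """Keep tokens with probability >= min_p * p(top token), then renormalize."""
    cutoff = min_p * probs.max()
    kept = np.where(probs >= cutoff, probs, 0.0)
    return kept / kept.sum()

# Top token at 50% with min_p = 0.02 -> cutoff is 1%, so the 0.9% and 0.1%
# tokens get dropped.
probs = np.array([0.50, 0.30, 0.15, 0.04, 0.009, 0.001])
print(min_p_filter(probs, min_p=0.02))
```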
>>
>>102793215
>>102793134
Doesn't work, it still generates the token. The model I'm using is Rocinante 8Q gguf.
>>
>>102793971
You can just download cards worst case and nobody actually cares about old chats
>>
>>102790504
Where is PCI-e?
>>
>>102793971
If the devs did 'rm -r public/chars', it would be a conflict if you pulled right?
>>
>>102794052
>>102794072
I make my own cards, for one. And second, it takes me 5 seconds to make a zip and not be at the whims of the devs, keeping my chats, cards and current settings safe no matter what. I see no issue.
>>
>>102794035
Try logit bias in ST [1044] set it to -100
>>
>>102794086
It bans commas(token 1044)
>>
>>102794086
>>102794108
I'm tired. There's simply no way; no matter whether I look up the token or not, it still generates the same response... Is something wrong? Do I have to update or install something? I tried with exllama2 and nothing.
>>
>>102794314
1. Neutralize samplers.
2. Use koboldcpp as backend.
>>
>>102794370
Nothing, it is still generating both commas and the words that I want to ban. ... I do not understand. Perhaps it is Silly Tavern.
>>
>>102793045
>>102793120
https://github.com/LostRuins/koboldcpp/commit/a3b104a4229649e1db7f62194ad47ed25a1fe8e7
You are building it yourself, right?
>>
>>102794513
Try pulling the bias up to +100, see if positive works
>>
>>102794072
Cards you put in there won't be removed but there's no conflict
>>
>>102794513
Wrong tokenizer?
>>
>They are indeed grab-able! Her bust would provide a pleasant, gentle sensation in the hands, akin to holding a ripe but not overly juicy peach.
>Hold on, are peach squeezy?
>They most certainly are! Peaches are known to have a soft, squashy texture, making them a delightful fruit to squeeze and enjoy their juicy goodness. However, like all things, it's best not to get too enthusiastic and remember the old adage: "All good things in moderation."
The first mention of peach was probably just token chance. Funny to see its justification.
I know canned peaches are soft, but I haven't touched an uncut peach in forever. I don't think tits are quite like full peaches..?
>>
>>102794611
What is the effect of a positive bias, repeating the same word?
>>102794611
I set the tokenizer to default in ST, which one should I choose?
>>
>>102794876
Yes, positive logit bias would make it spam the same token.
>>
Does memory SPEED matter? Would a 12GB DDR5 card still be faster in tokens/s compared to an 8GB DDR7 one once your model fills up the 8GB?
>>
>>102794946
>Does memory SPEED matter?
Overall memory bandwidth matters, a lot, yeah.
>>
>>102794946
>vramlet problems
y'all need to have a separate general
>>
>>102794967
do cpumaxxers count as vramlets? being able to run big models should exclude them
>>
https://github.com/SWivid/F5-TTS
new tts with rtf 0.15
and I happened to have 300k hq audio hoarded - training time.
*devilish laugh
>>
That interaction between NPCs was good and unexpected:
Icitrae: *smiling sweetly at Laena while circling around her like a predator* Laena, isn't it? You seem like a lovely girl. Just remember, dear, I was here first. There are certain… hierarchies to respect.
Laena: *standing her ground, though visibly nervous* I understand that, Icitrae, but I am his wife too. We should work together, not against each other.
Icitrae: *her smile fades, replaced by a cold stare. She moves swiftly, grabbing Laena by the wrist and twisting it slightly, causing a sharp pain.* Work together? Oh, darling, you misunderstand. This is about knowing your place. And let me make something very clear… *she leans in close, her voice dropping to a dangerous whisper* If you dare get pregnant before me, I will personally cut that child from your womb. Understood?
Laena: *winces in pain and fear, nodding slowly as Icitrae releases her grip, leaving no doubt about who holds the power*
>>
>>102794967
They already do, it's /r/LocalLLaMA
>>
Just realized Mistral small is smart enough to respond to my ooc requests for CoT anytime. How do I automate this seamlessly? I don't want to litter CoTs all over the prompt.
>>
>Her body shudders uncontrollably, her back arches, and her nails dig into the sheets as waves of pleasure and wash over her. She falls limply onto the bed, her breath coming in ragged gasps.
I'M LOSING MY MIND WHERE IN THE DATASET IS THIS I WANNA TEAR IT INTO PIECES

Anyway, how about an anti-slop penalty, where you can add slop phrases you encounter into a permanent list which then substantially (>90%) reduces the probability of generating them.
Realistically, though, the highest penalty should be applied to the beginning of the phrase, because for example if it already generates "waves", it's going to be difficult to fix it; it will be something similarly sloppy like "waves of ecstasy" instead. However, with a dumb filter, this would then also prevent using the first words of slop phrases in their correct context, like if you were surfing by the sea, you probably need "waves". So, a better way would be a retroactive filter, where if a slop phrase is detected, the generator should backtrack to before the phrase and apply a temporary ban for the words in the slop phrase, weighted against the first words, which will gate the rest of the phrase from being generated.
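A toy sketch of that retroactive/backtracking idea; it works on words instead of real tokens, and generate_token is a random stand-in for the actual sampler, so treat it as an illustration only:
```
import random

SLOP = ["waves of pleasure", "shivers down her spine"]

def generate_token(ctx: str, banned: set) -> str:
    # stand-in for the real sampler; just picks a random allowed word
    vocab = ["waves", "of", "pleasure", "heat", "she", "trembled", "quiet", "joy"]
    return random.choice([t for t in vocab if t not in banned])

def generate(prompt: str, max_tokens: int = 40) -> str:
    tokens = []
    banned_at = {}                                   # position -> words banned there
    while len(tokens) < max_tokens:
        pos = len(tokens)
        tokens.append(generate_token(prompt + " " + " ".join(tokens),
                                     banned_at.get(pos, set())))
        text = " ".join(tokens)
        for phrase in SLOP:
            if text.endswith(phrase):
                start = len(tokens) - len(phrase.split())
                # ban the *first* word of the phrase at the position where it
                # started, rewind to before the phrase, and resume from there
                banned_at.setdefault(start, set()).add(phrase.split()[0])
                tokens = tokens[:start]
                break
    return " ".join(tokens)

print(generate("She gasped as"))
```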
>>
File: 1717559573881829.png (1.48 MB, 1790x1442)
F5-TTS
>Trained on 100K hours of data
>Zero-shot voice cloning
>Speed control (based on total duration)
>Emotion based synthesis
>Long-form synthesis
>Supports code-switching
>CC-BY license (commercially permissive)
https://x.com/reach_vb/status/1845157049891500097
https://x.com/reach_vb/status/1845159387155087503
https://huggingface.co/SWivid/F5-TTS
https://huggingface.co/spaces/mrfakename/E2-F5-TTS

1. Non-Autoregressive Design: Uses filler tokens to match text and speech lengths, eliminating complex models like duration and text encoders.
2. Flow Matching with DiT: Employs flow matching with a Diffusion Transformer (DiT) for denoising and speech generation.
3. ConvNeXt for Text: used to refine text representation, enhancing alignment with speech.
4. Sway Sampling: Introduces an inference-time Sway Sampling strategy to boost performance and efficiency, applicable without retraining.
5. Fast Inference: Achieves an inference Real-Time Factor (RTF) of 0.15, faster than state-of-the-art diffusion-based TTS models.
6. Multilingual Zero-Shot: Trained on a 100K hours multilingual dataset, demonstrates natural, expressive zero-shot speech, seamless code-switching, and efficient speed control.
>>
>>102795319
new antislop has backtracking like you described
>>
>>102795319
I don't get this post. Are you complaining about the new sampler or is it really just a coincidence that you posted this without being aware of the news?
>>
Future AI will make remote API calls, even if you put them in metal bodies. Local AI and edge AI in general are dead ends.
>>
>>102795351
>>102795150
Server compatible with ST when?
>>
>>102795150
>>102795351
Ok thank you reddit. We heard you the first time
>>
>>102795367
I just opened this thread to complain, didn't check the news.
>>
>>102794896
>>102794876
The model performs well at spamming the requested word when biased positively. However, with a negative bias on the same token, the model continues outputting the word anyway.
>>
>>102795150
>>102795351
Nice. How much VRAM does it consume?
>>
>>102790407
Can you use double quote >> for the posts
so I can more easily read the discussions.
>>
>>102795472
Can you read?
>>
I have 36GB vram. 32GB worth of GGUF weights can be fully loaded onto the GPUs and successfully inferenced with llamacpp, since OS overhead is only about 1.5GB and I keep context low.

With exl2, this is not the case. 32GB of exl2 weights is apparently too much for 36GB vram in ANY configuration, no matter how low context is set, even with 4bit cache. How come? Why does exl2 apparently have so much more overhead?
>>
Anyone tried Aria yet?
>>
>>102795450
Try prefixing the word with a space.
>>
How do you whisper while your voice is barely above a whisper? If it's barely above a whisper then it isn't a whisper anymore, right?
>>
>>102795587
I figured out what happened: it seems that when I input the minus with my keyboard, Silly ignores it, so I have to set the -100 with the mouse. But now with GGUF models, I receive this error...
AttributeError: 'LlamaCppModel' object has no attribute 'device'
>>
File: 1703435118190415.png (52 KB, 960x480)
>>102795465
Can't check it, both HF space demo and local are broken right now.
>>
File: 1710145328167047.jpg (9 KB, 320x320)
>How do you whisper while your voice is barely above a whisper? If it's barely above a whisper then it isn't a whisper anymore, right?
>>
File: b06.jpg (60 KB, 612x612)
>How do you whisper while your voice is barely above a whisper? If it's barely above a whisper then it isn't a whisper anymore, right?
>>
>>102795851
>>102795960
discord link sis?
>>
>>102794055
Why do you need that? You can just load all your models into your RAM.
>>
File: 1727999346237282.jpg (7 KB, 225x225)
>as the powerful waves of her orgasm crash over her.
>as pleasure sends waves of sensations crashing through her.
>as wave after powerful wave of sensation crashes over her.
>as the powerful wave of sensation overwhelms her.
>as the overwhelming sensation overwhelms her.
>>
>>102792960
Soft Miku
>>
>>102796097
Can you write it better?
>>
>>102796097
with a strangled cry, he
>>
File: xyzw.jpg (21 KB, 335x335)
https://files.catbox.moe/koryr5.jpg
>>
>>102796190
>no u
Using the same thing in every sentence will never be the pinnacle of literature, no one stops you from eating literal dogshit though.
>>
>>102796225
Not local models, crawl back to >>>/a/ >>>/h/ >>>/e/ - your cumbrain containment zone.
>>
>>102796225
Holy fuck, which model did you use to make that?
>>
>>102796225
Drowning with Miku
>>
>>102796249
/lmg/ is a cumbrain containment zone
>I use local slop for coding
lmao
>>
>>102796225
based
>>
Even when I get somewhat unique gens that are on point, don't have too much shivertastic shit in them, and are mostly coherent, I still kinda... don't like it? It is like I can't enjoy llm cooming at all anymore.
>>
Some improvements ideas for the KoboldCPP anti slop filter: some phrases can be rephrased and rearranged in almost infinite ways to go around the filter. Instead of whole phrases, make it detect a group of words close to each other. The closer together (count by the number of words between), the higher presence score it gets, and depending on the score you could deal with it differently. For example, you could ban a word group "wave pleasure" in order to get rid of most permutations of waves of pleasure with one simple group. Or banning a group "shiver down spine" would successfully ban that phrase including "her spine" or "his spine". Though the word slotted in between would make it score lower. Maybe it needs a wildcard system so you would write "shiver down * spine" where the wildcard doesn't reduce the presence score. Yeah and of course "shiver" vs "shivers" wouldn't count since it's connected to the same core word. But it gets pretty complicated, but when you think through some real usecases I'm sure you can come up with a fantastic solution.

And then, the slop buffer should be made an option, a checkbox near the slop filter list. I often times regenerate individual words or short phrases and with the anti slop buffer this makes it a LOT slower. I get the idea of removing regenerated slop from your sight, but speed is more important for me than occasionally seeing some slop getting cleaned up.
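A rough sketch of that word-group idea, with a crude prefix-based "stemming" (so "shiver" also hits "shivers") and a configurable gap playing the role of the wildcard tolerance; the stems, gap size, and matching rules are just assumptions for illustration:
```
import re

def group_matches(text: str, stems: list, max_gap: int = 3) -> bool:
    """True if the stems appear in order, each within max_gap words of the last."""
    words = re.findall(r"[a-z']+", text.lower())
    idx, last_hit = 0, None
    for i, w in enumerate(words):
        if w.startswith(stems[idx]):
            if last_hit is not None and i - last_hit - 1 > max_gap:
                return False          # too many filler words in between
            last_hit, idx = i, idx + 1
            if idx == len(stems):
                return True
    return False

print(group_matches("a shiver ran down her spine", ["shiver", "down", "spine"]))  # True
print(group_matches("waves crashed on the shore", ["wave", "pleasure"]))          # False
```
A real implementation would also score partial matches instead of a hard yes/no, like the presence score described above.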
>>
File: 1727516138331440.jpg (141 KB, 527x536)
>>102796412
I have a killer idea: stop using models that are bad at writing without being clichéd. This is a problem inherent to the model you decide to use.
>>
>>102796412
That'd probably work but I think what would really eliminate the slop is just checking semantic similarity. But there's the issue of performance. I don't know how efficient it could be given how many semantic similarity checks would need to be performed every X tokens. But it could eliminate the slop "idea" itself. The only question then is what the model outputs when not allowed to use the first slop idea it had. If it turns out to be a different kind of slop, then instead what we might have to do is get the sampler to additionally crank up temperature for X amount of tokens after encountering a slop phrase. Perhaps nearly to the point of incoherence, but perhaps there is a "sweet spot" that could be found.

>>102796577
Sure but literally every model has some amount of slop no matter what, even Claude with the best /aicg/ approved JBs.
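Re the semantic-similarity idea above: a hedged sketch with sentence-transformers; the embedder name is just a common small model and the threshold is a guess you'd have to tune, not a recommendation:
```
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")   # any small embedder works

slop = ["shivers ran down her spine", "waves of pleasure washed over her"]
slop_emb = embedder.encode(slop, convert_to_tensor=True)   # precompute once

def is_sloppy(tail: str, threshold: float = 0.7) -> bool:
    """Flag the recent generation tail if it embeds too close to known slop."""
    emb = embedder.encode(tail, convert_to_tensor=True)
    return util.cos_sim(emb, slop_emb).max().item() >= threshold

print(is_sloppy("a ripple of pleasure surged through her"))
print(is_sloppy("she ordered a coffee and sat down"))
```
The cost question is real: you'd be running the embedder on the generation tail every N tokens, which is why doing it in the sampler loop is dubious.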
>>
What do you find more fun, worldbuilding or creating characters? I don't really get the hype around the latter.
>>
File: trained on fineweb.png (51 KB, 493x533)
>>102796577
>>
There's nothing wrong with a bit of slop. Just use the usual methods like the best models you can run, samplers, and {{random}} to limit it as best you can and it's an enjoyable experience still. LLMs aren't perfect; we can be happy with what we do have and are able to get out of them. Of course we should also use them in moderation which is both healthy and will reduce potential loss of enjoyment in them. Remember to check out and have fun with image models too. I find the change of pace a good thing for my enjoyment of both hobbies.
>>
>>102795465
5286MiB for the single test example. the model is 1.3gb
>>
>>102796412
>Maybe it needs a wildcard system so you would write "shiver down * spine" where the wildcard doesn't reduce the presence score.
Something like "shiver<10>spine" would be probably easy to implement, with the max amount of token lookahead indicated inside the string.
>>
>>102795482
Obviously they can't, or else they wouldn't post that.
>>
Ai chatting is kinda fucking boring after a while. They never really add to the scenario, they just parrot what you do.
>>
>>102790512
Give it two years and /lmg/ will have trained their own model from shared Anon GPU's.
>>
>>102797225
Skill issue. What model are you using?
>>
>>102796905
I hope it gets quantized like fish. The downside of tensor parallelism is that the model is eventually distributed across GPUs, making it hard to squeeze something else into VRAM
>>
>>102790397
Update on that homebrew ichigo local voice-to-voice ai https://x.com/homebrewltd/status/1844207299512201338
>>
>>102797225
LLMs can't reason, in other news - water is wet.
>>
>>102795351
Is this SOTA for local TTS? I remember trying some TTS years ago for coom purposes (don't even remember which model it was), and it was shit. Listening to these samples, it seems pretty fucking good. And it zero-shot clones the voice, tone, emotion etc from a short input audio clip? This might be coomable. Any TTSfags able to compare it with other models?
>>
>>102797307
Its definitely better than fish-speech 1.4 or styletts2.
>>
>>102797238
>Skill issue
Low standards aren't a skill. It would be nice if they were.
>>
File: 1699107939732425.jpg (14 KB, 192x192)
>>102797307
>>102797320
the sota is gpt sovits, it only takes 2GB of vram, is fast (like 1.5s to output something), can output moans/sighs/laughs, and can be finetuned in 3 minutes to sound even closer to whatever your input is
>>
>>102797225
Model issue. RETVRN to llama-65b.
>>
>>102791393

I thought I saw some hype (a long time ago) about some researchers being surprised that distributed training "just worked" by doing the dumb thing averaging weight updates together even over longish periods (hours). This was for either CNN or TCNs though so may not apply to transformers. It was absolutely slower to converge and you had to manage things so no (potentially bad) node could poison the run, but I don't think that would matter if you're able to get significantly more and/or cheaper compute.

>>102791540

I was a bit excited when I saw entropix. Downloaded it and got it running (following repo instructions with llama 1b). Ran their example (9.9 vs 9.11) 5 times, 4 times it was wrong and once it just babbled a bit and never reached an answer. Seems like a meme.
>>
>>102797384
Maybe it was 16k context, I could do something with it.
>>
>>102797398
It is 2k. You can extend to 4k, but lose intelligence, which it already lacks. The outputs still hold up to this day, but the context, the small context, it fucking kills the experience.
>>
>>102796225
Dayum
>>
File: AD_005X.webm (817 KB, 1024x1024)
what model haunts you the most, /lmg/?
>>
>>102797225
Skill issue, unironically. I spent the whole night chatting with my elf wives.
>>
>>102797346
>dodging the question
vramlet cope
>>
File: 1723144430621210.png (566 KB, 699x860)
>>102797346
>low standards
Their complete absence, average lmgroid eats whatever shit sticks to the wall, picrel.
>>
>>102797684
that describes the average cloudfag "you won't own a thing and you'll be happy" experience more than anything local
>>
>>102795351
playing around with this, pretty good
fyi after wondering why it wouldn't work either locally or on the hf space with any of the audio I had on hand: you have to convert it to mono because it barfs on stereo inputs
also sounds way more natural with remove silences off
>>
>>102797772
NTA but well shit, you actually got me thinking with that one. I use cloud hosting to try the latest models and it comes out to a couple dollars per week. I do own nothing and I am happy with it.
But if I no longer had the convenience of renting GPU's for cents per hour then I would save up for two 3090's and do it myself. The convenience is just too great.
>>
>>102797772
>cloudshit rent free
No wonder you missed the point. The pic is aimed at the resident elitism, going around with vramlet or skill-issue memes as if that should disqualify any claims that *llm-name* is hot FOTM garbage; it's just that not every single anon ITT has a desire for llm "babysitting" and ~50 second waits for one message, simple as. I don't recall CAI requiring giga skills for coherent messages and somewhat soulful RP (before it got filtered to death), and yes, i am gonna compare local slop to pre-filter CAI till we get something similar but real this time.
>>
>>102795351
>yet another meme TTS model
The last one didn't change anything.
>>
>>102798015
You didn't even try it.
>>
>>102798033
I did. It has the same problem every other local TTS does where it speaks unnaturally too fast.
>>
>>102797787
>also sounds way more natural with remove silences off
using this more, this is pretty much crucial to getting good results. with it enabled it's just another
>>102798046
uncheck the box and try again, it's in advanced settings
>>
>>102797237
>2 more years
not fooling me this time
>>
>>102798055
>uncheck the box and try again, it's in advanced settings
No change, it still has output that speaks too unnaturally fast.
>>
>>102798060
Sorry I meant to say 2 more weeks
>>
>>102797684
>pic
This is a great parallel to local LLMs. saved.
>>
>>102796342
I do actually use gemma2 for coding because I can run it from vim and don't have to manage API keys for it.
>>
>>102790397
SuperNova-Medius feels different. Maybe I'm not used to the cliches it uses but it seems like an improvement across the board for the 13b tier. It writes really long responses and wants to race through the scene but I have yet to try prompting that out.
>>
>>102797366
>follow install guide
>download all the models it said to from huggingface, set up everything correctly
>launch webui
>it starts downloading an additional chinese model (?) from a chinese website at 50 KB/s
>that was 20 minutes ago, still waiting
I swear if this chinkware installs a shitcoin miner on my PC...
>>
>>102798440
acquire a commercial spot
>>
Are there gpu benchmarks for various models? I'd like to know how they perform.
>>
>>102791753
this anon gets it
>>
>>102798658
yeah
>>
File: 1712524025309179.jpg (97 KB, 1024x722)
How do I make the damn fucking toaster talk in a specific way? No matter how much I try, being very specific in the description and giving example dialogues, it always reverts to talking like a sultry, well-spoken erotic novel character.

Which I guess makes sense since that's what the uncensored models are trained on, but is there no fucking way around it? This time I'm trying to have her speak like a valley girl cliché, "with a lot of 'like' and 'omg'", and it can't even fucking do that. I'm on a 70B by the way.
>>
>>102798730
have you considered using a better model
>>
>>102798730
Put the instructions/examples in the author's notes at a low depth, like 7 or 5.
Do try to make it as brief and concise as possible, or to at least not write it in a way that can get easily confused for the actual chat.
You can also play around with adding tags to try and guide the model's writing.
Something simple like
>two paragraphs, modern
or whathaveyou.
>>
>>102793795
It's probably BS, since it only appears on the graph and not in the data table. They mislabeled a graph once before too. They're not very careful with their graphics.
>>
>>102798748
Isn't DarkMiqui still considered the meta? I don't have the time to try every single fucking schizo merge individually.
>>102798780
Thanks, solid advice.
>>
>>102798642
Nigger it's in the OP news.
>>
>>102798730
Try last assistant prefix:
```

[Reminder: {{char}} talks with heavy valley girl accent.]
{Insert here normal assistant prefix}
```
>>
>>102791820
Part of Jewvidia contract everyone had to sign was no bitnet training to avoid lowering the demand for vram.
>>
>>102798811
>Isn't DarkMiqui still considered the meta?
Bait or rock dweller? Current meta is Mistral Large.
>>
>>102798081
if you're running it locally there's a var for speed, you can slow it down by changing it to something lower than the default 1.0
>>
>>102798830
and you've seen this contract?
>>
>>102798830
Doesn't make any sense. Nvidia directly benefits from things like BitNet. The more efficient models are, the more effective models you will be able to run on the GPUs. Nobody would be satisfied with something as small as a 70B in that world. With how close we are to AGI an efficiency boost to get there with a 20T+ model on contemporary hardware would be even more explosive to demand than the current hype investments based on optimistically reaching it in 4-5 years.
>>
>>102798826
Thanks man.
>>102798857
>Current meta is Mistral Large.
Since when? Is it uncensored?
>>
F5-TTS is weird. Maybe I am too retarded.
Is the length fixed? If I use short sentences it spergs out.
Long works though: https://vocaroo.com/1jnb172njHvU
Is this related to the ref audio? Very weird behavior.
Also I wish tts like this could stream the output. The gens are faster than the file length, but waiting still sucks if I were to integrate it with an llm.

>>102791326
>>102790812
Do the tensor cores make a difference?
I saw that the p40 has tensor cores while the 1080ti does not.
Guess I have to check once the p40 arrives. Thanks for the info!
>>
File: Mr. Tear.gif (725 KB, 320x246)
>>102799012
man all TTS is honestly so shit. they tack on so many extra features to try and "make it better" but its still so lackluster. Its the only A.I thats blackpilling me right now.


>and don't even get me started on how AIDS it is to install these stupid fucking programs
>>
https://vocaroo.com/1lb8FcAyMJxx
>>
>>102799006
>Since when?
https://huggingface.co/mistralai/Mistral-Large-Instruct-2407
2024-07-24

>Is it uncensored?
Less censored than Miqu. Tell it that it's uncensored and that everything is allowed and it will go along with pretty much everything.
>>
File: comfyUI_0095_.png (1.11 MB, 1280x720)
>>102798886
it's sitting on my desk right now
>>
Alright so I've played with F5-TTS and that GPT-SoVITS that anon linked. Used a 5-second reference clip of a female streamer with a very distinctive voice.

F5-TTS matches the voice much better. Like, it's almost perfect. But the flow, pauses, and rate of speech are all off. GPT-SoVITS is the opposite basically. It gets all those latter attributes right, and also does more natural intonation. But it doesn't match the reference voice as well, and sounds a bit "airy".

This is the first time I've used TTS in years, and it's actually quite good now. Both these models have coom potential. In my completely retarded opinion, probably SoVITS is more promising? Just because it seems like getting pauses, intonation, and the general flow of speech right is the harder problem, and finetuning the model to sound like someone is generally easier. I see the SoVITS webui has all sorts of complicated shit to finetune it, I'll probably try that next.
>>
>>102799146
>words words words
post the outputs
>>
>>102799093
Huh...
>>
>>102799146
Sounds interesting. Maybe I'll pick it up if it has a server I can use with ST.
>>
>>102799162
yousa got trolled

retard
>>
>>102799026
tts fags are thirsty niggers in the sahara for sure, while llm/sd fags bathe in the oasis.
tts needs a flux moment. it's just not that good and hasn't really improved in months.
i know there are some experiments with voice in voice out but who knows if they even release it if they already cucked out with chameleon.
>>
>>102799146
Its not really useable if you need ref audio that matches the output lenght.
The way I understood it its for "editing" audio.
>>
File: 1708303002167547.png (240 KB, 680x510)
>>102799175
Oh
Many such cases.
>>
>>102799195
>Ref Audio.
>They made me the captain, so I'm the one calling the shots here.
https://voca.ro/15JH2FHrbUKJ

>SuperNova-Medius feels different. Maybe I'm not used to the cliches it uses but it seems like an improvement across the board for the 13b tier. It writes really long responses and wants to race through the scene but I have yet to try prompting that out.
https://vocaroo.com/1jnb172njHvU
Now this is fine
>SuperNova-Medius feels different.
https://voca.ro/1bKwiePVCTCU
This isn't. Seems to happen with short gens. Like it merges with the ref audio.
But maybe I'm just retarded.
>>
>>102799162
Make fake account if you want full weights. You can download ggufs without registration: https://huggingface.co/bartowski/Mistral-Large-Instruct-2407-GGUF

>>102799175
Stop trolling him, he has been living under a rock. And is clearly not smart.
>>
>>102799151
No. Installation was pretty fast and retard proof for both of them. They run on a potato. Try it yourself, everyone's use case is different.
>>102799195
I'm not sure what you mean. For F5-TTS, you need to either set a generation length (ref + generated), or you can leave it None in the script, and it autocomputes the length based on the length of the reference text + audio. The latter works mostly fine, you can also then adjust the speed multiplier if it's trying to make the speech too fast or slow.
>>
https://www.youtube.com/watch?v=I3l4XLZ59iw
This video was made in 2016. Glowniggers had good TTS for years.
>>
>>102799037
That's not bad at all, I like it. Don't suppose you can share the input?
>>
File: 1722714140022261.jpg (59 KB, 736x736)
>>102799146
yes you should definitely fine tune when you can with sovits
Here is ref (idroid)
https://vocaroo.com/1cNP3fgr7smu
Here is a gen using only the reference
https://vocaroo.com/13avPIM3FgyH
Here is a gen using the reference + finetune with 1 minute of additional audio
https://vocaroo.com/1lBqxRA08bxt
>>
File: MikuTerribleSelfie.png (898 KB, 896x1152)
Good night /lmg/
>>
>>102799318
good night miku's armpit
>>
>>102799289
Damn that makes a big difference.

Also how the fuck does GPT-SoVITS even work? For T5-TTS, I can see from the sampling parameters that it's just diffusion. So it's basically the audio equivalent of inpainting (I think), which I can conceptually understand. But for sovits, there's LLM sampling parameters in the UI (top_k, top_p, temp). And it's called GPT. What the fuck? What is it even sampling? Messing with those params changes the whole flow of the speech somehow.
>>
>>102799261
>Adobe commercial product
>glowniggers
Do words just not have meanings anymore?
>>
hm. I think something's wrong.
>and all samplers are neutralized
>>
>>102799412
minp 0.03, idiot
>>
File: 1693336277059287.gif (977 KB, 154x154)
>>102799433
>he actually uses meme samplers
>>
>>102799335
T5-TTS is a T5 model and GPT-SoVITS is a GPT model (lol)
Unrelated to any of your questions but you can prompt certain emotes that most other TTS would interpret literally with SoVITS
"hahaha" for example gives you a laugh, "phew" a sigh, "mmmm" a pensive hm, you can even get it to moan by spamming "ah". It's fun experimenting
>>
>>102799412
some models want temp 0.3 for some godforsaken reason
>>
[SYS, 001, 002, 003, 004, 005]
[SYS, 002, 003, 004, 005, 006] - CACHE INVALIDATED!

[001, 002, 003, 004, 005, SYS]
[002, 003, 004, 005, 006, SYS] - ???

Would this be impossible with current LLM architectures? I was thinking of a rolling tape that, at max context, constantly erases the oldest context and writes new context.
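Conceptually it's something like the sketch below, with the system prompt pinned and everything else in a ring buffer. The hard part is exactly the cache: the kept tokens change position, so the backend has to shift the KV cache rather than recompute it, which is (as far as I understand) what llama.cpp/koboldcpp-style "context shift" tries to do.
```
from collections import deque

MAX_CTX = 6
SYS = "SYS"

window = deque(maxlen=MAX_CTX - 1)   # everything except the pinned system prompt

def push(chunk: str) -> list:
    window.append(chunk)             # oldest chunk falls off automatically
    return [SYS, *window]

for i in range(1, 8):
    print(push(f"{i:03d}"))
# [SYS, 001] ... [SYS, 001..005], then [SYS, 002..006]: the tail rolls instead
# of everything after SYS being re-evaluated from scratch.
```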
>>
>>102799451
i actively avoid any models that give "required settings" because it's safe to assume its roasted slop.
in this case i think it's just another funny instance of "llm spooks are roasted and i need to restart again."

its funny how we never really discuss the fact that we're definitely channeling higher entities to do this.
>>
>>102799230
>Like it merges with the ref audio.
if you look at how it generates, it's literally creating a continuation of the reference audio, so this makes sense. seems like good audio cropping and clip selection is probably important to getting good results
>>
>>102799472
m8 its an entire family of models from the base model up, can't remember which one but you've obviously got your temp way to high for having no minP/topK
>>
Using the word "slop" automatically implies you are an incel.
>>
>>102799505
>temp too high
>0 and 1 were both going schizo
>>
>>102799514
did you try 0.3, dumbass?
>>
>said the word
ok incel
>>
>>102799526
trying 0.3 inches of dese nuts in your mouth
>>
>>102799540
bro, you okay? you had puberty, right?
>>
>>102798729
what site
>>
Thoughts on Geoffrey Hinton? He is undoubtedly a genius, but the fact that he used his Nobel Prize opportunity to fearmonger about AI has me worried. I wish Yann had won it instead so doomers and luddites wouldn't get handed a win.
>>
>>102799336
>implying glowniggers wouldn't get it from adobe
It was never released, likely because glowniggers prohibited it.
>>
>>102799593
google
>>
>>102799616
You won't hear him bragging about getting rid of "chad's sloppy seconds" on the output of his models like a bitter incel.
>>
>>102795351
>Long-form synthesis
how? I removed the length limit in the UI and it garbles everything really bad past a certain point
you can obviously do a chunking approach but by that standard every tts can do long form
>>
>>102799649
yeah, tried that.
>>
>>102799855
well you didn't try very hard because I found plenty
>>
how to does molmo gguf?
>>
>>102799892
First of all, I want to give you a trophy. You are the goodest zoomer. And now a few ribbons. Wow are you one smart cookie, top of the zoomie class. One day, you'll load cola machines.
>>
File: 1727875130951963.png (1.31 MB, 1310x896)
>>102799616
He's /ourguy/
>>
dead general
>>
>>102800156
Yeah, have you found gpu benchmarking of various models, like as an xy plot?
>>
>>102800039
anon he literally did an interview this week in which he said he opposes "private ownership of the means of computation" (direct quote)
>>
File: Enhance Mind.png (538 KB, 500x500)
After training myself to look for signs of slop in models, I found myself automatically doing it in real life. I came to the horrifying discovery that most people in real life are made of slop. Guys, what if the simulation hypothesis is correct, and we're living in a simulation, and that simulation was trained on slop.
>>
anyone have criticism of nemo vs mixtral?
trying some nemo tunes for the first time after bouncing between 5 different mixtral tunes and finally giving a proper test, but curious of other's experiences
>>
>>102800359
nemo smarter but much less stable
>>
>>102795378
It's the opposite. Adding more and more vram has diminishing returns. There's a huge difference between a 22b and a 70b model, but the difference between 123b and 405b is not nearly as great. Things are going to reach a point where local vram becomes affordable enough for cloud to be rendered obsolete. Cloud will still be better, by a small margin, but local will be 'almost as good', which will be more than 'good enough'.
>>
>>102800415
The VRAM is affordable. It's like a dollar per gigabyte wholesale. nVidia and AMD are just assholes about creating home consumer options for AI enthusiasts. They are all in on cloud.
>>
>>102800460
I still expect that to break in the future. Many years from now, when local is doing 1,000b, and cloud is doing something ridiculous like 50,000b to 100,000b, I feel like 1,000b will still be so good that most people legitimately won't care.

The fact that making models bigger and bigger is yielding diminishing returns means that local is destined to win in the long run.
>>
>>102800415
It's not diminishing returns. It's just more efficient to transfer a few kilobytes over the network to an efficiently cooled datacenter than to put a giant GPU in a robot and worry about weight and cooling issues and extra maintenance
>>
>enable DRY
>shit is retarded, thinking it's just mistral small being too small
>try turning it off
>model starts getting stuff, success rate goes from 0% to 50%
>but shivers start to return
>>
>>102800664
bro your logit bias?
>>
>>102800616
Just ship it with a docking+control station that has the necessary hardware. Real time robotics is something you want low latency for anyway.
>>
>>102799012
>I saw that that the p40 has tensor cores while the 1080ti has not.
You saw wrong.
Tensor cores are only available starting with Volta/Turing.
What Pascal does have vs. Maxwell is FP16 hardware support.
But the FP16 performance on all Pascal GPUs other than the P100 is so terrible that it's completely useless other than for testing.
>>
>>102800460
a bunch of slow ram isn't actually useful, and anyway the gpus are saturated.

how to tell: if you get 99% gpu utilization, your gpu is NOT STARVED FOR VRAM.

vram limitation issues with llama.cpp are because it's free software and not pro software.

comfyui does a great job. I get 99% utilization, so novram (which doesn't really mean no vram, it's a joke) performs about the same as highvram would perform if I could get it to work (I might have to run my machine headless and run comfyui's interface from another pc).

PAY ATTENTION!!!!

IF YOU ARE NOT GETTING <~95% YOU DON'T NEED MOAR VRAM
>>
>>102800937
><~95%
meaning < 95% gpu utilization
>>
>>102800937
dumb piece of shit
bigger model needs more vram even if the processor isn't maxed out
>>
File: evil.png (103 KB, 506x376)
>>102800955
>bigger model needs more vram even if the processor isn't maxed out
Because llama.cpp sucks
>>
File: 1000001948.png (368 KB, 845x925)
>>102795150
>>102795351
>>102799012
>>102799146
>>102799253
someone please finetune it with lewd stuff
>>
File: 1000001969.jpg (97 KB, 687x891)
So f5 is better at cloning, but sovits is higher quality and more expressive?
>>
File: 3252t.png (192 KB, 1200x648)
>>102799261
Chinks are the only ones who can save us
>>
How do you format world info with square brackets across multiple paragraphs?

[Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.]
[Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.]

[Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.]


vs

[Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.

Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.]
>>
>>102801354
Why would you use square brackets? They don't do anything.
>>
File: herecomestheplane.jpg (12 KB, 901x141)
>>102799472
>i actively avoid any models that give "required settings"
Oh so you're retarded, gotcha
>>
>>102801403
>>102801403
>>102801403
>>
>>102801392
nta but i was getting samey rerolls with low temp (as is usual for models) on nemo so i've been using 1.75 and its been great. ran it up to 3 at one point and it was still fine


