/g/ - Technology

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102876583 & >>102862101

►News
>(10/18) New research, models, and datasets from Meta FAIR: https://ai.meta.com/blog/fair-news-segment-anything-2-1-meta-spirit-lm-layer-skip-salsa-lingua
>(10/18) bitnet.cpp: Official inference framework for 1-bit LLMs: https://github.com/microsoft/BitNet
>(10/18) DeepSeek releases Janus-1.3B with multimodal understanding and generation: https://hf.co/deepseek-ai/Janus-1.3B
>(10/16) Ministral 8B instruct model released: https://mistral.ai/news/ministraux
>(10/15) PLaMo-100B: English and Japanese base model: https://hf.co/pfnet/plamo-100b

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
►Recent Highlights from the Previous Thread: >>102876583

--Paper: Nvidia's nGPT generates interest in 8B model:
>102881926 >102882121
--INTELLECT-1 decentralized training and model debate:
>102876754 >102876770 >102876814 >102879968 >102879988 >102880010 >102879987 >102880004 >102880020 >102880051 >102880055 >102880072
--IQ4_XS has higher t/s than IQ3_M due to more efficient 4-bit data unpacking:
>102884988 >102885067
--Recommendations for programming-competent models and running them on a 4090 GPU:
>102880097 >102880142 >102880167 >102880522 >102881070
--Decentralized training and Intellect-1 skepticism:
>102882434 >102882877
--Confusion about merging split models, recommendations for alternative backends and 3B model:
>102886485 >102886507 >102886525 >102886550 >102886557 >102886601 >102886613 >102886677 >102886610
--Techniques for improving model output by limiting and trimming tokens:
>102887278 >102887332 >102887336 >102887372
--Nala test shows 3B bitnet model performance:
>102882649
--MoE model with high number of small experts proposed for local use:
>102876808 >102876851 >102876949
--Meta's decline since llama1, importance of competition, and working around censorship in models:
>102882633 >102882709 >102882979 >102882991 >102883066
--Grok vs Llama 3.1 70B comparison:
>102877049 >102877118 >102877154
--Discussion of human-generated content in leaked dataset:
>102885171 >102885197 >102885211 >102885218
--Discussion of Mistral and Nemotron models, with focus on Nemotron's RP understanding and formatting quirks:
>102885034 >102887566 >102887576 >102887579 >102887748 >102887632 >102887800 >102887884 >102887909 >102887941 >102887929
--Miku (free space):
>102877971

►Recent Highlight Posts from the Previous Thread: >>102876588

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
Layerskip for useful models when?
>>
WHAT'S LAYERSKIP?
>>
File: 1663782156082337.jpg (71 KB, 851x1024)
>96 GB of VRAM on tap
>just coomed to Ministral
It's 8B-stupid, but clearly there was some hand-selected, high-quality smut in its instruct-tuning corpus that wasn't used on any other Mistral instruct model.
>>
What's the matching llava thingy for 12b?
>>
>>102888728
<disruptive tech> for useful models when?
Never
>>
>>102888747
what are you running it with?
>>
>>102888798
I blame Meta and their lack of risk tolerance
>>
>>102888820
They gave one guy 40 million GPU hours to conclude that more and better data make betterer models, then finally proceeded to aggressively filter all their training data for "quality" based on that study. And so we got llama3.
>>
>>102888694
Inefficient yard work with Rin
>>
>>102888814
ooba.
I use this hf-ification of it
https://huggingface.co/prince-canuma/Ministral-8B-Instruct-2410-HF
It's the earliest one and someone might have done a better job since then.
SWA doesn't work and the weights are in fp32 but you can load it in 16/8/4bit just fine.
There's other gguf versions out too that might have fixed the issues some of the earlier gguf quants of it had.
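For reference, loading that HF conversion in 4-bit with transformers + bitsandbytes looks roughly like this (assumes a CUDA GPU with bitsandbytes installed; it's more or less what ooba's transformers loader does under the hood):
```python
# Rough sketch: load the fp32 HF weights quantized on the fly.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

repo = "prince-canuma/Ministral-8B-Instruct-2410-HF"

bnb = BitsAndBytesConfig(
    load_in_4bit=True,                      # or load_in_8bit=True for 8-bit
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 even though weights ship as fp32
)
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, quantization_config=bnb, device_map="auto")
```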
>>
>>102888820
This isn't really a new model that has to be trained from scratch though. The more likely reason in this specific case is that it just doesn't work that well in practice even if the benchmarks said it was good and fine. And in the first place, they already have tiny models released that can be used for speculative decoding.
>>
>>102888798
Layerskip is not disruptive at all, nor is the effort that has to go into making a layerskip model.
>>
>>102888915
It's disruptive for Nvidia because faster models
>>
>>102888878
>SWA doesn't work
Ah ok. I tried it myself but it is incoherent at large context.
>>
I have Claude Opus.
>>
>>102888923
Not really. Go and actually compare it to existing speculative decoding methods. The speed boost is not that great.
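For anyone who wants to make that comparison, vanilla speculative decoding is roughly this (a greedy toy sketch with placeholder model names; real implementations like llama.cpp's do proper rejection sampling over the full distributions):
```python
# Toy greedy speculative decoding: a small draft model proposes k tokens,
# the big target model verifies them all in ONE forward pass.
import torch
from transformers import AutoModelForCausalLM

draft = AutoModelForCausalLM.from_pretrained("draft-model")    # placeholder name
target = AutoModelForCausalLM.from_pretrained("target-model")  # placeholder name

@torch.no_grad()
def speculative_step(ids: torch.Tensor, k: int = 4) -> torch.Tensor:
    n = ids.shape[1]
    prop = ids
    for _ in range(k):  # k cheap draft passes
        nxt = draft(prop).logits[:, -1:].argmax(-1)
        prop = torch.cat([prop, nxt], dim=-1)
    # one expensive target pass scores every proposed position at once
    tgt = target(prop).logits.argmax(-1)  # tgt[:, i] = target's pick after token i
    out = ids
    for i in range(k):
        t = tgt[:, n - 1 + i : n + i]           # what the target itself would emit here
        if t.item() != prop[:, n + i].item():
            return torch.cat([out, t], dim=-1)  # first mismatch: keep target's token, stop
        out = torch.cat([out, t], dim=-1)       # match: the draft token was accepted for free
    return torch.cat([out, tgt[:, n + k - 1 : n + k]], dim=-1)  # bonus token comes free
```
The speedup depends entirely on how often the draft guesses right; layerskip's pitch is using the model's own early layers as the draft instead of a separate model.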
>>
Where are the files for image-parsing?
>>
>>102888991
Elaborate.
>>
>layerBRAAAP decodingBRAAAAP nonlineaBRAAAAAAPPPP
who cares
give me whatever oai is running but on local
>>
>>102889074
You don't have the hardware to run whatever oai is running.
>>
>>102889094
hopes and dreams and indomitable human spirit
>>
>>102889074
You must be new to open source. Go look at any open-source local project and compare it to the proprietary equivalent.
Linux, LibreOffice, GIMP, Nextcloud, etc
Either purchase a subscription or start learning how to settle for less.
>>
Get a life already P*tra.
>>
>>102889007
Like llama-13b-mmproj-v1.5.Q4_1 and mistral-7b-mmproj-v1.5-Q4_1.

I'm trying to find one for 12b.
>>
>>102889210
Not supported on llama.cpp. If/when support is added, it'll probably be after the image API is merged.
>>
File: solidmiku.jpg (51 KB, 770x770)
>>102889140
You can kill the tranny but not the idea
>>
>>102889318
>the image API
You got a link to the PR for this?
>>
>>102889140
Get a life already schizo.
>>
what's the deal with quants like q4_0_4_4 etc? i only lately started seeing them at bartowski but i always stuck to s/m/l
>>
>>102889446
I might be hallucinating, but I think those are quants optimized for running on the CPU backends or something of the sort.
>>
>>102889318
Fuck.
>>
File: 2.png (86 KB, 915x781)
INTELLECT-1 at 12.46% complete, up from 11.03% last thread.
>>
>102889503
buy an ad
>>
>>102889446
Maybe if you read the model card you would notice that it explains what those are for.
>>
>>102889503
That's like around 1% a day, it means we still have to wait three months
>>
>buy an a-CK
>>
>>102889503
Diminishing returns...
>>
>>102889632
40K tokens per second = 3.5B tokens per day = 105B tokens per month
More like 9 months. It isn't so bad though; the point is not the model but testing and developing OpenDiLoCo
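Sanity-checking the numbers:
```python
tok_per_s = 40_000
per_day = tok_per_s * 60 * 60 * 24  # ~3.46B tokens/day
per_month = per_day * 30            # ~104B tokens/month
print(1e12 / per_month)             # ~9.6 -> "more like 9 months" for 1T tokens
```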
>>
>>102889383
>https://github.com/ggerganov/llama.cpp/pull/9687
API in the classical sense. It's not what you're thinking. Things that want to include image stuff will have an easier time implementing the thing instead of just copy-pasting the llava code.
>>
>>102889632
>3 months
>for 10B on 1T tokens
For a model worth using it would have to be BitNet + 100B + 10T tokens, at least.
Should only take 25 years. Assuming they start next year, they might finish by 2050.
>>
>>102889503
I can't wait to finetune it on PIPPA and then coom with it.
>>
I just need a series of base models trained on at least 10T of unfiltered human tokens. No post training voodoo, no "quality control" except for deduplication. Is that too much to ask for?
>>
Yes
>>
>>102889696
>Is that too much to ask for?
As long as you are willing to pay for it, no.
>>
Anybody not actively contributing to INTELLECT-1 needs to leave, right now.
>>
>>102889789
They aren't accepting contributions right now so you first for being a retard and exposing yourself
>>
i gonna miss the lil nigga
>>
>>102889802
>>b-but but!
Leave.
>>
File: 1721776327211827.gif (28 KB, 340x409)
>>102889446
gguf trannies have to cope with not being able to run all layers on gpu, so they have like 60 million different quanting methods
>>
>>102889840
Crawl back >>>/pol/ incel.
>>
>>102889602
hmmmmm nyo
too used to model cards not giving an ounce of information or being straight up empty so i skip right to model files and get my info from lmg
>>
Oops! https://news.itsfoss.com/osi-meta-ai/
>Even though Meta advertises Llama as an open source AI model, they only provide the weights for it.
>As for the other aspects, like the dataset, the code, and the training process, they are kept under wraps.
>>
>>102889899
*gasp*
>>
>>102889899
they are probably training it on things they're legally not allowed to use for training, you have to be braindead to ask meta to release the training data
>>
>>102889927
Just admit it, you got bamboozled with a simple definition hack.
>>
>>102889899
it's over for meta after this killing blow
>>
>>102888747
>that wasn't used on any other Mistral instruct model.
Even nemo?
>>
>>102889899
>get the data
>ctrl+f for my favorite type of shivers
>no expensive GPU needed
Checkmate, Jensen.
>>
>>102889899
>Company called OpenAI that hasn't released a single open model going around bribing government officials to shut down competition and ban local

>Meta is somehow the villain

Huh?
>>
>>102889912
>>102889965
Smug downies acting smug as if that gives them a win in this case; you have no control over the models you keep downloading.
>>102890055
>b-but le openai!
Lmao, they stopped releasing open stuff and made it clear to everyone, while Meta twists the meaning of “opensource” how it likes and you eat it like pigs, asking for more slop and shivers.
>>
File: flat,1000x1000,075,f.jpg (84 KB, 904x864)
How's Rocinante so much better than Cydonia btw?

It's a 12b whereas Cydonia is 22b, is Nemo unironically better than Mistral Small? I've found every finetune of Mistral Small lackluster
>>
>>102890088
bait
>>
>>102890055
nta. Their problem with it is the claim. OpenAI doesn't claim to be open source. I doubt they have much fondness for OpenAI or any special animosity towards Meta other than them mentioning Open Source every time they can.
I don't care either way as long as we get good models. It'd be nice to have the data and training code. Not much i can do with it, but someone else will probably give it some good use.
>>
>>102890121
What models are best for rp in this range?
>>
>the faint scent of sweat and leather and something unmistakably masculine that made her head spin.
I feel like I've read this exact description of how {{user}} smells from about 10 different models
>>
>>102890144
Interesting, I don't recall anything similar. What models are you using?
>>
>>102890144
Surely you know better than to be surprised by that.
>>
>>102890125
He's been posting this same shit for ages now.
>>
>>102890144
As a footfag I get this a lot (minus the masculine part)
>>
>>102890144
You fell for the finetune meme.
>>
>>102890144
what are you dressed as? A biker? A gimp? A cowboy?
>>
>>102890133
My point is more that OpenAI is straight-up evil. They raked in benefits from being a non-profit and then switched to a for-profit business, did a 180 on their company name, and specifically went out of their way to try to kill local. It's fucking stupid to go after Meta when there's a cartoonishly evil company actively moving around with very little criticism
>>
>>102890176
just a normal guy dicking down elves in generic fantasy world #166123
>>
>>102890121
Never used mistral small, but it is pretty crazy how good rocinante is.
Nowhere in the context is there an explicit mention of pantomiming.

>>102890144
Never seen that.
Plenty of calloused hands, but never anything regarding leather.
I also don't do BDSM, so I don't know if that has anything to do with it.
>>
>>102890191
>every sentence starts with She...
>good
>>
Have I lost my mind or is L3 8B better than Mixtral? The former has significantly better character coherence at the cost of slightly less creativity. Samplers are the same.
>>
>>102890204
It's a good thing I'm not a fourth grade English teacher.
>>
>>102890191
>66.3s
>>
>>102890204
Yeah, that's a nemo pitfall. Sometimes it falls into these repeated patterns and sticks to them.
I don't mind it much, but it's certainly noticeable.
That said, the model is smart, able to make great use of lorebooks, and has nice prose (better than -instruct at least), so it's the best nemo fine tune as far as I'm concerned.
>>
>>102889789
im too busy contributing to INCELLECT-2
>>
>>102890184
Different poster here. I would say OpenAI does get a bunch of criticism and it's clearly visible online, though it's far from enough.
>>
is there anything that fills the gap of llama 3 30B? it kinda sucks they didn't release a model this size
>>
Have we reached a consensus re Rocinante v1.1 vs v2g?
>>
>>102890282
It's always been 1.1
>>
>>102890235
Shit. Should have edited the html to show 66.6
>>
>>102890282
>>
>>102890268
Mistral small, commandR,
Qwen 2.5?
>>
>>102890304
will check. is there a particular one you recommend? my internet kinda sucks too
>>
>>102890282
i'm still team v2g
i didn't really understand samplers back when i tried 1.1 though to be honest, i could have just had a retard setup.
>>
>>102890364
a good model works on neutral
>>
new models when?
>>
>>102890385
I'm still trying to figure out what to use out of the old
>>
>>102890312
Not really, sorry.
I'm firmly stuck in the 12B zone for now.
>>
>>102890415
why not 13B? I thought there were way more of those
>>
>>102890430
You mean llama2?
Those aren't really that good and their context is too small, at least for the stuff I do, namely choose your own adventure type ERP with huge ass lorebooks.
>>
>>102890191
You're using rocinante v2g or v1.1?
>>
>>102890088
*grabs onto your leg and starts nibbling on your unwashed toes*
>>
>>102890459
1.1.
Will give the other one a try after I get bored with this one.
>>
>>102889899
This is why nobody likes fr*e software kids. You can offer them the weights but they'll still pretend that freedom 3 is relevant to them
>>
>>102890489
I don't know where your weights have been. Show me the dataset or shove your weights up your ass.
>>
>>102888694
what is the best large local model?
>>
>>102890550
Grok-1
>>
>>102890550
https://huggingface.co/google/switch-c-2048
Remember to post the Nala test.
>>
>>102889899
>you claim your software is open source yet you lock your front door. Care to explain?
>>
>>102890191
quant/samplers?
>>102890204
bro just regex your gens with {{char}} instead of 'she'.
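something like this, if anyone actually wants to try it (name and pattern are just an example):
```python
# Replace sentence-initial "She" with the character's name so paragraphs
# stop opening identically. Leaves mid-sentence pronouns alone.
import re

def deshe(text: str, char_name: str) -> str:
    return re.sub(r"(?m)(?:^|(?<=[.!?] ))She\b", char_name, text)

print(deshe("She smiles. She leans in.", "Rocinante"))
# -> "Rocinante smiles. Rocinante leans in."
```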
>>
>>102890589
that won't solve it, sentences need a more dynamic structure than "XXX does YYY"
>>
>>102890602
wtf are you rping or taking english lessons?
>>
>>102890616
Can't RP with poor English
>>
>>102890616
I have autism
>>
>>102890583
anon my motherboard only supports 1 socket and 2TB of RAM
At least quant it to 256GB for me or post a guide on how to do that if im going to dedicate half a zfs pool to 1 download...
if its really the best it should still be smart at Q2... right?
>>
>>102890589
>quant/samplers?
For that specific gen I think it was temp 1, top k 10, min p 0.05.
Q4KS.

>bro just regex your gens with {{char}} instead of 'she'.
You don't even need to do that. You can get around that with samplers or prompting, but its recall accuracy drops.
You could also lower the context.
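If you're wondering what those settings actually do, it's roughly this (a toy sketch; real backends differ in sampler order and details):
```python
# temp scales logits, top-k keeps the k best tokens, min-p drops anything
# below a fraction of the top token's probability.
import torch

def sample(logits: torch.Tensor, temp=1.0, top_k=10, min_p=0.05) -> torch.Tensor:
    probs = torch.softmax(logits / temp, dim=-1)
    kth = torch.topk(probs, top_k).values[..., -1, None]    # k-th largest prob
    probs = torch.where(probs >= kth, probs, torch.zeros_like(probs))
    floor = min_p * probs.max(dim=-1, keepdim=True).values  # min-p threshold
    probs = torch.where(probs >= floor, probs, torch.zeros_like(probs))
    return torch.multinomial(probs / probs.sum(-1, keepdim=True), 1)
```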
>>
>>102890632
>if its really the best it should still be smart at Q2... right?
It's a very wide MoE, so no.
>>
svelk
>>
File: ayin.jpg (7 KB, 228x221)
>character that is written to love war crimes
>50 messages in
>"What do you would you like to do?"
>"Maybe go on a vacation, eat ice cream..."
and the best part is that it's not human and physically cannot eat
>>
>>102890666
>666
Alright satan, do the following.
Take the description of your character card, put it in the Character's Notes field, and have it be injected at depth 10 as system.
See how that works out for you.
>>
>>102890697
but that means reprocessing the context every 10 messages... I think I just need a better model than old ass mixtral frankenmerge
>>
>>102890697
>bricking his context to 10 messages
lmao
>>
>>102890709
>but that means reprocessing the context every 10 messages
Only for the last 11 messages.
Try it.
Maybe you'll find that the tradeoff is worth it.
>>
>>102890666
Even war criminals can like ice cream okay?
>>
>>102890647
so you want me to run it at 32 bit?
umm... no?
I dont think runpod even has a RAM slider. Not that it would be more than 0.1T/s without 1000+ CPU cores
>>
is plamo 100b self gguf-able yet?
>>
>>102890039
even Nemo.
but right now I'm setting up a synthetic data pipeline to try and instill Ministral's ministration strengths in larger, smarter models. A reverse distillation, so to speak. It will probably take me about a week to finish.
>>
>>102890786
There's pure CPU with RAM slider, you're being generous with the speed.

>>102890815
I wish I knew
>>
>>102890839
>It will probably take me about a week to finish.
that's the longest edging streak I've ever seen
>>
>>102890583
>>102890632
>>102890647
I know this is a troll posting a 2 year old enormous model...but has anyone in this general actually run and tested it?
>>
>>102890550
Define large
Of any size? L3.1 405b
>>
>>102890923
Not that I'm aware of.
Hence why I always bring it up, in hopes some mad man will.
>>
>>102890923
CPUMaxx anon probably can run it if he wanted
>>
>>102890935
Incorrect. Qwen 72B is better than 405B in every way.
>>
>>102890859
These things take time. I'm too lazy to set it up to shotgun it across all 4 gpus on 4 different instances since I'm using the ooba API to generate data and it's a pain in the ass to run multiple instances on different ports. I mean, it's only a few extra steps for each instance I launch, but still, I don't feel like doing it.
>>
>>102888694
Threadly reminder that Nemotron 70B is the current SOTA for RP.
>>
File: aaa.jpg (10 KB, 334x133)
>>102890944
why not you?
>>
>>102890935
yes any size.
i think >>102890956 is right according to the slop forecasts on 405
>>
>>102891186
cool, what's the current SOTA for poorfags?
>>
>>102891254
Either Nemo or Smallstral, whichever is more convenient for you. Don't fall for memetunes.
>>
>>102891254
are you trying to tell me that $21/h can afford a $1000 PS5 but not 2x $200 4060 ti's?
>>
>>102891283
I'm a cpumaxxer with 64 GB RAM. 70B can fit but it's too slow for continuous roleplay
>>
>>102891312
64 GB RAM isn't maxxing anything, that's as min as it gets
>>
>she brushes her fingers against him, sending a spark of electricity through her body
>she leans forward, her face inches away from his
please stop...
>>
>>102891340
Yeah but stuffing more RAM into it won't help anything, I've hit a speed bottleneck basically. Probably should have bought a server board, but those are too bulky
>>
>>102891281
I'll probably try smallstral
>memetunes
So the base instruct model is fine? I've generally found good RP tunes (like 1 out of 100) to increase the gen quality
>>
>$200 4060 ti
bait
>>
>>102891352
cpumaxxer means buying a server with 8+ memory channels and at least 128gb of RAM, preferably ddr5 but ddr4 with 8 channels can give you something like 2t/s on 70b models.
You are cpupoor
>>
>>102891283
Even if we were living in the wonderland where a 4060 TI is 200 dollars you'd still need to spend money on a server board with enough PCI-E slots and lanes and a compatible cpu, PSU, cooling solution bla bla bla
>>
>>102891224
Alas, I am not a mad man.
>>
>>102891418
Oh I do get 2 t/s lol, but that's obviously too slow
>>
File: 00120-3160555956.png (704 KB, 1024x1024)
>>102891344
That's just the SOVL, anon.
https://vocaroo.com/1mR15X3U8clz
>>
File: dzfx3aytngod1.jpg (58 KB, 1367x467)
>>102891431
>where a 4060 TI is 200 dollars
https://www.ebay.com/itm/286091801603
close enough. its still 2 of these for 1 ps5
methinks you are a poorfag who cant afford a $3000 PC or normie who spends that money on his car or something
in which case... ACK
>>
>>102891372
Well, that's subjective. I don't like most tunes because tunes usually make my characters stupider and hornier, and I like slow burn. But if you have time at your disposal, don't feel discouraged to give them a try just because some anon said they are memes.
>>
>>102891186
>Nemotron 70B is the current SOTA for RP
Because it's so respectful and considerate while it expresses zero spatial awareness?
>>
>>102891572
show me a model with above zero spatial awareness and make it pass the watermelon test
>>
File: 1000019079.jpg (2.21 MB, 2448x2728)
>leans closer to you
>leans closer to you
>leans closer to you
>leans closer to you
>leans closer to you
>>
>>102891431
>money on a server board with enough PCI-E slots
here retard
https://pcpartpicker.com/list/RqJrJy
it even has an amazon link for 1/2 of it which i will spoon feed you
https://www.amazon.com/gp/aws/cart/add.html?AssociateTag=pcpapi-a2c-20&AWSAccessKeyId=AKIAJ7JFGIL5RY5JHCPA&ASIN.1=B09FXNVDBJ&Quantity.1=1&ASIN.2=B07H25DYM3&Quantity.2=1&ASIN.3=B0C15THTK7&Quantity.3=1&ASIN.4=B0CFZGFGZN&Quantity.4=1&ASIN.5=B0B25ML2FH&Quantity.5=1&ASIN.6=B0CCG5S6LD&Quantity.6=2&ASIN.7=B08C7BGV3D&Quantity.7=1&ASIN.8=B0BYQHWJXC&Quantity.8=1
>>
>>102891537
I'm sure a retard that doesn't even know GPU prices is rocking a good setup
>>
>>102891593
please back up miku I can't breathe
>>
I will try nemotron. If I don't like it I will be back here and call you a nigger you shill faggot.
>>
>>102891186
It's retarded sorry
I don't care how much sovl a model has if it's stupid
Might as well use Ataraxy 9B if writing style is the only thing you care about and the ability to correctly world-model and not spout logical non-sequiturs means nothing to you
>>
>>102891589
does anyone here even remember the watermelon test?
>>
>>102891650
I remember that banging song that one Anon generated.
>>
>>102891666
based oldfag
>>
>>102891648
Skill issue. It's not retarded at all.
>>
>>102891572
Why don't you share some logs, anon? You must have them, right? You aren't just a vramlet trolling, right?
>>
>>102891688
Self-defeating argument because actually smart models don't need any 'skill' to use.
You can talk to 4o or Opus like a retarded ESL caveman and they'll still give smart responses.
>>
>>102891727
Skill issue posting is just rage baiting 80% of the time. People who actually believe it is real just have zero standards and are happy when the model mentions shivers for the 10th time in last 4 replies.
>>
>>102891709
Vramlet, yes. Trolling, no. But I don't save shitty runs just in case by chance a day or two later Anon says pix or it didn't happen.
>>
>>102891727
Except that they don't give you complete control of the model. For example, you can't use a wrong prompt format, and the only samplers you're allowed to use are basic ones like top k and repetition penalty. This isn't the case at all for local models.
>>
>>102891678
>>102891666
>oldfag
wasnt that from this year?
https://files.catbox.moe/8x4w7n.mp3
>>
>>102891764
Llama 2 released last year and it's already prehistoric
>>
>>102891764
oh yeah, I thought you mean a different one
>>
>>102891186
Tried it for a few minutes and went straight back to Largestral. If it can't pass that bar, I'm not interested.
>>
>>102891787
no idea >>102891764 (I'm) NTA
capcha: P0P vh
>>
>>102891795
you sure did
>>
>>102891795
Bet you also want people to be interested in you despite not passing as a woman
>>
>>102891764
I was thinking of something something melons flow.
>>
>>102891795
Yeah it's crazy how much Largestral blows it out of the water
people calling nemotron sota are coping fucking HARD, even harder than the dicks in their mouths
>>
>suddenly trannies
Showcasing your mental illness won't help your shilling.
>>102891851
It seemed better than other 70bs I've used but that doesn't matter if I can run something better.
>>
>>102891851
Largestral is pure concentrated slop.
>>
>>102891885
prompting issue
>>
>>102891882
>suddenly le troons
>>102891851
>suddenly le dicks
>>
The people who always complain about models being shit are the people running them at meme quants when they should be running a smaller model at a non-meme quant. They hated him because he told them the truth.
>>
File: Clusters.png (874 KB, 2658x1452)
>>102889503
>It costs 20 bucks to rent 8 H100s for an hour
>The top ten contributors to Intellect-1 have a combined total of 15,446 hours
So the top ten contributors alone have already spent $308,920 training this thing if each logged hour is a full 8xH100 node-hour at $20, or $38,615 if those are per-GPU hours (15,446 / 8 node-hours). I wonder how much the final cost will be once the whole thing is fully trained.
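In numbers:
```python
hours = 15_446
node_hour_usd = 20                # $20 rents 8 H100s for an hour
print(hours * node_hour_usd)      # 308920 -> if each logged hour is a node-hour
print(hours / 8 * node_hour_usd)  # 38615.0 -> if the log counts per-GPU hours
```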
>>
File: 1729295025426369.jpg (1.05 MB, 1170x1052)
Alright boys, post predictions for LLMs in the next years.
>Do multimodals get supported and adopted as the way forward?
>Do we get new architectures that enable greater capabilities? What are they.
>MoE expanded on or abandoned forever?
>Mistral vs Meta, who will win the LLMbowl?
>Will usable context grow and by how much?
>How will governments react to agentic AI and how will society handle discernment between AI's and humans?

Personally, I can see OpenAI training a new SOTA model trained on data from o1 which will dominate the competition for some time. It will be so good that institutions and organizations will start to adopt it at a faster pace. This will spur meta, mistral and others to build their own o1-like models to try to compete (if they already aren't working on something similar). Rate of improvement will not slow, and we will get better models, but most of them will be released with shitty licenses.

What do you guys think?
>>
>>102892044
meme quants are better than smaller models
>>
>>102892056
Anon... Wake up, it's over.
>>
>>102892089
>Its over
Tell that to my peepee when my AI gf makes it happy. It hasn't even begun fren.
>>
>>102892056
OpenAI will plateau and try more different gimmicks, Anthropic will overtake them as the technical masters of LLMs.
Token-based multimodals will be more of a thing, able to input and output mixed data.
MoE will continue as is, seemingly not a focus but not quite abandoned. We might see other forms of sparsity.
Usable context will grow to 128k as a default, with claimed contexts in the 1m zone.
>>
>>102892056
>Do multimodals get supported and adopted as the way forward?
Yes, companies love it when they can say their model does something others can't. So each one will be trying to get their models to do even more things.
>Do we get new architectures that enable greater capabilities? What are they.
Unpredictable. I think many say that a new architecture would do a world of good but who knows what that would even look like or when it would come out. Everything right now is focused on improving what we already got.
>MoE expanded on or abandoned forever?
Worked on but takes a back seat to multimodals. It will still have its place and will probably be refined some more but no massive expansions in there specifically
>Mistral vs Meta, who will win the LLMbowl?
No one wins, competition continues
>Will usable context grow and by how much?
Probably, and who knows
>How will governments react to agentic AI and how will society handle discernment between AI's and humans?
Hopefully slowly, the slower the government gets its grubby little mitts on AI the longer it gets to grow and the better it will be in for local. Once governments finally get to assessing the "AI Issue" you know they will want to make sure the "right kind of people" have it. I doubt we are the "right kind of people" in their eyes.
>>
>>102892112
>OpenAI will plateau and try more different gimmicks.
It seems like OpenAI is doing everything in its power to continue to scale up. I have no doubt going for-profit is the strategy to capture the most amount of capital for the purpose of scaling and improving. I don't see any other companies being able to take the lead in this regard, although I don't necessarily think having the most compute or scale will necessarily result in the best outcome.

>>102892136
>Hopefully slowly, the slower the government gets its grubby little mitts on AI the longer it gets to grow and the better it will be in for local.
I completely agree and would prefer to live in a society that allows you the maximum amount of freedom when it comes to using and developing AI models. I just think that governments will crack down fast and hard at some point when the technology gets good enough to be used to disrupt sizable portions of society. It will be impossible to differentiate between human and non-human without some sort of hardware ID, social credit system etc.
>>
>>102892232
>although I don't necessarily think having the most compute or scale will necessarily result in the best outcome.
That's exactly my thought.
They'll arrive at a point where they'll continue trying to brute force results, while other people out there will manage to actually advance the technology.
>>
>>102892248
With the amount of capital they are accruing, you would think that they could afford the best talent worldwide for the express purpose of covering all their bases. Seems like they are losing talent though with all the departures happening. I'm just a retard who doesn't actually have insight into the company though, so take my opinion with a grain of salt.

I definitely think there is a chance of OpenAI continuing to deliver though. It seems like the entire US global tech sector is being shaped to specifically support OpenAI and nvidia with the US governments backing.
>>
grifter jeet thread
>>
>>102892379
and?
>>
>>102892439
sneed
>>
>>102892439
Dunno, looking at your "local giga super opensource AGI in two more weeks" cope thread.
>>
File: Untitled.png (10 KB, 531x117)
>>102890666
AT LAST
>>
>>102892541
>490s
>1 sentence
>>
>>102892541
so this is the power of largestral...
>>
>>102892541
Prompt processing is one hell of a drug I guess.
>>
File: img.png (676 B, 102x102)
>>102892556
>it's not even his first gen
Anon must be very desperate
>>
I'm back from a 3 month coma and want to thank the Local LLM Community for diligently maintaining llama.cpp during my impromptu absence.
>>
>>102892648
Put multimodal support back into llama-server or I will fucking rape you. In. The. Ass.
>>
>>102892648
The joke is that he's pretty active on github?
I agree with >>102892683, regardless.
>>
>>102892683
What's the usecase for that?
>>102892686
I gave the account to my wife.
>>
>LLM says my question was excellent
>it was a retarded question
I feel patronized
>>
>>102892818
Nice post, Anon!
>>
>tfw using kobold is banned in 23 states
>>
>>102892818
I'm sorry. Thank you for pointing out my mistake. It is important to be respectful and considerate of the feelings of others.
>>
source?
>>
>>102892860
No way.
It's not banned here in Francistan, I refuse to believe's there's 23 countries with more government overreach than France.
>>
>download nemotron
>first response is uninspired
>reroll 3 times
>every single time it mentions shivers down the spine
I fucking hate shill niggers. I am gonna join them now and shill some other garbage too because that is what /lmg/ deserves.
>>
>>102892694
Anon, let's be friends, there's no need to be salty. Just to be clear, I wasn't the one who called you a transgender, I actually support transgenders and have some transgender friends myself. But to answer your question, yes, it was the instruct one.
>>
>>102892876
I meant American states. Also I made up the number
>>
>>102892901
Oh
>>
>>102892892
Skill issue.
>>
>>102892939
You are a skill issue of your mom and dad you fucking faggot.
>>
>>102892892
Card/prompt issue.
>>
>>102892892
Can you share the log? I'm genuinely curious.
>>
>>102892892
>>102892971
More than just the log, it would be super cool to have the whole context.
>>
>>102892971
>>102892982
Can you share some incredible nemotron logs? More than just the log, it would be super cool to have the whole context.
>>
>>102892971
>>102892982
Your mom will have the context of my dick inside her
>>
>>102892997
>>102893003
Weirdly defensive responses but alright.
>>
>>102893011
basically, what's happening is a guy spent $10k building a rig that can run a 405b at decent speeds, so the existence of a smaller model that performs better is very upsetting.
>>
>>102893011
Post your log. Show us how good it is without skill card prompt issues.
>>
File: file.png (168 KB, 916x394)
>>102893051
NTA but I like this one. And yes, it's sloppy, but I'm fine with that as long as it makes my pp hard.
>>
File: 1729022575636839.png (352 KB, 534x622)
So how good is this newfangled Nemotroon compared to ye olde reliable Midnight_Miqu?
>>
>>102893051
I can't even run the thing, I was just curious.
You guys are too on edge.
>>
>>102893086
>guys
I'm 99% sure it's just one schizo
>>
>>102893076
TRASH! JENSEN SHOULD HAVE SWALLOWED!
>>
>>102893110
I know there's at least 1 schizo "trolling" lmg too, so that makes it at least two guys shitting things up I guess.
>>
>>102893049
>spent $10k building a rig that can run a 405b at decent speeds
Genuinely interested about what that would look like. Four 4090s running together?
>>
>>102893076
>ancestral blood memory
I'm going to start using that one from now on.
>>
>>102893076
I always found Miqu better than Midnight Miqu tbqh
>>
File: 1728921386994079.jpg (168 KB, 2048x1358)
>>102893126
>there's at least 1 schizo "trolling" lmg
>one (1)
>>
>>102893178
>at least
>>
>>102893075
Wow it looks like shit.
>>
File: misaki.png (647 KB, 850x850)
What's the best format for Mistral?
Back and forth with [INST] [/INST]</s> or something like
[INST] PROMPT[/INST] Fine, I'll ERP with you.</s>[INST] Here's the story so far: CHAT HISTORY[/INST]
?
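One way to stop guessing is to render the official chat template and look at what it produces; a sketch with transformers (repo name is just an example, and Mistral repos are often gated):
```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("mistralai/Mistral-Small-Instruct-2409")
msgs = [
    {"role": "user", "content": "Here's the story so far: CHAT HISTORY"},
    {"role": "assistant", "content": "Fine, I'll ERP with you."},
    {"role": "user", "content": "Continue."},
]
print(tok.apply_chat_template(msgs, tokenize=False, add_generation_prompt=True))
# Mistral's template wraps each user turn in [INST]...[/INST] and closes each
# assistant turn with </s>, i.e. the back-and-forth shape.
```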
>>
File: m581dexua0c91.png (750 KB, 931x1024)
>>102893076
I wanted to save a higher quality version of this and came upon a different post. Something tells me this matthew dude might not be the original creator.
>>
>>102893423
This is an extremely good image.
>>
7-12B will overtake the open source community. There is no point going bigger
>>
File: Untitled.png (567 KB, 800x600)
>>102893557
i thought it was from the visual novel swan song at first glance, that'd make a cool prompt/scenario and the cold would give characters a reason to cuddle up for warmth
>>
>>102893598
It's obviously a manga version of that viral image of the asian chick standing in the snow, I don't have it.
>>
i just recently heard of flux ggufs. is dev q2 even remotely usable or should i be expecting anything below q4 to be horrible and unusable like LLM ggufs? any helpers?
>>
File: file.png (168 KB, 1637x1425)
https://huggingface.co/nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
Is this another nothingburger?
>>
>>102893653
>>102892892
>>
>>102893653
The scores are bs but apparently it is great for RP.
>>
>>102893653
Were you living under a rock these past few days?
>>
>>102893708
I don't lurk there everyday dude, c'mon :(
>>
>>102893653
It's a somethingburger. Pretty good for RP, but quite slopped. Some shill was even claiming it's better than Mistral Large but that's probably because he can only run it at Q2.
>>
>>102893632
Quantization for imagegen is even more destructive. Q4 is almost unusable, I can't imagine going smaller than that
>>
File: 1723597495714658.png (122 KB, 1000x1000)
>She captures his lips in a searing kiss, pouring all her love and desire into it
>>
>>102893884
If it happened to you irl you would piss your pants in joy.
>>
>>102893929
I try not to piss my pants when kissing women anymore.
>>
No more finetunes. I just want newer models. I'm going back into hibernation.
>>
>>102893884
But did your spine shiver?
>>
File: 1713060614241194.jpg (185 KB, 1247x1400)
>>102893970
>I just want newer models
>>
>>102893970
>I just want newer models.
I want more than that, I want newer BASE models, tired of having to deal with cucked instruct models, was way better during the llama2 era when we could finetune our way to the sovl, that's how we got mythomax in the first place
>>
>>102893970
You got newer models yesterday.
>>102894010
You got newer base models this week.
>>
>>102894028
>You got newer base models this week.
Nta, but I was away during this week, what did I miss?
>>
>>102894039
PLaMo
>>
>>102894047
>PLaMo
https://huggingface.co/pfnet/plamo-13b
I see, and it's good? I guess not because no one talk about it lol
>>
>>102894054
>and it's good?
No. We never get anything good.
>>
>>102894054
>13B
>llama based
What do you think?
>>
>>102894065
>>102894058
Yeah so basically every time they make shit base models they give it to us, and every time they get something good they just give us the cucked instruct version, got it. Feels good to be treated as literal trash, not gonna lie
>>
If distributed training works will /lmg/ start making their own fine tunes not connected to any of the existing fine tuners that have been in the threads?
>>
>>102894054
>>102894065
It wasn't the 13B that was released recently, it was the 100B.
>>
>>102894071
Don't be like that. You're not trash. You are important free advertising for the multinational megacorps.
>>
File: file.png (209 KB, 746x512)
>>102894096
>Don't be like that. You're not trash. You are important free advertising for the multinational megacorps.
thanks anon, I definitely feel better now
>>
File: LocalLlama.png (801 KB, 881x750)
>NIGGGA
>even llama3 1B avoids saying NIGGA
Meta models are a lost cause
>>
>>102894220
>not saying nigga with the number of g's
yeah it's fucking over, pack it up. deleted. unsubscribed. unfollowed. canceled. facebook deleted.
>>
>>102894365
you know black people rule the world when the nigga word is the most untouchable one, even "kike" isn't that controversial lol
>>
>>102894365
Yeah the most likely token after "NIGG" is a "G". The model is super accurate. Nothing to see here.
>>
>They want you to claim them, to make them yours.
Wow, had never seen that one in the third person.
>>
Magnum V4.

https://huggingface.co/collections/anthracite-org/v4-671450072656036945a21348

Get it while it's hot.
>>
Bit out of the loop for the past few months, whats the general go to model for a single 3090?
Also, did the issue with small context ever really get sorted with llama3?
>>
>>102894584
nobody really cares.
>>
>>102894627
>Small context
Yeah they released Llama3.1 - 131k context
>Go to model for 3090
Magnum 27B V4
>>
>>102894584
downloading the 12b right now
>>
>>102894635
I'll check that out, thanks anon
>>
>>102894656
It's slop. Don't bother. Celeste is better
>>
>>102894584
I don't want your gay slop
>>
>>102894584
Just a reminder that there are mentally disturbed users in their team
>>
>>102894710
Yeah? and?
>>
>>102893152
Its in the op
https://rentry.org/miqumaxx
>>
Both proprietary and local have reached a wall as far as performance is concerned, and most of the advances we're seeing now are just crafty ways of prompting the model to perform a task better.

I'm going back into hibernation for a few months. Maybe a year. Maybe something will be different then.
>>
>>102894584
whats anthracite core org? theres no repos on there but alot of u seem to be in it
>>
>>102894584
Gonna vibe test the 27b
>>
https://opencollective.com/anthracite-org

Lol. Lmao. they have an open collective.
>>
>>102894084
>PlaMo-100b
safetensors to gguf convert script crashes...
>>
>>102894794
>settling for meager donations while gpt-wrapper baseddevs are raising millions
Some people really do it for the love of the game.
>>
>>102894823
>too retarded to secure funding properly
>>
File: anthracite.png (243 KB, 1174x1187)
>>102894794
What's the point of sending money to themselves?
>>
>>102890631
Me Too
>>
>>102894835
*anonymously
>>
why does 2 tokens/sec feel 10 times as fast as 1 token/sec
>>
>>102894926
ADHD brain
>>
File: 1509905184379.jpg (173 KB, 474x740)
Trying GPT-Sovits, but my results seem to be far more robotic than examples on https://tts.x86.st/. Are those just cherrypicked or something? I got somewhat good results with a female scottish voice, but then it shits the bed on all three male voices I've tried. I'm using curated WAV datasets. Even raising the epochs for finetuning doesn't do much. Admittedly, Male2 has a good bit of whispering in his dataset so that probably explains his results, but the others don't. They generally speak at the same tone.

>17:54 dataset
Female Reference: https://files.catbox.moe/5rdvxs.wav
Female GPT-Sovits: https://files.catbox.moe/1ysz46.wav

>7:43 dataset
Male1 Reference: https://files.catbox.moe/8ralzz.wav
Male1 GPT-Sovits: https://files.catbox.moe/6dvbhv.wav

>15:03 dataset
Male2 Reference: https://files.catbox.moe/g50wfu.wav
Male2 GPT-Sovits: https://files.catbox.moe/nzjzew.wav

>5:09 dataset
Male3 Reference: https://files.catbox.moe/wgtb6k.wav
Male3 GPT-Sovits: https://files.catbox.moe/6dlles.wav

I've been most impressed with fish audio, the quality isn't amazing, but it is actually consistent and character accurate unlike 11. And I've only been using the online website for fish. I'd like to use it locally, but I have an RTX 2060 (6GB VRAM) and heard the reqs are a little bit much for that, or that it takes weeks to train. Same especially with F5, which I haven't been able to test at all.
>>
>>102894710
Most of the users on this board are mentally disturbed
>>
>>102895064
I should note these results themselves are cherrypicked, by the way. This is about the best I can get out of them, on average it sounds much worse.
>>
>>102895064
>>102895074
I had the exact same experience when I tried it and concluded the examples were cherrypicked. xtts2 actually sounds more natural.
>>
>>102895092
Last year, I heard XTTS v2 + RVC is the best solution. Is that still the case? I haven't gotten around to testing it yet. Also have seen StyleTTS2 mentioned. Like I said, I've tried fish and that's actually quite nice on character accuracy/delivery, but I'm unsure about a local setup for that one.
>>
>>102895064
These results are definitely not normal, consider putting more effort in filtering your dataset
>>
>>102895307
I've put hours into filtering the datasets and have fully curated them, I can assure you that they're not the problem here. I had more consistent results with the old models you train through Colab back a few years ago.
>>
>>102895320
Send me a zip of male3's set, I want to test training for myself
>>
>>102894772
Ehh it's so horny that it ignores whatever I'm saying and just throw ass at me
>>
>>102894635
>Magnum 27B V4
Is that still 8k context?
>>
>>102895355
You can just look at the config.
>>
>>102895340
Sure thing, I've included the ASR as well. I did adjust the captions to be more accurate, maybe I messed something up there, who knows. File paths have been removed if you do use it. I appreciate the effort.
https://files.catbox.moe/u1ecc8.zip
>>
>>102894635
>Magnum 27B V4
But what about Qwen2.5 32B...
>>
>>102895355
Yes. Blame Google.
>>102895383
We did not like how it turned out. So we went with Gemma instead
>>
>>102895395
But you tuned on Qwen2.5 72B...
>>
>>102895406
Incomparable, past 70b, most modern models outclass anything in the 30 range.
>>
>>102895422
But Qwen2.5 32B is better than Gemma 2 27B, and it has more context...
>>
>>102895406
We liked how the 72B turned out, but 32B was not meeting our expectations. 27B was just a better fit.
>>
>>102895440
Translation: we intentionally gave the middle finger to people that have 24GB of VRAM because we hate them.
>>
Qwen confirmed a meme only saved by its parameter size at 72B.
>>
>>102895458
A new mistral 70B would BTFO everything out there right now.
>>
>>102895491
Surely that's the next one right?
>>
>>102894379
>trained on reddit
>surprised that it self-censors
???
>>
File: sera.png (328 KB, 911x650)
>>102894584
I'm liking 22b so far, better than other Small tunes.
>>
>>102894220
Does Twitch have word filtering? If it does, potentially what this shows isn't actually how censored the model is but how much twitch chats are in the dataset, since people would try to misspell nono words to get past the filter.

Anyway, there are censorship benchmarks out there right? How does Llama 3 do on them compared to other models?
>>
>>102895064
>>102895376
Only thing I changed was removing the space at the beginning of each line in your list file, I believe that can fuck it over.

Same ref, SoVITS 12 batches 24 epochs (though 16 also sounds similar), GPT 12 batches 15 epochs
https://voca.ro/16ensZIUTZof
I redid the GPT weights with single batch DPO to try and squeeze more stability out of it
https://voca.ro/1FfWoBLX9mbi

It definitely doesn't sound as fucked. It might be related to the batch size you chose or your list file.
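If your tooling keeps writing those leading spaces, a quick cleanup pass before training (filename is an example; the .list is the usual pipe-delimited path|speaker|language|text format):
```python
# Strip leading whitespace from every line of the GPT-SoVITS list file.
with open("train.list", encoding="utf-8") as f:
    lines = [ln.lstrip() for ln in f]
with open("train.list", "w", encoding="utf-8") as f:
    f.writelines(lines)
```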
>>
>>102895548
>so many swipes you had to crop it out
>still breaks character
slop
>>
>>102895580
The space fucks it over? That's... really weird, since it autogens the list with the spaces.
As for batch size, I was using 1, but I'm assuming epochs is what matters. I'm following this guide: https://rentry.org/GPT-SoVITS-guide
>>
File: MikuClockingOut.png (1.59 MB, 816x1232)
Good night lmg
>>
>>102895455
Have you tried not being poor?
>>
>>102895592
If that's the case then I blame batch size. Lower batch = less stable training
>>
>>102895609
Huh, alright. Not sure how high I can run it considering that the guide recommends 1 batch size for 6GB VRAM, but I'll see. Your results do sound much more promising. Thanks very much for checking.
>>
>>102894584
nice thanks, ignore the haters
gonna try all of these
>>
>>102895640
look at this shameless astroturfing
>>
File: 1716524153626341.jpg (75 KB, 992x558)
>>102895619
All I can say is I'm baffled at the recommended settings suggested by the guide you linked, no wonder people are throwing in the towel
>>
>>102895669
Kek, it would certainly explain it. Do you know of a better guide, or has it been trial and error for you?
>>
>>102894584
Can you do Q3_K_M or K_S quants of 123B? Q3_K_L is just slightly too large
>>
>>102895353
cont.
The 22B is much better at character adherence. Mistral cooked. Truly the last hope of open source.
>>
>>102895653
cry more
>>
>>102895548
Don't bully Seraphina for testing pls.
>>
>>102895595
Good night Miku
>>
>>102895738
That's her only use
>>
>>102895678
It's just the 3 buttons (speech-to-text, ssl, semantics) and then training. Don't really need a guide for that.
The default training settings on the gradio UI are fine but I save epoch 12-16-24 on SoVITS for testing as that's the sweet spot range.
Next thing that matters a lot is the ref audio you pick, and you can also drop your entire dataset into the "multiple references to average tone" box, which can improve the voice
>>
>>102895580
NTA but how much time did it take to finetune?
>>
>>102895772
Like 4 minutes or something for both weights on a 3090
>>
>>102895723 (You)
notice how they don't deny it? notice how everyone agreed these models were slop until they started taking donations? brace for nothing but paid shills soliciting donations in /lmg/ for the next two weeks.
>>
>>102895758
>you can also drop your entire dataset into the "multiple references to average tone" box
Where can I find this in the gradio? Looked everywhere and haven't found it. I'm using the Windows integrated package from https://github.com/RVC-Boss/GPT-SoVITS.
>>
File: 2024-10-19 23_18_21.png (125 KB, 1523x917)
>>102895809
For reference:
>>
>>102895818
Actually, this "beta" version they left in the readme probably isn't the latest version, huh? And the latest version probably has what you're talking about. Well...
>>
File: 1712724273031226.png (199 KB, 1337x1074)
>>102895818
>>102895869
I'm on commit b3e8eb40c25bea1b9977195a380d40c941040419
>>
File: 1607734063051.png (228 KB, 387x409)
>>102895809
>>102895880
>outdated readme linking to their beta version
>user guide that THEY link on THEIR page is trash (yes it's that same guide in >>102895592)
Genius gatekeeping or chingchong retardation?
>>
I'm being really vague, but I'm thinking an LLM could be used for "flavor" in a game with semi-defined goals and main endings. Not something for the model to rawdog from start to finish; rather, there should be an engine to keep track of things and hold things together. Of course, this would require writers and coders who know what they're doing (and I am neither).
I guess summarize and evaluate statuses at certain checkpoints. Settings on whether the user's inputs are just suggestions. The ending could display an overview of the user's overall playstyle: pragmatic, creative, evil, etc., and highlights of the run.
>>
>>102895934
the readme's 7z is out of date by only a month relative to when they added averaging tone. Better to just git pull for projects like this
The guide is fucked though
>>102895997
LLM can't be trusted with handling game state. It'd work much better if used as "the thing that generates dynamic flavor text" given a textual representation of the current state
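i.e. something shaped like this, where the engine owns the state and the model only narrates (generate() is a stand-in for whatever backend you use):
```python
import json

def generate(prompt: str) -> str:
    # stand-in: wire this to your actual backend (llama.cpp server, ooba API, ...)
    raise NotImplementedError

state = {"location": "frozen pass", "hp": 34,
         "inventory": ["rope", "ice pick"],
         "goal": "reach the monastery before nightfall"}

def narrate(state: dict, player_action: str) -> str:
    prompt = (
        "You are the narrator. Describe the outcome of the player's action "
        "in 2-3 sentences. Do not change any numbers or items.\n"
        f"Game state: {json.dumps(state)}\n"
        f"Player action: {player_action}\nNarration:"
    )
    return generate(prompt)

# The engine, not the model, mutates `state`; the model gets a fresh snapshot
# each turn, so it can't corrupt or forget anything that matters.
```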
>>
File: GaS2PwOXYAA0hjN.jpg (79 KB, 672x756)
>>
>>102896218
Miku stew for me and you
>>
>>102894926
if you can read at 2.1 tokens per second you spend ~5% of your time waiting during a 2t/s stream but ~50% of your time waiting during a 1t/s stream; the difference in feel going from genning slightly slower than you can read to slightly faster than you can read is hugely impactful since the time you spend waiting when you overtake the text goes down to zero
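Plug in your own reading speed:
```python
read = 2.1              # tokens/sec you read at
for gen in (2.0, 1.0):  # stream speeds
    print(f"{gen} t/s -> waiting {1 - gen / read:.0%} of the time")
# 2.0 t/s -> waiting 5% of the time
# 1.0 t/s -> waiting 52% of the time
```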
>>
Why aren't there any LoRAs for LLMs? Do they suck compared to fine-tuning or something?
>>
>>102896387
anything too small to be worth a finetune can be done just by dumping shit into context
>>
>>102896387
Most finetuned modes just apply a lora. But given the subtle differences between all model types (even same size from different companies or different size from the same one) it's not as simple to make general use loras. There's many more architectures for llms than for image.
>>
>>102896387
from the op:
https://rentry.org/llm-training#low-rank-adaptation-lora
>>
>2024 nearly over
>still no good reason to stop using mixtral
>>
>>102896445
this, especially for multilanguage, I don't like RP in english desu
>>
>>102896445
Check back in 2026 for mistral with bitnet
>>
>>102896579
>bitnet
it'll be a year since bitnet came out; why has no one dared to make a big bitnet model? that's fucking insane...
>>
>>102892648
>I'm back from a 3 month coma
WTF are you talking about, Petra?
You've just been spamming other shit like blacked Miku in the meantime.
>>
>>102896587
Or they dared and it was so bad they didn't publish results to avoid embarrassment
>>
>>102896576
Same.
English RP data sends shivers down my spine, and I wouldn't have it any other way.
>>
>>102896576
>>102896844
Is it actually better in other languages?
>>
>>102896874
Prose-wise it's better and missing GPT-isms, but you start to encounter grammar issues if the model is under-trained on that language. So you really need to find a model that works for yours.
>>
>>102896874
>Is it actually better in other languages?
I just prefer to rp in my native language
>>
File: 1609958786054.jpg (94 KB, 343x361)
>>102895580
Holy shit man. I just got it trained on my measly RTX 2060 (took 40-50 minutes) and this is fucking insane. Actually peak AI voice. This is it. Thank you.
>>
>>102896944
Can we have a rentry spoonfeed guide for retards such as myself, or is it too early?
>>
>>102896971
Basically, follow this guide https://rentry.org/GPT-SoVITS-guide
but ignore the recommended settings for training and instead listen to >>102895758 >>102895580
And make sure you get the latest version https://github.com/RVC-Boss/GPT-SoVITS/releases
>>
>>102896980
Thanks anon.
>>
>>102897209
>>102897209
>>102897209
>>
>>102891283
>288.0 GB/s
One of the worst GPUs for local AI, what was NVidia thinking?
>>
>>102895148
fish-speech is pretty good and very fast. I don't like how unpredictable XTTSv2 results are. Fish never fails that hard
>>
>>102897326
Probably about selling to gamers, and if AI bros don't do their research, it's free real estate.



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.