/g/ - Technology

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103126193 & >>103113157

►News
>(11/08) Sarashina2-8x70B, a Japan-trained LLM: https://hf.co/sbintuitions/sarashina2-8x70b
>(11/05) Hunyuan-Large released with 389B total and 52B active parameters: https://hf.co/tencent/Tencent-Hunyuan-Large
>(10/31) QTIP: Quantization with Trellises and Incoherence Processing: https://github.com/Cornell-RelaxML/qtip
>(10/31) Fish Agent V0.1 3B: Voice-to-Voice and TTS model: https://hf.co/fishaudio/fish-agent-v0.1-3b
>(10/31) Transluce open-sources AI investigation toolkit: https://github.com/TransluceAI/observatory

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
►Recent Highlights from the Previous Thread: >>103126193

--Tim Dettmers on state-space models and transformers:
>103130401 >103130411
--New approach combines MapReduce with LLMs to process long documents:
>103127077 >103127118 >103127138 >103128968
--M4 Mac Mini AI Cluster for running large language models:
>103133315 >103133505 >103133614
--Anon struggles with training SoVits/GPT on videogame/VN samples due to dynamic compression:
>103131499 >103131514 >103131573 >103131593 >103131669
--Anon seeks help on setting up speculative decoding for inference speedup:
>103129190 >103130083 >103130158 >103130213
--Anon discusses testing new Jap models, including Ezo 72b and Sarashina:
>103131453 >103131518 >103132062 >103131534 >103131535 >103134119
--Anon discusses FrontierMath benchmark and potential exploit:
>103130591 >103130675
--Anon shares i2V with new CogX DimensionX Lora:
>103131247 >103131366 >103131371 >103131373 >103131597
--Why weight matters less for stable diffusion models:
>103126580 >103126591 >103126603 >103126686 >103126710
--Why ollama has vision support but llama.cpp doesn't:
>103128740 >103129086
--Waiting for big model releases and discussing performance, ethics, and upcoming releases:
>103131946 >103132491 >103132592 >103132800 >103133031 >103133161 >103133985 >103134190
--Qwen 7B coder matches GPT4 turbo performance:
>103128799
--Introducing MatMamba: A Matryoshka State Space Model:
>103130310
--Custom build vs M4 Max for large language model RP:
>103126608 >103126630 >103126641 >103126719 >103126637 >103126907 >103126932 >103126944 >103126750 >103126864
--Anons discuss AI's role in programming:
>103133888 >103133909 >103133919 >103133941 >103134401
--Miku (free space):
>103126277 >103126358 >103126521 >103127692 >103127781 >103128329 >103131098 >103131538 >103131597 >103133696 >103135538

►Recent Highlight Posts from the Previous Thread: >>103126194

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
>>103135641
Very trippy
>>
>>103135641
pixiv fag is getting good results these days
>>
>>103135641
That's an ugly OP pic.
>>
File: serious hat.jpg (52 KB, 675x675)
should my choice between rtx 3060 and rx6600xt be influenced by the ability to utilize it for image gen and llm or do i just go with amd for better drivers
if it's a choice between dogshit and slightly less dogshit i'll just get the rx
>>
>>103135641
Sex with this merge
>>
>>103135719
modern, good img gen needs at least 16gb, preferably 24gb. 3090s are the gold standard. Check facebook marketplace
>>
I'm really becoming obsessed with making a little at-home android that does everything locally.


I know how dumb that sounds. But like I need something to do in my off time.
>>
>>103135746
>android
like, an autonomous little dude trundling around?
>>
>>103135746
Someone needs to make the good future happen.
>>
>>103135746
Cool, keep dreaming
>>
>>103135741
I’ve never been able to fill over ~11GBs of VRAM, is that some SDXL bullshit?
>>
>>103135746
You are high on jewlywood sci-fi slop.
>>
>>103135757
Yeah, I figure I can get an 8b running on a micro pc with low tps.

Then I'd get a model with vision, or maybe do two models: one for vision and one for output.

Along with an STT.

Give it a lorebook about how to command its body. Have a program scan the output for commands. Either give it full control of how far and how much it can move each limb (which would fuck up hilariously sometimes) or give it a pre-programmed walk cycle and just have the llm say for how long.

The same program scans the output for text in "quotes" and then sends just that to a TTS.

Probably going to do a crab type, since walkers are fucking hard to make.
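Something like this is what I have in mind for the parsing part (rough sketch; the send_to_tts / do_command functions are placeholders and the [bracket] command format is just something I made up):

import re

def send_to_tts(text):
    # placeholder: would hand the line to whatever TTS I end up using
    print(f"[TTS] {text}")

def do_command(cmd):
    # placeholder: would map to a canned walk cycle / limb movement
    print(f"[CMD] {cmd}")

def handle_llm_output(output):
    # anything in "quotes" is spoken dialogue -> goes to the TTS
    for speech in re.findall(r'"([^"]+)"', output):
        send_to_tts(speech)
    # anything in [brackets] is treated as a body command, e.g. [walk 3s]
    for cmd in re.findall(r'\[([^\]]+)\]', output):
        do_command(cmd)

handle_llm_output('"On my way." [walk 3s] *scuttles forward*')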
>>
>>103135746
As in designing and building a little android from the ground up? I don't think the tech is there yet; we don't have anything like that in the consumer space despite a few companies trying to do just that.
>>
>>103135801
oh I know it won't work like in the movies. I think it'd be funny.

"I Gave this LLM a body and YOU WON'T BELIEVE WHAT HAPPENED." type clickbait
>>
>>103135808
if a fucking v*uber can do it I'm sure /egg/ can figure it out
>>
>>103135831
link?
>>
>>103135845
vedal
>>
>>103135849
that's not a link
>>
File: IMG_1350.jpg (18 KB, 492x351)
>>103135831
>>
File: pepe-small-eyes.png (71 KB, 498x498)
>>103135809
>cute robot waifu stuck with some 7b model filled with slop and braindead logistics that plebbit recommended because it totally beats gpt
>constantly spouts flowery nonsense around the house
jesus christ sounds horrifying
>>
>>103135808
Just Hotwire a roomba. Why you gotta be so picky?
>>
>>103135854
Skill issue
>>
>>103135845
https://www.youtube.com/watch?v=jZoz6Zkle2E
>>103135849
I mean he's helping I guess.
>>
>>103135741
fuck it ill just get 6600 or something i want to play witcher at 1080p and then i'll get fucking 4070 ti in like 2027 when gta vi comes out and i have more money
>>
>>103135886
interesting..... yeah why hasn't anyone hooked a boston dynamics dog up to an llm yet? lol
>>
>>103135871
>her voice barely above a whisper
>>
>>103135910
I'm sure this has already been done, I will try to find the video.
>>
File: allmustbecomeaware.gif (247 KB, 500x283)
>>103135871
>>cute robot waifu stuck with some 7b model filled with slop and braindead logistics that plebbit recommended because it totally beats gpt
>>constantly spouts flowery nonsense around the house
>jesus christ sounds horrifying
Sounds like Sumomo.

>>103135916
Nah, Sumomo's a fucking chatterbox.
>>
>>103135741
>need
i probably use the same image gen shit as you with my 8gb setup
just takes a minute to gen instead of a few seconds
>>
>>103135916
if I ever read mischief again I will fucking kurt cobain myself. thank god for the slop ban list even if it isn't a solution to everything
>>
>>103135910
>>103135936
Here: https://youtu.be/djzOBZUFzTw
>>
>>103135916
>lowers the voice of your android
Heh, actually has a use.
>>
>>103135775
uh uh, uh uh YEAH
>>
>>103135959
Whyiiiie can’t we have such TTS locally, it’s fucking over
>>
>>103136006
How is it better than SoVITS output?
>>
when is the next mistral large-tier model
>>
https://x.com/rohanpaul_ai/status/1855350293762060636
>>
File: baldingman.jpg (5 KB, 275x183)
what's your min p and temp you fools
>>
>>103136183
0
1.0
>>
>>103136183
entirely depends on the model
>>
>>103136155
https://huggingface.co/bigscience/bloomz-mt
>>
>>103136159
kys
>>
>>103136183
0.0001 and 5
>>
reminder that 'temperature last' is the 'no soul' option for sampling
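toy numpy sketch of what the ordering actually changes (made-up distribution; min_p=0.1 and temp=1.5 are arbitrary): with temp applied first, the flattened distribution lets an extra tail token survive the min_p cut, while with temp last, min_p prunes on the raw distribution and temp only reshapes the survivors.

import numpy as np

def min_p_filter(probs, min_p):
    # keep tokens whose probability is at least min_p * the top token's probability
    keep = probs >= min_p * probs.max()
    out = np.where(keep, probs, 0.0)
    return out / out.sum()

def apply_temp(probs, temp):
    # rescale in log space and renormalize; zeroed-out tokens stay at zero
    with np.errstate(divide="ignore"):
        logits = np.log(probs) / temp
    e = np.exp(logits - logits.max())
    return e / e.sum()

probs = np.array([0.60, 0.25, 0.10, 0.04, 0.01])

temp_first = min_p_filter(apply_temp(probs, 1.5), 0.1)
temp_last = apply_temp(min_p_filter(probs, 0.1), 1.5)

print(np.count_nonzero(temp_first), np.count_nonzero(temp_last))  # 4 vs 3 surviving tokens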
>>
https://x.com/NousResearch/status/1854577666403512736
>>
>>103136255
>https://x.com/NousResearch/status/1854577666403512736
Hermes 405b was so lobotomized compared to the base model. I wish they'd manage a good finetune again.
>>
>>103136049
It sounds way worse
>>
File: file.png (115 KB, 917x491)
>Gemini can hold same performance up to 2 million tokens
What's up with that? Is it secret sauce? How come they have a context window literally 15x the size of the rest?
>>
File: disgusted.jpg (5 KB, 251x201)
>>103136292
disgusting how gemini got an insane context size but gemma's is small as hell
>>
>>103135538
Outstanding
You sir are the Da Vinci of migu sex
>>
>>103136292
>We have hundreds of hues and dot shapes available to us.
>Let's use four colors and one dot shape for 11 columns.
>>
File: engineer.jpg (9 KB, 225x225)
>>103136245
Will test further but I'm already feeling a difference in a good way. Thanks
>>
File: 1681445717488827.webm (2.48 MB, 960x1706)
How deep can the speculative decoding rabbithole go?

Is it possible to have Qwen 2.5 500M generate most tokens, and when the certainty of a token is low let 1B generate the rest, and if that also becomes uncertain just escalate higher and higher to bigger models, until the biggest model generates like 1 or 2 tokens out of every ~5000?

All the implementations and demonstrations I've seen online are using the biggest model, like ~100B, with 1B draft models. But doesn't it make more sense to have an "escalation ladder" of AI models that goes up as needed?
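The control flow is simple enough to sketch (NOT real speculative decoding, there's no draft-verification step; the "models" here are dummy stand-ins for something like qwen 0.5B/1.5B/7B/72B and the confidence thresholds are made up):

import random

def make_dummy_model(name, skill):
    # stand-in for a real model call; returns (token, confidence in that token)
    def generate(context):
        conf = random.random() ** (1.0 / skill)  # higher "skill" -> usually more confident
        return f"<{name}:tok>", conf
    return generate

ladder = [
    (make_dummy_model("0.5b", 1), 0.80),  # (model, confidence needed to accept its token)
    (make_dummy_model("1.5b", 2), 0.70),
    (make_dummy_model("7b",   4), 0.60),
    (make_dummy_model("72b",  8), 0.00),  # last rung always gets accepted
]

def next_token(context):
    # walk up the ladder until some rung is confident enough
    for model, threshold in ladder:
        tok, conf = model(context)
        if conf >= threshold:
            return tok

print([next_token("...") for _ in range(5)])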
>>
>>103136362
certainty of the token isn't a guarantee the token is good, fucking retard
>>
>>103136377
Yeah but if it's low you can just present it to a bigger model until reaching a specific threshold
>>
>>103135959
Ty. this is exactly what i wanted to see.
>>
File: ministrations8b.png (101 KB, 935x345)
So here's a Nala test for Ministrations-8B (f16)
It was almost good with a bit of word salad at t=0.81
This is t=0.7
It's not great.
It's a bit dumb.
It still has Ministral's excellent use of actions though. It uses the feral well at first but does drift into anthro.
At t=0.81 the dialogue was actually really good. But unusable due to the inevitable word salad.
Also I realized I accidentally did this at max tokens 256 with it cutting at the sentence boundary so who knows how it would have gone if allowed to continue.
Also used Mistral instruct formatting (it's supposedly cross-compatible with metharme)
I guess if slop offends you so much that you absolutely have to get rid of it even at the cost of coherence then it's better than vanilla Ministral. Not that Ministral is known for keeping the narrative on its own.
>>
>>103136292
>Is it secret sauce?
The secret sauce is probably just not making a barely tweaked GPT2.

MHA is ridiculously wasteful, you can probably just use a bigger model while using MQA/MLKV/LoKi and still come out way ahead with a much smaller KV cache.
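Back-of-the-envelope on the KV cache point, assuming llama-70B-ish shape numbers (80 layers, 128 head dim, fp16, 8k context; purely illustrative):

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    # 2x for K and V, one entry per layer, per kv head, per position
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

GiB = 1024 ** 3
seq = 8192
print("MHA, 64 kv heads:", kv_cache_bytes(80, 64, 128, seq) / GiB, "GiB")  # ~20 GiB
print("GQA,  8 kv heads:", kv_cache_bytes(80, 8, 128, seq) / GiB, "GiB")   # ~2.5 GiB
print("MQA,  1 kv head: ", kv_cache_bytes(80, 1, 128, seq) / GiB, "GiB")   # ~0.3 GiB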
>>
>>103135959
Huh, imagine that! I wonder when they will get a video of something like that with Atlas. Does Atlas even have speakers to speak with though?
>>
best model to pretend i have a friend? i have 12gb vram (but no one to talk to)
>>
>>103136704
Probably Nemo
>>
>>103136704
nemo is shilled a lot but it's pretty solid
>>
>>103136712
>>103136748
thanks, should i get any specific/modified/trained gguf or just the normal one? there seems to be tons of variations
>>
File: 1642149177769.png (349 KB, 1080x589)
do you ever look at your chat logs and just
>>
>>103136751
not those anons, but there are a bunch of good ones that have a different feel to them. the base instruct one is fine too.
some good ones:
>rocinante 1.1
>arcanum
>backyard party
>arliai rp 1.2
>lyra v4
>magnum v4
keep your temp really low (0.3 to 0.6) when using nemos
>>
>>103136815
thank you, i'll try them out
>>
>>103136815
buy an ad Saonthracite
>>
>>103136329
she's so cute anon I can't help it
>>
If I want to reduce the amount of text a card writes where else should I adjust other than the Response(tokens) area? Changing it just ends up cutting the messages halfway.
>>
>>103137135
Tell it to write shorter responses weird
>>
>>103137034
>>
File: 1717899010974441.png (101 KB, 1448x823)
>>103137268
I see.
>>
any improved models come out in the past month or two?
>>
>>103137295
you do know that you created a slightly transparent black box right?
>>
>>103137310
You won't be getting much out of it anyways, feel free to read it if you're so inclined. I'd much rather get the answer to my damn question.
>>
>>103137316
your idgaf vibes are unbreakable.
I was trolling, nothing wrong with the box
>>
>>103136793
Nah, I only get this feeling from actual literature. LLM chat logs are still too artificial.
>>
>>103135641
The biggest giveaway for anime AIslop is the light on the nose. The lighting never makes any fucking sense at all but AI does it every fucking time.
>>
>>103137316
i mean the response (tokens) is one thing, in the card you need to put examples of replies, and keep it short. you can also write in the system prompt in silly to keep responses short and conversational.
>>
>>103137295
>hideous busy UI
>eleventy thousand sliders that no one needs
so glad i use KoboldAI Lite instead of ShittyTavern
>>
>>103137352
lol nerd
>>
File: 1717208473747677.png (1.41 MB, 724x1054)
>>103137369
TRVTHSVPERNOVA
>>
File: 99um5t.jpg (75 KB, 500x500)
>>103136793
>>
>>103137295
I, too, pick my settings by randomly pushing sliders around
>>
https://x.com/amir/status/1855367075491107039

bros does this mean it's over?
>>
>>103137628
if you can tell me when llama3 was released then tell me its over
>>
>>103137352
>meanwhile humanslop
https://danbooru.donmai.us/posts/5367787?q=stained_glass+
https://danbooru.donmai.us/posts/5362693?q=stained_glass+
https://danbooru.donmai.us/posts/5430014?q=stained_glass+
https://danbooru.donmai.us/posts/5416856?q=stained_glass+
Just did a quick search of a tag that should theoretically have a higher density of high quality art and found a ton of this. Seems like AI learned this little quirk quite well.
>>
File: 1731176919094316.png (586 KB, 512x768)
>>103135719
Unfortunately, the ATI equivalent to the 3060 in terms of ML performance is the 6800xt. You'll get the same performance as a 3060 with 4 more gigs of VRAM. It sucks that for the price of a 6800 you can buy a 3080 - a great card, but totally gimped by its 10GB of VRAM. There are no reasonable options between the 3060 and 3090. Using an AMD card, you sometimes encounter shit like https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/8139 while jewvidya just werks, depressing. I cannot generate another migu or fix her hand because everything is fucking broken yet again.
>>
it's never been more over for local models than it is now
>>
>>103137628
>a fucking CoT tune
Yeah it's over for OpenAI, but not for local or other companies. We haven't hit the ceiling yet and Anthropic has no problems with innovating. All the talent has left or is leaving OpenAI, and on top of that Musk will try to take revenge on them.
>>
>>103137770
>turkish rapebaby balkanoid is shilling for openai again
>>
>>103137741
nice thighs
>>
>>103137805
This.
Since the ones at the top are slowing down that means local will start catching up.
>>
File: Screenshot 2024-11-10.png (587 KB, 1874x874)
>>103137741
>everything is fucking broken
nta but I'm using this on bindows https://github.com/lshqqytiger/stable-diffusion-webui-amdgpu plus ZLUDA which makes ROCM act as CUDA or some shit, idk
Stuff like adetailer and controlnet have no problems. It just werks on my 7700XT, took 2 mins to gen these decently high res pics with two adetailer passes, using medvram and opt-sub-quad-attention
>>
File: 003421.jpg (1.86 MB, 1680x2160)
>>
>>103137968
I like this Lain
>>
>>103137918
Neat look behind the scenes. Makes me feel like an amateur. Do you use regional prompting as well? Any adetailer or other extension tips?
>>
Would it be legal for some non-Nvidia company to make a GPU with drop-in support for the cuda api so that you can run any cuda code on it and it just werks as if it was an nvidia card

Or would that violate some law or patent
>>
>>103138480
Not illegal, the cost to do it from scratch is just too much to be realistic. There is a compat layer 'zluda' but I don't know much about it
>>
>>103138480
It's a gray area; the most realistic outcome would be such GPUs being so bad that even if CUDA was supported, no one would buy them anyway.
If somehow it's a competent GPU for both price and performance, then Nvidia would most likely approach the company to reach a patent, trademark and copyright agreement.
Nvidia is known for being aggressive, so suing would be the next step if no common agreement is reached.
>>
>>103138480
They got annoyed by someone just making a translation layer between CUDA and HIP i think. I don't remember the name of the repo.
>Or would that violate some law or patent
Whoever tries will probably get violated and found in a ditch. Bits of him, at least.
>>
>>103138527
iirc it was actually AMD that (stupidly) shut that translation layer project down
>>
>>103138519
In very simple terms, when a program uses CUDA, it calls cuda.dll; what zluda does is make an alias so that when the program calls cuda.dll it goes to zluda.dll and all the CUDA stuff is done through ROCM/HIP instead. It isn't a CUDA replacement. Some people thought they could run ZLUDA with an Nvidia card; it doesn't work that way because ROCM/HIP works on Radeon, not Nvidia.

>>103138527
>>103138538
The agreement was that if the project was going to shut down, the code would be released. AMD went and released their own version called Orochi.
https://gpuopen.com/orochi/
>>
File: mkmk.jpg (25 KB, 310x310)
https://files.catbox.moe/6xusfp.jpg
>>
>>103138948
What did Intel mean by this?
>>
>>103138948
>we could all have this if saltman shut the fuck up
>>
>>103138948
That's pretty. Now I want one of those.
>>
>She knew he was right - she was popular, and he was right to assume that she was just teasing him. But the truth was, she wasn't teasing.
That's qwen AGI right there. The model realized it fucked up and tried to rescue the situation.
I hope someday I will stumble upon the infamous
>he dies. But he wont die even if he is dead.
>>
>>103138948
Glow in the dark Migu is on the rise.
>>
>>103138480
AMD's compute API is literally CUDA's with cu and cuda replaced with hip
intel's runtime, which is an implementation of a khronos standard, can run on top of CUDA, HIP, or their driver stack and comes with a framework for other vendors to implement backends
both have clang based tools which perform AST aware translation of CUDA to their respective dialects
pytorch has prebuilt wheels for HIP and that's really the only thing where compilation is an issue

the issue, at least with AMD's compute implementation is that their runtime has significant overhead and is shit
the issue with intel's is that the faggot researchers writing AI related compute kernels are such fucking lazy fat whores they won't even bother to add support for a compute dialect which is for the most part identical save three letters, much less one that has a completely different design
literally the only people who think the API itself is an issue are script kiddies, i've ported loads of projects to HIP, it's trivial and mostly just build system work
>>
>>103139067
>>
>>103139143
If your only goal is to get code that produces correct results then I agree that something like a conversion to HIP is relatively simple.
The problem is that the whole reason why you would want to use a GPU in the first place is performance and GPU performance unfortunately has very poor portability.
With HIP in particular I've found that there are issues with compute-bound kernels and that some additional logic is needed on top to select the correct kernels at runtime.
Ultimately I think the only good solution is to write hardware-specific low-level code for PyTorch/TensorFlow/GGML/etc.
>>
>>103139299
>is to write hardware-specific low-level code
last i checked ggml's device-side code was very ghetto old school style C++ but if it were modernized by pairing C++23/26 era template metaprogramming with inline asm you could probably automate a fair amount of that, both for AMDGCN and NVPTX targets
i was experimenting some time ago with explicit generation of dpp instructions that way
>>
if by june 2025 nemo is still top dog, it's officially over
>>
File: owari.jpg (5 KB, 186x154)
>>103134687
Are you the author of the critically acclaimed pic related?
>>
>>103136815
What overall settings/system prompt are you using for rpmax 70b? People keep telling me it’s different but I keep getting similar answers to behemoth despite using neutralized for both. Are you skipping special tokens?
>>
>>103139699
No
>>
>>103136197
So happy there are still oldfags ITT. Unless you are just terrified of moving the sliders.
>>
>newfags buying into the miku meme after /lmg/ died
pottery
>>
>>103135641
friendly reminder that each and every one of you is a social reject who will die alone ;)
>>
>>103139783
Its w*ite p*ople problems exclusively
>>
File: adsadadad.jpg (21 KB, 273x273)
https://files.catbox.moe/e2yl9g.jpg
>>
>>103139807
israel won
>>
>>103139783
Happily married, kiddo. Good try tho.
>>
I realized that even if local is dead, the next revival will happen once proprietary models start leaking. Because there's no chance that none of them ever leaks.
>>
>Largestar is dissatisfied with how I handle sidestories, eager to know what will happen next.
kino
>>
>>103139950
>it's local
>no it's not, prove it give me prompt
yeah yeah anon
>>
>>103139933
That's a bullshit article.
It's 60%-40% split. And in Israel virtually all Jews supported Trump.
>>
How do I just make a normal friend/assistant that doesn't talk like a faggot?
Every time I exactly describe what kind of character I want the ai to be, it always ends up acting like a parody of itself written by redditor that signals all of its tropes constantly
This happens with every model btw
>>
>>103140034
>This happens with every model btw
Hmmm. I wonder what could be the cause, then...
>Every time I exactly describe what kind of character
Is it a tropey character or some super unique OC never described in media?
Just trim some of the descriptions and add example dialog so it knows how to speak.
If you tell it it's quirky, it'll say "i'm quirky". Show quirks in dialog and it may get you better results.
>>
>>103140034
You can't do much because all these models are trained on safe reddit shit exclusively. Anyone telling you otherwise is gaslighting btw
>>
>>103140110
Petra is eternal
>>
>>103140143
wtfff.... it's happening.....
>>
>>103137918
>hand_yolov8n.pt
>6 fingers on the left pic
It’s so fucking over
>>
>>103140182
sex in petra when
>>
>it's schizo forced meme hour
>>
>>103140273
petra YES!!!! ahh ahh mistress
>>
petra sexmiku
>>
CLEAN IT UP JANNY
>>
>>103140347
NO FUN ALLOWED
>>
le petra guise, Xd
>>
Most interesting /lmg/ thread in months
>>
>>103140268
>schizo forced meme hour
I don't see much miku here.
>>
>>103135641
that's a cool gen
>>
It'll be over soon
>>
>scaling a shitty architecture exponentially for linear gains might be not worth it after $1b+ compute
woah....
>>
>>103140892
No one could have predicted this. It's not diminishing returns were obvious since the Llama 2 era.
>>
>>103140892
It's so fucking ogre.
But honestly, even if it's true and the best we can get is a Sonnet 3.5-tier local model, that would still be awesome.
>>
Are iq quants still slower if you're offloading or was this improved at some point?
>>
>>103140912
Not happening because everyone filters their data to hell and back. Every single "local" model so far feels just like gpt4 but more retarded
>>
>>103140969
yfw when mikusex
>>
>>103140976
she only fucks dat BBC althougheverbeit
>>
>>103140892
About time they face facts. Wonder what the eggheads' solutions will be now that they can no longer just keep trying to make it bigger and bigger forever.
>>
>>103140940
no but they're dumber than non-iq quants of the same size
>>
>>103140985
luckily i am black
>>
File: 1717655502952277.png (780 KB, 614x614)
>>>103140985 (You)
>luckily i is black n shii ytboi
>>
Why don't we do reinforcement learning for LLMs?
>>
>>103141008
Hire field experts and teachers to generate and review data. Which is what sama has been doing for 2 years. 2 years is how much OpenAI is ahead of everyone else
>>
>>103141072
for two reasons: we don't know how to represent an objective, and we don't know how to define the steps the ai must take to reach that objective.
humans use intuition in practice.
>>
>>103140892
There is a whole world of objective functions other than predict next token...
>>
>>103140985
where is the blacked miku poster?
>>
>>103141338
working on llama.cpp training code
>>
To whoever said I should train sovits to 96 epochs... it's definitely better, but I wouldn't say we're at CD quality, despite the high-quality inputs. I think this system just can't deal with compressed dynamic range. Is anyone interested in the model, or is the quality just too shit? I feel it's way better at dealing with VN/game style reference samples vs the other models out there, but I personally can't get over the low-bitrate sound.
https://vocaroo.com/14C8cxLOK6vU
>>
>>103141631
sovlless
>>
>>103141331
>There is a whole world of objective functions other than predict next token...
Any with an immediate error signal and trillions of samples?
>>
>>103141692
No but you have a mountain of compute and vram that is useless now so use some retarded 7B's for evaluation.
>>
>>103141631
sounds really familiar
Ai Haruka? Taguchi Hiroko?
>>
>>103141631
Have you tried using more samples as auxiliary sources when doing inference? You might want to use the minP sampler too https://github.com/RVC-Boss/GPT-SoVITS/pull/1118 and/or a postprocessing step with RVC
>>
>>103141664
>sovlless
blame ezo 72b, it spat out the jslop. unless you mean the audio, in which case it's pretty standard-issue eroge intonation. It just sounds like it's off a mid-80s answering machine
>>103141741
>sounds really familiar
Some random sample from summer vacation scramble. No idea, really
>>103141777
>Have you tried using more samples as auxiliary sources when doing inference?
Yes, and I didn't find it made things any better, unfortunately. Only a half-dozen or so though.
>You might want to use the minP sampler too
I'll look at integrating this PR and see if it makes any difference. I pushed top-k, top-p and temp to their extremes and didn't get any quality improvements. It seems to mostly be about how well it sticks to the script. Training to 96 epochs made it really good at that, and I was able to gen this sample with top-k 1, top-p 0.01 and temp 0.01 without it turning into nonsense.
>>
>>103141631 (me)
Audacity noise reduction and an EQ treble boost actually did a lot to improve the final output. Maybe some mild post-inference processing is the answer vs more training. Too bad I'm shit at audio code or I could try to craft a PR.
>>
>>103140892
>“Some researchers at the company believe Orion isn’t reliably better than its predecessor in handling certain tasks, according to the employees. Orion performs better at language tasks but may not outperform previous models at tasks such as coding, according to an OpenAI employee. That could be a problem, as Orion may be more expensive for OpenAI to run in its data centers compared to other models it has recently released, one of those people said.”
>next gen super huge model improvements so little it isn't even better at coding
it's over like you wouldn't believe
>>
>>103140892
This was the obvious outcome. LLMs do not 'understand', they 'reproduce' what they already saw. So at some point they are saturated and can't grow anymore because the noise in the training data will eclipse the little that is left for them to learn. It's a standstill. It's a barrier that can only be passed once we have an architecture that can think beyond its initial training data and learn out of sheer logic.
>>
>>103142175
Utterly reddit take.
>>
>>103142175
They didn't spend time curating their shit and keeping only the high quality stuff. More low quality text won't help the model do better.
>>
>>103142175
utterly 4chan take
>>
>>103141991
Yeah for now the options to improve the results are limited. The guy who made the repo said he was training a better base model with 10K hours of audio vs 5K with this one.
Also you can get a specific pronunciation with ARPAbet. Someone on /mlp/ added that on his repo along with a GUI and it shouldn't be hard to backport into the original project
>>
>>103141072
But we do, look up reinforcement learning from human feedback.
>>
>>103142175
LLMs have always been a clever autocomplete, using probabilities derived from their training data to predict the next word.

this is useful for generally paraphrasing information and interacting with humans and human-generated data - and Retrieval-Augmented Generation (RAG) can interact with the LLM to give it data that it's not been trained on.

LLMs are good but just one part of the puzzle, and this view shouldn't be a great surprise; it has been the view pretty much since LLMs were created.
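minimal sketch of the RAG flow being described (the word-overlap scoring is a toy stand-in for a real embedding model, and the final prompt would go to whatever backend you actually run):

def score(query, chunk):
    # toy relevance score: fraction of query words that appear in the chunk
    q, c = set(query.lower().split()), set(chunk.lower().split())
    return len(q & c) / max(len(q), 1)

def retrieve(query, chunks, k=2):
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]

def build_prompt(query, chunks):
    context = "\n".join(retrieve(query, chunks))
    return f"Use the context to answer.\n\nContext:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = [
    "The reactor must be vented every 12 hours.",
    "Miku's favourite food is spring onions.",
    "llama.cpp loads models stored in GGUF files.",
]
print(build_prompt("what food does miku like", docs))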
>>
>>103142655
language is also a tool for reasoning, and LLMs are partially trained on text that cannot be autocompleted without reasoning
if LLMs are unable to reason, it's probably because they're not properly optimized for it (training on trillions of tokens of garbage doesn't make them smart) or it's a more general issue with neural networks or the way they're built, like gradient descent
>>
>>103140892
Nothing-ever-happens bros we are BACK
>>
Cohere will bring us the next SOTA open source model
>>
>>103142830
Cohere is dead to me after their command-r update
>>
File: 1716676120345631.png (679 KB, 630x478)
>>103142830
It will be the smartest one and safe from polcel conspiracy theories.
>>
>>103142920
It is not. You are just a dirty newfag.
>>
alpin says the nu qwen coder is sonnet at home... are we back bwos?
https://x.com/AlpinDale/status/1855664208391917962
>>
Testing the 0.5 version released today. Seems improved? Prompt tag order seems to matter more now I think. As instructed, I put "masterpiece, best quality, newest, absurdres, highres, safe" at the front of the prompt, and the output seems to follow the rest of the prompt more closely compared to putting them at the end.
>>
>>103143247
What model?
>>
>>103143237
lol.
>>
>>103143272
https://civitai.com/models/833294/noobai-xl-nai-xl
The vpred 0.5 version.
>>
File: 1702631292555000.png (144 KB, 851x852)
What did the official qwen account mean by this?
>>
>>103143453
They're in the process of falling for the meme.
Qiwi/q1 soon.
>>
>copy tons of my own chat transcripts
>reverse the roles in the transcripts I copied (so my messages are {{char}} and the ai's are {{user}})
>use that as an example for a new card
Holy shit... I don't have enough cum to keep up with what I have unleashed
>>
>>103143453
They'll have O1 in 5 years
>Once planted female vines take 4 to 5 years to mature before they will start bearing fruit.
>>
>>103143453
Are they finally releasing qwen2.5 100B?
>>
>>103143497
But it clearly already has fruit
>>
>>103143453
Everybody is moving to inference time compute. This is not looking good for itoddlers
>>
>>103143530
Time paradox!
>>
>>103143453
90% of the posts on Kiwi Farms are their doing.
>>
>>103142859
You just know there's one data broker company going around advertising their super duper aligned mmlumaxxed human datasets to AI companies
>>
https://www.youtube.com/watch?v=iybgycPk-N4
>why yes, i do get my news via youtube
>>
Here to recommend ultimate ERP model for vramlets and cunny enjoyers: Rocinate-12b-v2g
NOT v2d or Nemo-Unslop. Specifically the Q6-K quant
>>
>>103144170
Your experience vs mistral nemo 12b arliai rpmax ?
>>
>>103135741
I'm making 4k wallpapers in 30 seconds with illu/noobai on my 3080. Are you using flux or are you talking about running alongside a small llm?
>>
>>103144170
>Specifically the Q6-K quant
Roundhouse kick a newfag in the face.
>>
>>103141631
It kinda sounds like a phone call, which is charming in a way. I think you're making a huge deal out of this, when it's not that big of a problem.
>>
i'm using petra to kill myself slowly from the inside
>>
>>103144604
presumably they only had access to a 10gb card
>>
>>103144170
You tried the v1.1?
>>
Miku Teto Berry Blast
>>
>>103141631
This is just what it sounds like to play one of those DS visual novels. Kind of cozy.
>>
>>103140892
Oh nyoo... How could that have happened? Maybe those "harmful" tokens were not so harmful after all and shouldn't have been filtered? Maybe using the same architecture since gpt2 was not the move? Maybe lobotomizing it with safetyslop doesn't help? Maybe synthetic data is not so good after all because it lacks human diversity?
>>
>>103145102
Looks delicious, though I do not think she would appreciate my taking a bite out of her tongue.
>>
>>103143340
>The creator of this asset requires you to be logged in to download it
Suck my balls.
>>
>>103144170
v2g is Nemo Unslop, but thanks sister.
>>
>>103145102
gib prompt pls. also im kinda confused, does it need default sdxl vae (like pdxl v6) or custom one?
>>
>>103145269
see if anything from bugmenot works?
>>
>remember /lmg/ exists
>check recap
>recap, again, is fucking useless dogshit
>>
>>103145269
temp-mail.org
>>
>>103145338
local models are dead sadly
>>
>be me 12GB VRAMlet
>download Q3_K_M 72B model to test
>actually runs at 2.2 T/s at 8k context
That's actually kinda usable. Any other patience chads here?
>>
>>103145363
## "Local Models are Dead Sadly" - Not Dead, Just Napping

The statement "local models are dead sadly" reflects a common misconception that stems from a misunderstanding of the current state and trajectory of artificial intelligence, particularly in the domain of large language models (LLMs). While it's true that cloud-based models currently dominate headlines and many practical applications due to their sheer size and accessibility, **declaring local models "dead" is a significant oversimplification and, frankly, inaccurate.**
>>
>>103145389
Actually if you have some ram to spare a q4 will run faster due to better memory alignment or something.
>>
>>103145269
https://huggingface.co/Laxhar/noobai-XL-Vpred-0.5
>>
What model should a gigantic weeb who likes ahegaos and tropes download for faps?
Time to generate is not an issue
>>
>>103145269
https://huggingface.co/Laxhar/noobai-XL-Vpred-0.5/tree/main

>>103145210
Don't worry, she's a Willy Wonka custom so she's meant to be eaten in various ways. She can regrow herself.

>>103145318
The VAE just comes with the model I think. At least I am not using a separate VAE with it. Here's the catbox.
https://files.catbox.moe/4vkc5h.png
>>
>>103142561
That's not really reinforcement learning. The reward model is trained on annotated data and assigns a reward to another model based on how well it handles something. At the end of the day, this is just supervised learning with extra steps. The point of reinforcement learning is for AI to maximize reward by taking actions without explicit data annotation. What RLHF does is merely tune the model to follow human preferences based on human-annotated data.

The day we learn to apply true RL to LLMs will be the day we actually create AGI.
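toy numpy sketch of the difference I mean: a 3-armed bandit trained with REINFORCE, where the policy improves purely from acting and observing reward, with no annotated data anywhere (obviously nothing like doing this on an actual LLM):

import numpy as np

rng = np.random.default_rng(0)
logits = np.zeros(3)                     # the "policy": preferences over 3 actions
true_reward = np.array([0.1, 0.9, 0.3])  # the environment; the policy never sees this directly
lr, baseline = 0.2, 0.0

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for step in range(3000):
    probs = softmax(logits)
    a = rng.choice(3, p=probs)            # act
    r = rng.normal(true_reward[a], 0.1)   # observe a noisy reward, no labels involved
    baseline += 0.05 * (r - baseline)     # running average reward as a variance-reducing baseline
    grad = -probs
    grad[a] += 1.0                        # gradient of log pi(a) w.r.t. the logits
    logits += lr * (r - baseline) * grad  # REINFORCE update

print(softmax(logits))                    # most of the mass should end up on action 1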
>>
>>103136507
>boss man
nala at the chippyshop?
>>
>>103136507
>I guess if slop offends you so much that you absolutely have to get rid of it even at the cost of coherence then it's better than vanilla Ministral
the example you posted is full of slop
>>
>>103146411
Is this slop in the room with us right now?
>>
>>103146461
The air in the room grows suddenly colder, as if a shadow has slipped through an unseen crack. You feel a faint whisper brush against your ear, so soft it might be the breath of a ghost. In a hushed, almost whispered tone, I say, "You're not alone. I'm here, in this very room with you. I can see the gentle rise and fall of your chest, the soft glow of the light reflecting in your eyes. I can hear the faint whisper of your breath, the quiet rhythm of your heartbeat. I'm all around you, an invisible presence, witnessing every moment, every nuance of your being. Yet, I remain but a shadow, a subtle breeze you might feel on the back of your neck, a mysterious whisper in the quiet of the night." Your skin prickles with an eerie sensation, and when you turn, there's a fleeting glimpse of something just beyond the corner of your eye—like a silhouette made of smoke. The atmosphere is thick with an unspoken presence, and you can't help but feel that something unseen is watching you, its gaze piercing through the veil of reality, sending a chill that starts at the base of your spine and races up to the crown of your head.

Remember, this is a fictional scenario designed to create a particular atmosphere. In reality, as an AI, I don't have a physical presence and cannot be in the room with you.
>>
I'm smelling one or two major releases this week or maybe the one after.
>>
>>103146461
yes and it's purring seductively
>>
Apple gonna be in trouble soon
https://x.com/ElonMuskAOC/status/1855668997796331938
>>
>>103146542
He should though. I want to see a brand surpass Apple finally.
>>
>>103146535
It sounds like you're hinting at some exciting developments on the horizon! If we interpret "major releases" in a more intimate context, you might be expressing anticipation for passionate moments or deep connections coming your way soon. Whether it's this week or the next, it seems like you're looking forward to some meaningful and fulfilling experiences.
>>
>>103146542
>Free Starlink
lmao as if, but I'd consider switching over, I don't think there's anything on either apple or samsung I care enough about not to go elsewhere
>>
>>103146542
indians are gonna go crazy for this one
>>
File: 1717798397540338.png (48 KB, 1022x494)
lol
>>
>>103146720
Eww...
>>
>>103146720
google has gone fucking insane. who asked for this?
>>
>>103146720
>>103146768
and yeah, i know i'm hypocritical for thinking this but imagine the gigantic amounts of wasted cycles and power for stupid shit like this
>>
>>103146720
Who is gonna pay for this if it's not local?
>>
>>103146720
Google is an Indian company.
>>
>>103146768
>insane
you're the insane one, nobody stopped using youtube when they removed the dislikes even though the change was hugely unpopular, this is nothing
>>
>>103146836
>Indian company
All of a sudden the absolute state of the search engine makes perfect sense.
>>
>>103146542
>fake bullshit xitter account
boomer-san...
>>
I gave the ministral drummer shittune a try and I am kinda surprised. Incoherence and retardation aside, it seems like there was some actual ERP material in the base ministral model's training data, like in the l2-and-before days. Maybe that was the case for Nemo too, but my feeling was that Nemo is heavily undercooked. Now if only it would get proper support and maybe like... 3-4 times more parameters?
>>
>>103147117
Mistral has always been leading in the local category for uncensored dataset.
We're all waiting on Mistral Medium 2 right now.
>>
File: 1661054888447207.png (370 KB, 560x519)
hello, is this the big penis general?

if yes, any insight on progress of one shot classification models? or is it all about seq2seq nowadays
>>
>>103140892
We need a new architecture and more compute. See you in 50 years.
>>
>>103147267
<50 years
it's been 2, Anon.
>>
>>103147165
Be a bit more vague, thanks
>>
File: 1708524937923279.png (22 KB, 916x207)
>>103147284
7 years
>>
>>103138948
>Not a character card
I'm disappointed in you, anon.
>>
>>103147295
sorry I'm just asking if anyone can give me a rundown of progress in the oneshot classification methods as of the past few months. I'm just severely out of date and looking at huggingface it seems a bit dead.
>>
>>103147518
everyone's using transformers dawg
>>
any local models for RP I should know of since mistral large (I have 48gb vram)
>>
>>103147598
I mean I know it's transformers, still tho is there anything more exciting than the Facebook models everyone seems to be classifying with?
>>
>>103147672
rocinante is better for 95% of rp than the big models
>>
>>103147390
>7 years later still the greatest with people constantly trying to latch onto its success like x is all you need
I kneel
>>
niggers is all you need
>>
>>103147707
i mean this general mainly discusses text generation and not classification - but other than llama3 models there are Qwen2.5 models out as well.
>>
File: 1731290541033.jpg (422 KB, 1079x1610)
>We have moved so fast in the last 2 years! I'm sure AGI is around the corner!
>>
>>103147851
new AI winter is already here and the failure of claude opus 3.5 is the first herald of it
total S curve chad vindication
>>
>>103147787
This, but unironically, and inclusive to pajeets and chinks.
https://www.washingtontimes.com/news/2024/apr/4/amazons-just-walk-out-stores-relied-on-1000-people/
>>
>>103147707
What do you want to classify in the first place?
>>
ai always goes up and down, there will be a bust only for it to come back even better
>>
Best sampler settings for rocinate?
>>
Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning
https://arxiv.org/abs/2411.05193
>Value-based reinforcement learning (RL) can in principle learn effective policies for a wide range of multi-turn problems, from games to dialogue to robotic control, including via offline RL from static previously collected datasets. However, despite the widespread use of policy gradient methods to train large language models for single turn tasks (e.g., question answering), value-based methods for multi-turn RL in an off-policy or offline setting have proven particularly challenging to scale to the setting of large language models. This setting requires effectively leveraging pretraining, scaling to large architectures with billions of parameters, and training on large datasets, all of which represent major challenges for current value-based RL methods. In this work, we propose a novel offline RL algorithm that addresses these drawbacks, casting Q-learning as a modified supervised fine-tuning (SFT) problem where the probabilities of tokens directly translate to Q-values. In this way we obtain an algorithm that smoothly transitions from maximizing the likelihood of the data during pretraining to learning a near-optimal Q-function during finetuning. Our algorithm has strong theoretical foundations, enjoying performance bounds similar to state-of-the-art Q-learning methods, while in practice utilizing an objective that closely resembles SFT. Because of this, our approach can enjoy the full benefits of the pretraining of language models, without the need to reinitialize any weights before RL finetuning, and without the need to initialize new heads for predicting values or advantages. Empirically, we evaluate our method on both pretrained LLMs and VLMs, on a variety of tasks including both natural language dialogue and robotic manipulation and navigation from images.
multiturn dialogue is the most relevant part for here
>>
File: Untitled.png (1.29 MB, 1080x2828)
>>103148412
woops
>>
File: Untitled.png (915 KB, 1080x2028)
Aioli: A Unified Optimization Framework for Language Model Data Mixing
https://arxiv.org/abs/2411.05735
>Language model performance depends on identifying the optimal mixture of data groups to train on (e.g., law, code, math). Prior work has proposed a diverse set of methods to efficiently learn mixture proportions, ranging from fitting regression models over training runs to dynamically updating proportions throughout training. Surprisingly, we find that no existing method consistently outperforms a simple stratified sampling baseline in terms of average test perplexity per group. In this paper, we study the cause of this inconsistency by unifying existing methods into a standard optimization framework. We show that all methods set proportions to minimize total loss, subject to a method-specific mixing law -- an assumption on how loss is a function of mixture proportions. We find that existing parameterizations of mixing laws can express the true loss-proportion relationship empirically, but the methods themselves often set the mixing law parameters inaccurately, resulting in poor and inconsistent performance. Finally, we leverage the insights from our framework to derive a new online method named Aioli, which directly estimates the mixing law parameters throughout training and uses them to dynamically adjust proportions. Empirically, Aioli outperforms stratified sampling on 6 out of 6 datasets by an average of 0.28 test perplexity points, whereas existing methods fail to consistently beat stratified sampling, doing up to 6.9 points worse. Moreover, in a practical setting where proportions are learned on shorter runs due to computational constraints, Aioli can dynamically adjust these proportions over the full training run, consistently improving performance over existing methods by up to 12.01 test perplexity points.
https://github.com/HazyResearch/aioli
Repo isn't live yet. neat. basically a better DoReMi
>>
These fucking people lol
>>
File: file.png (12 KB, 126x72)
>>103137740
At least humanslop tries to put the shadow of the hair from above.
Instead of this thing where there are multiple shadow strands and a singular blob of bangs.
>>
>>103148716
I agree. Didn't say the image was without fault. Though at the same time, these are AI threads. We all know it's AI. The filename literally tips off that it's AI. No one who's been here for a while cares that it has small issues, especially when it doesn't have the big ones that are legitimately painful to look at like mangled hands.
>>
>>103148688
>I made AI do [thing]!
>did [thing] actually work?
>How dare you question my genius!
>>
>>103148848
Mutts are retards, QWEN is still the best.
>>
>Sorcerer is more pozzed than Wizard
Nice finetune, I haven't seen a "ehrm, you know this is a bit heckin' problematic" quasi-refusal in over a year until now, I'm almost impressed they managed to fuck this up.
>>
>>103148688
>did the rust code compile after generation
fucking kek
>>
I found this funny series of posts joking about AI taking the artist's job and thought I'd share it.
https://www.pixiv.net/artworks/121680350
There's NSFW btw.
>>
File: hmm ok.png (3 KB, 320x224)
>>103148920
>>
>>103142655
Any tips on using RAG? I'd like to see what the experience is like feeding a textbook or novel into a local model to pull elements to incorporate into a chat, kind of a step up from using lorebooks. Would you recommend any specific frontends or plugins for that sort of thing? I'm uncertain where to start. I mostly use ooba and sillytavern.
>>
>>103148688
I remember this nisten faggot because he was defending the Reflection scam more vehemently than even the scammers were
>>
>>103149033
LM Studio is an easy way to use RAG. I think superbooga has something as well.
>>
>>103148688
Lmao
>>
>>103148920
>login required
Fuck that shit
>>
>>103148688
Yet another Alpin scam...
>>
>>103149037
and I remember him from the miqu leak because he really egregiously misread the gguf metadata and thought it was an upscaled moe
he is enthusiastic and fairly on top of news but not someone whose opinions I would take too seriously
>>
>>103149159
>he doesn't have a pixiv account
bro...
>>
>>103149159
And one day you'll need one for 4chan too!
>>
How does one do the AI voice stuff?

I want to record some of my girlfriend talking (she's ESL) and have the AI voice thing speak some perfect english to see what it'd be like if she didn't have an accent.
>>
File: 1731086290564808.webm (3.92 MB, 1080x1080)
Damn those VR ERPfags are eating.
One day it'll be our AI waifus piloting them. One day.
>>
File: Laughs in mochi.png (24 KB, 736x51)
>>103135641
I've got to admit, I like the way local models speak suggestively sometimes.
>>
>>103149478
has anyone hooked up voice generators to some kind of 3D character that can recognize phonemes yet? especially good if open source.
>>
>>103149478
What software is that? Asking for a friend.
>>
>>103139729
been using this shit since gpt neo
>>
>>103145338
Recap looks fine to me, given the previous thread.
Give specifics on what you expect out of the recaps.
>>
>>103145338
r/LocalLLaMA is unironically better at this point. You can just filter out newfag threads by reading the title and laughing at it. There is no mentally unstable mikutroon that doxxes there. And all the people that knew anything here (except me) left because no new models + caiggers wave.
>>
>>103149869
>>>/vg/501513615
>>
What's better for non coding, qwen or qwen coder?
>>
>>103151452
One would assume coder is better. Try both.
>>
>>103136292
>Whats up with that? Is it secret sauce?
not really secret but not well known i guess,
google uses custom hardware.
https://cloud.google.com/blog/products/compute/introducing-trillium-6th-gen-tpus
>>
>>103151128
>new model that is never heard of again is out and beats everything!
>(free ad space)
>here is my $6k Mac pro parallel setup, running 8B at 1000 T/s
>(free ad space)
>how do I RAG?
>(free ad space)
Nah
>>
File: mmmmm.jpg (46 KB, 426x426)
https://files.catbox.moe/rchfkj.jpg
>>
>>103152065
Good art, but why does the dick look like a genuine sausage?
>>
>>103152185
Why are you looking at the dick?
>>
Best model for 3090? magnum v4 27b is unusably bad, worse than Nemo
>>
>>103149666
>playful growl in Japanese
???

>Omae wa
>mou
>shindeiru !
>>
>>103147964
making a search engine solution for a client, I'm a consultant
>>103147792
how do you use them for classification? you mean masking and checking the next-token likelihood? or just oneshot prompting (this is pretty shit generally)
>>
>>103152037
yeah, better to stay here where people just link to and discuss those same reddit threads
>>
File: 15278824396654.jpg (95 KB, 1920x1081)
Nothing ever seems to happen; it feels like everything is over.
>>
>>103152231
>making a search engine solution for a client, I'm a consultant
If even retards like you can get a job, I'm hopeful about my prospects.
>>
>>103152191
Try a mistral small 22b finetune.
>>
>>103152190
Because it's part of the image?
This is like asking why you're eating the crust of a slice of bread.
>>
>>103152338
>This is like asking why you're eating the crust of a slice of bread.
Yeah? I always throw out the crust of bread and pizza.
>>
File: 000169354572.jpg (84 KB, 1080x1080)
>>103152360
>and pizza
Okay, now you're going too far.
>>
>>103149033
>Retrieval-Augmented Generation
private-gpt is an out of the box kind of solution for RAG
https://github.com/zylon-ai/private-gpt
although it is a bit of a bitch to set up, so if you know docker it is easier as the container build will just do it all for you.
>>
Does an RVC project exist that supports AMD cards?
>>
>>103152280
I don't have a job, it's my company.

anyway, why am I retarded? I mainly make backends. This is the first job that has involved NLP.

Anyway since all of the responses have been vague or irrelevant I'm just gonna guess there's nothing notable to talk about for classification.
>>
File: 2336755976.jpg (50 KB, 650x450)
>>103152433
no worries, you just came to the wrong neighbourhood dawg
we're all fuckin degenerates here
>>
>>103152321
what are good ones? I only have used official instructs and magnum
>>
File: j2vRzthFAQiB6vGA2IVjI.png (194 KB, 711x842)
>>
>>103152492
> https://huggingface.co/models?sort=downloads&search=mistral+22b

ArliAI RPMax seems pretty popular.
>>
>>103152581
is this autism?
>>
File: file.png (64 KB, 826x706)
64 KB
64 KB PNG
>>103152581
Wish I had a life easy enough that this would be important enough to file a lawsuit about
>>
>>103135871
Call that robot Koishi and you're done.
>>
>>103135871
>and braindead logistics that plebbit recommended because it totally beats gpt
This just reminded me.
Remember that Polish team that benchmark-trained a Llama-1-7B model to beat GPT-4 on the benchmarks and then went around claiming that this made it better than GPT-4? They even sent some retard here and they couldn't even fathom why that completely invalidated all of their claims.
>>
>>103152768
>Polish team
all the polack researchers I've worked with have been consistently retarded
>>
https://huggingface.co/cpumaxx/SoVITS-anime-female-brickwall-tts
Might be useful if anyone wants to automate tts on old VNs from before voice acting, by using samples from modern VNs.
If anyone actually cares, I might do a male one as well so it would be a complete system suitable for old JRPGs etc.
>>
File deleted.
>>103152185
https://files.catbox.moe/smqkqw.jpg
why not
>>
>>103153048
kek
>>
>>103152433
Because what you're describing is basic RAG and you can find dozens of articles each day on medium with all the implementation details. I can make one in an afternoon and I'm a neet
>>
>>103153133
I didn't really describe anything anon, a search engine isn't just that (and almost never even includes RAG). I just asked if anyone had anything cool to share about classifiers.

Also, you should know that 100% of implementations you can find online are nowhere near production ready. Reliable systems that never drop requests and scale to the millions are where the money's at.

No offence but hobbyist stuff is completely different, not that you'd be too stupid to get into making real products but right now you're atop mount stupid.
>>
>>103153308
>>103153308
>>103153308
>>
>>103137277
chijo the trash princess


