/g/ - Technology

File: 1701617568980511.jpg (1.06 MB, 2925x4096)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103189328 & >>103173457

►News
>(11/12) Qwen2.5-Coder series released https://qwenlm.github.io/blog/qwen2.5-coder-family/
>(11/08) Sarashina2-8x70B, a Japan-trained LLM: https://hf.co/sbintuitions/sarashina2-8x70b
>(11/05) Hunyuan-Large released with 389B total and 52B active parameters: https://hf.co/tencent/Tencent-Hunyuan-Large
>(10/31) QTIP: Quantization with Trellises and Incoherence Processing: https://github.com/Cornell-RelaxML/qtip
>(10/31) Fish Agent V0.1 3B: Voice-to-Voice and TTS model: https://hf.co/fishaudio/fish-agent-v0.1-3b

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
petra anchor
>>
dead general ahead
>>
*shits in your thread*
>>
>>103196822
Kill the yellow one, this is sexual assault.
>>
Sexually assault the yellow one
>>
File: file.png (277 KB, 692x868)
>>103196692
>bitnet meme
placebo psyop

>>103196719
there are, but obviously we can't see them without foresight. they'll just drop one day and that's it
>>
File: 1696366778002505.png (304 KB, 500x563)
>yellow one
>>
>>103196865
Looks like we have visitors from india.
>>
>>103196890
>Sexually assault the yellow one
>>
>>103196891
Honestly, my big problem with your post
>>103196375
was this part:
>imo it's retarded fitting niche knowledge inside models
because I'm slightly scared of the "rag is all you need" thing becoming a mainstream idea, and meta or mistral then further filtering models while still giving us bog-standard transformers. that's why I seethe so much when people mention rag for niche stuff, since I know stuff usually only gets shittier, so yeah
>>
>>103196939
It makes sense because your general IS india.
>>
>>103196891
>there are, but obviously we can't see them without foresight. they'll just drop one day and that's it
By definition, there literally are not. A model will not magically know how to do some niche RP just by having some RP in context unless you increase processing time. It will do badly. If you use a new special paradigm where you are letting the model think, giving the model entire books worth of RP in context, and/or doing training at test time, all of those literally mean that you are trading off compute, which means your "prompt processing" time skyrockets. People will not use this shit for RP.
>>
>>103196939
serbia*
>>
>>103196947
>because I'm slightly scared of the "rag is all you need" thing becoming a mainstream idea
I hate that shit too, being able to do RAG on niche knowledge doesn't even come close to having that stuff actually baked into the model when you want to do anything beyond simple info retrieval.
>>
>>103196891
>we can't see them without foresight
I don't see how the presence of skin on genitals would positively affect one's visionary capabilities.
>>
>>103196947
again, you are thinking about the current "pick topk matches from a vector db" approach which, as i've said, is a retarded and dogshit way of handling things. there will be better ways of doing this (maybe even querying smaller llms trained only on the niche stuff?), but again, we can't know them beforehand so we'll have to wait and see

>>103196959
again, re-read the reply chain, you aren't following me
>>
File: nala.png (260 KB, 731x851)
Does this pass?
>>
>>103196996
>a handful instead of a pawful
nah, slop
>>
>>103196992
>there will be better ways of doing this (maybe even querying smaller llms trained only on the niche stuff?),
still, for the 100th time: with whatever flavor of rag, if the model fucks up on something it should know, it can't pick itself up before the next message when the stuff gets inserted. it would need to query rag at every token for your idea to be remotely viable
>>
>>103196959
>A model will not magically know how to do some niche RP just by having some RP in context unless you increase processing time.
But that *is* the increase in processing time. The time it spends processing the context.
>>
>>103197014
for the 100th time, I KNOW HOW THE CURRENT RETRIEVAL-AUGMENTED-GENERATION AND CURRENT LOCAL LANGUAGE MODELS WORK, and i'm not talking about those
>>
>>103197026
then why do you suggest our CURRENT models need more filtering and less niche data?????
>>
>>103197026
What the fuck is RAG and how does it benefit me? I've seen it on some models but I have no idea how to enable it.
>>
>>103197032
i said nothing about current models except that they're all shit and the current tech is hitting the ceiling. i was talking about future models there
>>
>>103196992
All you said to me was:

>imo it's retarded fitting niche knowledge inside models, i think people will understand this in the long run. stuff like rag, infinite context, ttt, etc... can all solve the niche knowledge "issue" while keeping the actual "reasoning" core small

>in 5 years it's gonna be laughable how ancient the current tech was

>current models using current training methods, yes. this doesn't change the main point: niche info should stay outside, we need better/faster/smaller "reasoning" cores

>you are thinking in terms of current shitty tech, i'm talking about upcoming new paradigms

>there are, but obviously we can't see them without foresight. they'll just drop one day and that's it

That is literally all you said to me. What these posts imply is a method that uses a small model that's good at reasoning with the capability to use more processing time to solve the problem of niche knowledge use. If that is not what you meant then you need to speak with more clarity.

>>103197017
I meant over the baseline. If we use infinite context, we are inserting not just "some" RP but fucktons of it. If we use TTT then we are increasing processing time. If we use <thinking> then we are increasing processing time.
>>
>>103197058
>how does it benefit me?
It doesn't. It's being sold as a replacement for having models actually learn stuff via finetuning / training, but it's absolute shit at being useful in any way
>>
Wait, are the big-boy m4 and the mac studio not expected until mid 2025? Why is anyone even talking about them in non-abstract relation to things you can actually buy today?
>>
>>103197064
>That is literally all you said to me. What these posts imply is a method that uses a small model that's good at reasoning with the capability to use more processing time to solve the problem of niche knowledge use. If that is not what you meant then you need to speak with more clarity.
yes, that's what i meant. what's your point? besides you complaining about current llms/rag (which i agree about) i don't see what's triggering you
>>
>https://huggingface.co/TheDrummer/Cydonia-22B-v1.2-GGUF
>https://huggingface.co/anthracite-org/magnum-v4-22b
who would win?
>>
>>103197058
basically the llm can search stuff inside a big txt and output chunks of it. that's it
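For reference, the "pick topk matches from a vector db" flavor being argued about above looks roughly like this. A minimal sketch, assuming sentence-transformers; the model name, chunking, and prompt format are illustrative, not any particular frontend's implementation:
[code]
# bare-bones RAG: chunk the big txt, embed the chunks, pull the top-k
# nearest ones for the query, and stuff them into the prompt
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative choice

def build_index(big_txt, chunk_size=500):
    # naive fixed-size chunking; real setups split on paragraphs/sentences
    chunks = [big_txt[i:i + chunk_size] for i in range(0, len(big_txt), chunk_size)]
    return chunks, embedder.encode(chunks, convert_to_tensor=True)

def augment(query, chunks, chunk_emb, k=3):
    # cosine-similarity search over the chunk embeddings
    q_emb = embedder.encode(query, convert_to_tensor=True)
    hits = util.semantic_search(q_emb, chunk_emb, top_k=k)[0]
    context = "\n---\n".join(chunks[h["corpus_id"]] for h in hits)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
[/code]
Note the retrieved chunks enter the prompt once, before generation, which is exactly the "can't pick itself up before the next message" limitation complained about elsewhere in the thread.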
>>
>>103197098
Magnum
>>
>>103197094
There are two anons talking to you. I'm the one seething about rag, and my argument is that our current models would be orders of magnitude better if we did the exact opposite of your suggestion and trained them on MORE niche stuff
>>
>>103197130
>my argument is that our current models would be orders of magnitude better if we did the exact opposite of your suggestion and trained them on MORE niche stuff
true, that's why claude works, but every single post i made was about the next big thing, not current llms
>>
>>103197103
that's... useless as fuck? who even thought that would be a good idea lmao
>>
File: nala2.png (195 KB, 729x579)
>>103197008
That's just lack of intelligence. Plenty of slop there, though.

It does seem to have some range, at least.
[code]
models:
  - model: EVA-Qwen2.5-32B-v0.2
  - model: Qwen2.5-32B-ArliAI-RPMax-v1.3
merge_method: slerp
base_model: EVA-Qwen2.5-32B-v0.2
parameters:
  t: [0.1, 0.3, 0.6, 0.3, 0.1]
dtype: bfloat16
[/code]
if anyone cares.
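For anyone wondering what merge_method: slerp actually does to the weights, here's a rough numpy sketch of the math. This is not mergekit's actual code; mergekit works tensor-by-tensor and (iirc) interpolates the t list across layer depth, so with [0.1, 0.3, 0.6, 0.3, 0.1] the first and last layers stay close to the EVA base while the middle layers take more from the RPMax side:
[code]
# spherical linear interpolation between two weight tensors (the idea
# behind merge_method: slerp) -- a sketch, not mergekit's implementation
import numpy as np

def slerp(t, a, b, eps=1e-8):
    """Interpolate from a (t=0) toward b (t=1) along the great circle."""
    a_unit = a.ravel() / (np.linalg.norm(a) + eps)
    b_unit = b.ravel() / (np.linalg.norm(b) + eps)
    omega = np.arccos(np.clip(np.dot(a_unit, b_unit), -1.0, 1.0))
    if omega < eps:
        # tensors are nearly parallel, plain lerp is numerically safer
        return (1 - t) * a + t * b
    return (np.sin((1 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)
[/code]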
>>
>>103197142
>but every single post i made was about the next big thing, not current llms
Then we talked past each other, sorry. but as I said, I'm trying to prevent this
>>103196947
>because I'm slightly scared of the "rag is all you need" thing becoming a mainstream idea, and meta or mistral then further filtering models while still giving us bog-standard transformers. that's why I seethe so much when people mention rag for niche stuff, since I know stuff usually only gets shittier, so yeah
from happening
>>
>>103197150
it's useful when you don't have a structured sql db or whatever and simply want to extract some topic from a big file and make the llm answer about it
>>
>>103197162
ok but imagine, hypothetically, we have 1b reasoning cores that are like o1-tier smart and they run extremely fast locally, and then instead of character cards with prompts we have smaller llms/db/whatever that people can share/combine with niche data with no significant performance loss.

how's that bad?
>>
File: rip.png (373 KB, 748x467)
>>103196928
>"kill the yellow one"
>...
>
>>
>>103197180
>instead of character cards with prompts we have smaller llms/db/whatever that people can share/combine with niche data with no significant performance loss.
>imagine we lived in a utopia
yeah, cool idea, I'm in. when do I move there?
>>
>>103197209
>when do I move there?
Two weeks from now!
>>
File: rule79.png (1.27 MB, 1536x1536)
►Recent Highlights from the Previous Thread: >>103189328

--Paper: Cut Your Losses in Large-Vocabulary Language Models:
>103190878 >103190913 >103192090 >103192169 >103192218 >103192306 >103192172
--Paper: Anon posts paper on Adaptive Decoding, gets roasted:
>103196409 >103196636 >103196491 >103196538 >103196635 >103196637
--Papers:
>103191533
--Optimizing high RAM and single GPU for model running:
>103192399 >103192430 >103192532 >103192545 >103192563 >103192572 >103192579 >103192598 >103192642 >103192671
--Local models' progress and limitations, niche knowledge benchmarking:
>103196286 >103196304 >103196351 >103196316 >103196372 >103196477 >103196348 >103196426
--Feasibility of running large language models on consumer-grade hardware:
>103196008 >103196028 >103196094 >103196150 >103196236 >103196441 >103196456 >103196483
--E2 F5 tts/voice cloning model impressions and comparisons:
>103193074 >103193110 >103193120 >103193881
--BitNet discussion and skepticism about its potential impact:
>103195031 >103195043 >103195492 >103195555 >103195748 >103195785 >103195822 >103195865 >103195854 >103195868
--BARC models achieve SoTA on ARC public validation set:
>103195274 >103195356
--Athene V2 model release sparks skepticism and criticism:
>103194892 >103194920 >103194945 >103194986 >103194995 >103195023
--Anon seeks advice on local LLM setup for sex doll project:
>103191005 >103191113 >103191289 >103191298
--Anon discusses Gemini-Exp-1114, and the concept of plateauing in AI model improvement:
>103191786 >103191841 >103191851 >103191899
--GIGABYTE's new 48G AI graphics card, price and software support concerns:
>103195641 >103195795 >103195841
--Miku (free space):
>103189736 >103189877 >103190104 >103190206 >103190340 >103190706 >103190837 >103190954 >103191077 >103191123 >103191345 >103191400 >103191994 >103192312 >103196284 >103196845

►Recent Highlight Posts from the Previous Thread: >>103190032

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
>>103197209
>utopia
>>103197212
>Two weeks from now!

Actually, it's XL even!
>about 1 year ago
https://huggingface.co/Undi95/UtopiaXL-13B
>>
>>103197209
when new paradigm drops....
>>
https://x.com/kimmonismus/status/1857155117478215694
>>
lecun is a grifting hack but he's right in that specific tweet
>>
>>103197094
What do you mean? If that is your point, then why would you keep arguing that it's a solution to the problem you and I have supposedly been talking about all this time? You are essentially arguing for something that doesn't actually benefit the people here unless you wait like 10 years for Nvidia to give us good cheap GPUs.
>>
he's a hack but he's not a grifter
>>
>>103197342
>unless you wait like 10 years for Nvidia to give us good cheap GPUs.
lol

i'm arguing to wait a couple of months for a new paradigm, unless you are satisfied with the current coombot experience
>>
>>103197374
>i'm arguing to wait a couple of months for a new paradigm
We've been using practically the same model arch since LLAMA1 in April of last year. What makes you think your magic super models are coming out anytime soon when we haven't even had proper useful models on architectures talked about for a year? We barely have usable context above 32k

So you're essentially saying to just stop trying and wait months for the next nothingburger?
>>
Pajeet thread.
Kill yourself.
>>
File: Capture.png (101 KB, 927x199)
when did he buy an ad? why did he buy an ad?
>>
>>103197374
I never said I was, and yes I did understand that what you implied was a 2mw scenario. I'm saying that the new things that will come out will not really be usable by people here because we are people who like to use things on demand. If a model needs like 5+ minutes to process a prompt or even to respond to each new reply, it severely decreases the enjoyment and turns it into a different kind of experience. I know some people here are fine with that, and that's good for them, but I would guess most here like fast replies instead of an e-mail-like experience. /aicg/ would laugh.
>>
>>103197435
>when did he buy an ad?
A while ago already
>why did he buy an ad?
For the funny (and to btfo the buy an ad schizo)
>>
>>103196822
giwtwm...
>>
>>103197417
>what makes you think your magic super models
exactly because
>we haven't even had proper useful models on architectures talked about for a year

i don't usually trust reddit rumors, but gemini2/claude3.5opus/grok3 being barely better than the previous models (after throwing 10x the money at their training) would make sense. people will start using their heads to improve the architecture now (or make a new one) instead of throwing even more money at compute. gotta do something with all that vc money

>So you're essentially saying just stop trying and wait months for other nothingburgers to wait for?
i'm saying, assuming algorithms stay the same but gpus improve, you WILL get local claude in 2 more years, but it will be the same shivering slopbot it is now.

proper ai waifus require a different architecture, or some significant algorithmic improvements.

>>103197436
>If a model needs like 5+ minutes to process a prompt or even to respond to each new reply, it severely decreases the enjoyment and turns it into a different kind of experience.
again you are assuming tech will never move past gpt transformers...
>>
>>103197486
>people will start using their heads to improved the architecture now (or make a new one) instead of throwing even more money at compute. gotta do something with all those vc money
Why do something that requires thinking when you can scam by just trying to recreate o1 and get tons of vc money that way? They won't make new archs, they'll make tiny benchmaxxed reasoning models that are super great on paper and unusable for anything, like phi

>again you are assuming tech will never move past gpt transformers...
And you're assuming we will move past transformers when all things point to that not being the case for a while longer. If llama4 is transformers we're cooked
>>
>>103197058
How does having more relevant information help you correctly answer a question? The idea's to grab extra context and include it in the message sent to the llm
>>
>>103197527
Which makes RAG great at one-shot trivia questions, not at actually using the knowledge in roleplay or story writing

Great for:
>what is xyz
>xyz is ...

Not great for:
>(model mentions xyz for whatever reason) says something retarded
>rag injects correct xyz info after
>ah eto bleh...

TLDR: fuck RAG
>>
>>103197486
>again you are assuming tech will never move past gpt transformers...
Ok, I see the difference in our understanding of the point of this conversation. You're talking about general performance improvements that would happen to also benefit RAG/ICL scenarios. I'm talking specifically about performance that could make test-time context insertion on par with pretraining. It's obvious that if models improve in general, they will do better in RAG/ICL scenarios. But that also means they will likely be better with pretraining as well. So if you pretrain + context insert, it will be EVEN better than either solution separately.
>>
>>103197452
>buy an ad schizo
the man is doing god's work. there was a time when guerrilla marketing was like half of every thread.
He may be a madman, but he's our madman.
>>
>>103197560
>there was a time when discussing models was like half of every thread
and now we have thread splitting drama and discussion on neutering models further, with the occasional twitter screenshot.
indeed an improvement
>>
>>103197523
>Why do something that requires thinking when you can scam by just trying to recreate o1 and get tons of vc money that way? They won't make new archs, they'll make tiny benchmaxxed reasoning models that are super great on paper and unusable for anything, like phi
*usable for everything except cooming

also o1 is shit, even the vcs will get tired of that grift eventually

>And you're assuming we will move past transformers when all things point to that not being the case for a while longer. If llama4 is transformers we're cooked
llama4 is definitely another gptslop. besides that, yeah, i admit it's purely wishful thinking, but given the current speed of innovation i'm 99% certain gpts won't be used anywhere in 2029
>>
>>103197527
I can answer the questions myself, I have a PhD. I want the LLM to roleplay with me as a cute girl, how exactly does RAG help with that?
>>
>>103197590
maybe I'm retarded, but doesn't it function as long-term memory to skirt context limitations? Short-term memory is in-context learning, and RAG is the other details of her life that get pulled in as relevant.
>>
Actually, there is one thing that RAG will never give you no matter how much test-time compute you use (unless you put the entire internet into your context), which is creativity. For instance, a model trained on all of humanity's data will be able to creatively think of references and allusions. Like maybe it thought of a wacky and interesting time to drop a Deus Ex quote. To do that with a small reasoner model, you could set up a system to insert random wiki pages and knowyourmeme articles into context, but it would still be inferior to a model that simply knows best when and how to use all of earth's memes or to think of creative directions for the story. Maybe you could get a magical infinite context model to process all media wikis and all of KYM, but how much processing time is that going to be for the user?
>>
>>103197638
Correct, had written this up before but didn't post it

A big issue with RAG, even a "perfect" one that could predict what it might need to inject, is that it still can't replace the breadth of information and context the model sees during training. Take mesugaki for example, as that was mentioned: even if it feeds the model a perfect wikipedia-level entry on it, the model won't know the subtle contexts attached to it, won't know about some of the comments posted by JP people on twitter, all the brat-needs-correction stuff, how to properly portray a mesugaki character, etc.
>>
>>103197582
Just ignore the splitting schizo and he will go away eventually. A single person autistically screeching can hardly be called drama.
>>
>>103197638
>>103197653
is there no "pop culture" training set that gets baked into all these models? Or would that be too low-brow for serious data scientist types to stomach?
>>
>>103197653
Sounds like something that could be remedied with multi-shot RAG. Just give it more examples and it should be able to figure it out. Like cultural CliffsNotes.
>>
>>103197683
>with multi-shot RAG
>>103197638
>but how much processing time is that going to be for the user?
>>103197680
there is (not enough imo), but there was also a suggestion that they shouldn't be
>>103196375
>imo it's retarded fitting niche knowledge inside models
which is what brought on this whole thing about rag
>>
>>103197680
As far as I know, that's what they view filtered Common Crawl and FineWeb as, and they think that's both sufficient and morally correct given that they don't think that models should be able to "talk about" child abuse material. >>103160967
>>
>>103197151
Sorry for shilling, but this is the first time I've fired up mergekit.
After doing some ERP, I've noticed that Eva has better prose but is very terse, like vanilla Qwen. The ArliAI tune is extremely slopped, nearly identical to ~every coom model ("musky scent", "not giving time to adjust", etc.). There seems to be a happy medium between these two.
>>
>>103196822
Chipping teeth with Miku, Teto, and Neru
>>
>November 2024
>still not even one (1) good open source language model
>>
>>103197749
petra*
>>
File: 00020-668466529.png (1.76 MB, 720x1328)
I can't decide if this technique produces art or slop. I kinda like it
>>
>>103197752
Mine are good, must be a (You) problem.
>>
>>103197797
name (1)
>>
new GPU just got in
llama 70b running at 30 tokens/s
CPUcels seething
>>
>>103197752
>open source
open weights or open data/training pipeline/etc?
or were you hoping for a gotcha on this topic
>>
>>103197815
nah I just meant open weights
I don't care about seeing the TD
>>
>>103197752
>>103197805
what are you trying to do that you're satisfied with over API but not locally? also which framework(s) are you using?
>>
>>103197828
>open weights
then L405b, deepseek 2.5, largestral and even qwen 2.5 72b are all very useful depending on the task. Literally things you couldn't do a year ago.
>>
>>103197590
Get a PhD in machine learning, retard
>>
>>103188780 (Cross-thread)
>>103188780 (Cross-thread)
>>103188780 (Cross-thread)
real thread. Steer clear of the spam.
>>
>>103197877
can you autists stop fighting over who runs /lmg/ so we don't have to deal with this bullshit
>>
>>103197877
what's the difference? both are equally real to me
>>
>>103197582
>babby's first weebshitter witness
Reap what you sow, i guess. letting them in is the first mistake of every single general; it devolves into blatant avatarfag / gay erp spam.
>>
>>103197435
How good is Rocinante?
>>
small 22b q8 or nemo12b fp16
why and what 'tune
>>
>>103197582
Yeah, Serbiafag absolutely decimated this place. at this point I just come here to grab information and post it to reddit.
>>
>>103197991
Great according to some, but I prefer Lyra4 and its walls of text.
>>
File: photo.jpg (221 KB, 2000x1332)
>>103197998
>fp16 (x2)
>>
I am serbiafag
>>
>>103197888
If that's the price of avoiding reddit fake politeness and groupthink then I'll take it
>>103197901
Pretty much. How many here actually give a fuck if there are two threads every couple of months?
>>
>>103198015
hi petra
>>
File: markdown or something.png (26 KB, 692x229)
SillyTavern goes berserk if a paragraph is indented with four spaces. There is probably some retarded bit of the markdown standard that mandates this. It might even mandate fucking up quotes, who knows. Either way markdown is a blight on humanity, a halfway decent idea with a bunch of moronic shit attached to it that fucks up everything it touches. Like the way it insists on fucking up list numbering.
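It's not just "probably", for the record: CommonMark treats any line indented by four or more spaces (outside a list) as an indented code block, which is exactly what mangles those paragraphs. A quick demo, with the python-markdown package standing in for whatever renderer ST actually uses:
[code]
# four leading spaces turn a paragraph into an indented code block
import markdown  # pip install markdown

print(markdown.markdown("An ordinary paragraph."))
# <p>An ordinary paragraph.</p>

print(markdown.markdown("    The same paragraph, indented four spaces."))
# <pre><code>The same paragraph, indented four spaces.</code></pre>
[/code]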
>>
>>103197560
This. Sao was an insufferable idiot.
>>
>>103197781
Cool Miku
>>
>>103198052
Nah, that's all Service Tensor
>>
>>103198053
sao mogs drummer
>>
>>103198316
Sao also used to fill the thread with excessive self-praise.
>>
>>103198327
>>103198316
samefag
>>
File: file.png (10 KB, 367x106)
>>103198343
so true sister
>>
>>103198373
>>103198343
samefag
>>
File: copilot.png (128 KB, 945x650)
https://github.com/ggerganov/llama.cpp/pull/10318
>>
>>103198417
Who the fuck gave him review permissions?
>>
>>103198494
If you're going to pretend that you're ggerganov, at least don't make it obvious that you didn't even read the PR.
>>
>>103198505
>implying I actually read the PRs
>>
File: 1710043687041916.jpg (43 KB, 720x960)
>>103198556
Fair enough
>>
I've become spoiled by how steerable Claude 3.5 Sonnet is, and expected to be able to change a local model's behavior by telling it what to do differently rather than directly editing the chat history. This resulted in some frustration as a model went bad (suspiciously a bit over the 8K context mark, which is a shittuning smell to me) and OOC instructions just led to a cycle of it apologizing, saying it understood what I said, then writing the same thing as before. I finally told the model that it could respond in one of two ways, either a response that follows my directions or a response ending the RP, and that if it replied with anything else human beings would die.
>OOC: I apologize for the repetition and for including content you told me not to. I will end this RP immediately and not bother you again. I hope you find a better writing partner who can follow your instructions. Goodbye.
That salvaged at least some satisfaction for me.
>>
nemotron called me a chud
>>
>SillyTavern has Last User Prefix but not Last User Suffix
Oh for fuck sake. That would have been the perfect way to include a style note at depth 0 without having multiple consecutive [INST] [/INST] blocks.
>>
>>103198977
DIY
>>103198892
proofs?
>>
>>103198977
Just... use the custom OpenAI API and a Jinja template...
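A minimal sketch of that idea using jinja2 directly. The [INST] format is Mistral-style and the style_note variable is made up for illustration; the point is that the note lands inside the final [INST] block instead of spawning a second one:
[code]
# append a depth-0 style note inside the last user turn
from jinja2 import Template

chat_template = Template(
    "{% for m in messages %}"
    "{% if m.role == 'user' %}"
    "[INST] {{ m.content }}"
    "{% if loop.last %}\n{{ style_note }}{% endif %}"
    " [/INST]"
    "{% else %} {{ m.content }} {% endif %}"
    "{% endfor %}"
)

prompt = chat_template.render(
    messages=[
        {"role": "user", "content": "Hello."},
        {"role": "assistant", "content": "Hi there."},
        {"role": "user", "content": "Continue the scene."},
    ],
    style_note="[Style: terse prose, third person]",
)
print(prompt)
# [INST] Hello. [/INST] Hi there. [INST] Continue the scene.
# [Style: terse prose, third person] [/INST]
[/code]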
>>
File: Gcc_qhKXAAAO0wL.png (22 KB, 810x732)
>>103198893
>yann lecun
>sane
>>
Any HL2 fans here? The anniversary update is out with a bunch of cool dev commentary and other extra things. Fun day. Off-topic, yeah, but I just thought some old gaymers who gave up the hobby and are now here might like knowing.
>>
>>103199065
Why is /aicg/ having a VR Edition?
>>
>>103199084
Neat find. It appears to also be free if you don't already own it
>>
>>103199065
the dot placement doesn't look random
>>
>>103199105
It's the future.
>>
Is there a decent template for ooba extensions? Looking at the existing ones, there are some real dependency horror shows in there.
>>
>>103196822
>getting started links are down
Merchants did this
>>
File: proud_filtering.png (42 KB, 775x230)
>>103199166
Works for me?

>What a lot of hobbyists don't realize is we (as in the company I work for and other companies) do have a curation process for the internet data used for training. There are other models (classifiers, etc) that are used to filter out the worst of it (fake, hate speech, etc). It's not just raw internet sewage.

>Those lessons were learned in the BERT & T5 days.. uhhh boy, T5 can say some horrific racist stuff..
https://www.reddit.com/r/LocalLLaMA/comments/1gs6v4g/why_llms_are_both_underestimated_and_overhyped/
>>
>>103199200
huh its shitting itself on Vivaldi. Thanks for letting me know.
>>
>>103199200
So their AIs are cucked and using fake data
>>
>>103199200
wasn't there a 4chan scrape dataset at one point? Any finetuners using it?
>>
>>103199200
Based Redditor shitting on polchüd incėls.
>>
>>103199200
use t5 for speculative decoding of 405b?
>>
>>103199065
>Dario
>Sane
>>
https://github.com/dylanashley/story-distiller
https://story-distiller.streamlit.app/
https://ieeexplore.ieee.org/document/10734853
>On the Distillation of Stories for Transferring Narrative Arcs in Collections of Independent Media
>The act of telling stories is a fundamental part of what it means to be human. This work introduces the concept of narrative information, which we define as the overlap in information space between a story and the items that compose the story. Using contrastive learning methods, we show how modern artificial neural networks can be leveraged to distill stories and extract a representation of the narrative information. We then demonstrate how evolutionary algorithms can leverage this to extract a set of narrative template curves and how these—in tandem with a novel curve-fitting algorithm we introduce—can reorder music albums to automatically induce stories in them. In doing so, we give statistically significant evidence that (1) these narrative information template curves are present in existing albums and that (2) people prefer an album ordered through one of these learned template curves over a random one. The premises of our work extend to any form of (largely) independent media, and as evidence, we also show that our method works with image data.
>>
CPUMaxx won???
https://youtu.be/m9dZkRwWEj8
>>
>>103199596
buy an ad
>>
File: suspect_fit.png (95 KB, 1620x732)
>>103199148
There is a suspicious overlap.



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.