/v/ - Video Games


Thread archived.
You cannot reply anymore.




File: 185672834638432.png (28 KB, 512x512)
28 KB PNG
Have they added new features like image, audio or video generation yet?
>>
These guys literally offer a $996 subscription option. Yes, effectively $1k per month. What fucking shamelessness lmao.
>>
>>741887470
Is there no way of running it locally with Chinese AI models?
>>
>>741888027
>locally running
Unless you have a GPU rack and memory that brings your collective VRAM and RAM well into triple digits, no. Running most Chinese models locally is out of the question for even someone with a very good gaming rig. You could run stuff like Wayfarer (12B), but models of that size are quite dumb.

But you can definitely have them served to you for cheaper via various online model hosting services. Uncensored too. With SillyTavern or some other frontend, you could wrangle it into a CYOA format without much trouble or effort.

AIDungeon's offering is an absolute scam. For example, its $50 tier limits Deepseek v3.2 to 16k token context. To give you an idea how ridiculous that is, you'd struggle to rack up $25 with 32k context via the official API even with spamming requests for several hours each day thanks to insanely cheap caching.
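For anyone who wants to sanity-check the API cost claim themselves, here is a minimal sketch of the arithmetic. Every price in it is a made-up placeholder, not a real DeepSeek rate, and the usage numbers (requests per day, cache hit ratio) are assumptions too; plug in whatever the provider's pricing page currently says.

```python
def request_cost(context_tokens, output_tokens, cache_hit_ratio,
                 price_in_miss, price_in_hit, price_out):
    """Cost in dollars of one request. All prices are per 1M tokens.

    cache_hit_ratio is the fraction of the input context that the
    provider bills at the cheaper cached rate (assumed, not measured).
    """
    hit_tokens = context_tokens * cache_hit_ratio
    miss_tokens = context_tokens - hit_tokens
    total = (hit_tokens * price_in_hit
             + miss_tokens * price_in_miss
             + output_tokens * price_out)
    return total / 1_000_000


def monthly_cost(requests_per_day, days=30, **request_kwargs):
    """Cost of sending the same kind of request every day for a month."""
    return requests_per_day * days * request_cost(**request_kwargs)


# PLACEHOLDER numbers for illustration only, not real pricing.
example = monthly_cost(
    requests_per_day=300,
    context_tokens=32_000,
    output_tokens=500,
    cache_hit_ratio=0.9,   # long roleplay chats mostly resend the same prefix
    price_in_miss=0.30,    # hypothetical $ per 1M uncached input tokens
    price_in_hit=0.03,     # hypothetical $ per 1M cached input tokens
    price_out=0.50,        # hypothetical $ per 1M output tokens
)
print(f"estimated monthly cost: ${example:.2f}")
```

The point is only that cached input tokens dominate a long chat, so the cached rate matters far more than the headline price.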
>>
>>741888027
the original release of ai dungeon ran locally on gpt2
>>
>>741891746
a good chatbot runs on a single 1070ti. you really really don't need those large general models capable of science and programming to just do some chat and roleplay. just set up a RAG for it to keep its context in line with notes and keep itself from going schizo.
>>
>>741891746 (Me)
> you'd struggle to rack up $25 with 32k context via the official API even with spamming requests for several hours each day thanks to insanely cheap caching.
And mind you, that applies to the *newer* (and considerably better) Deepseek v4 pro, which that same $50 tier limits to "1000 tokens per credit," as if you're prompting Claude's Opus. Just scummy all around. Nick Walton is absolutely counting on his users never looking up the market API pricing to see just how much they're being fleeced.
>>
This is the best thing to come out of AI Dungeon, period
>>
>>741891746
>16k token context
What does that actually mean in terms of what you can do with it or how much text you can generate
>>
>>741886018
nigger who the fuck cares about those mormons anymore? everyone either runs their ai locally or pays for an api key like deepsneed or claude directly so they can use it with whatever frontend like sillytavern, ai dungeon only had worth in the past because they were offering openai for VERY cheap, and then they lost that deal. I wont pretend those early days were great, I was one of the dudes paying for dragon, but yeah they are completely worthless today
>>
>>741893058
it just means how far back in the chat history the model can remember, i.e. its context. i forget exactly how much 16k is, but it's a sizable amount
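Roughly what a frontend does with a context limit, as a minimal sketch: it keeps the newest messages that fit in the budget and silently drops the rest. The token estimate here is a crude word-count heuristic (an assumption; real frontends use the model's actual tokenizer), and `trim_history` is a hypothetical helper name.

```python
def estimate_tokens(text: str) -> int:
    """Crude estimate: about 4/3 tokens per whitespace-separated word."""
    return max(1, round(len(text.split()) * 4 / 3))


def trim_history(messages: list[str], budget: int) -> list[str]:
    """Keep the most recent messages whose estimated tokens fit the budget."""
    kept = []
    used = 0
    # Walk backwards from the newest message and stop at the first one
    # that would push the total over the budget.
    for msg in reversed(messages):
        cost = estimate_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


chat = ["You enter the cave.", "A goblin blocks the path.", "You draw your sword."]
print(trim_history(chat, budget=12))
```

Anything that falls off the front of that list is simply gone as far as the model is concerned, which is why a small context feels like amnesia in long stories.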
>>
>>741893058
It's a tiny bit more than what you can run on an RTX 3060 shitbox. Big boys play with Gemini and its staggering ONE MILLION token context.
>>
>>741893194
Will Gemini let me write cringey lesbian romancecore and harem stories?
>>
>>741893160
doesn't the official deepseek v4 API have something like 1M (one million) context size?
>>
>>741886018
buy an ad and/or kys
>>
>>741893448
It will do anything unless it's non-consensual, then you have to word it nicely and it will still let you do it.
>>
File: D33XGhwWAAAiGwe.jpg (20 KB, 511x296)
20 KB JPG
>>741892661
No, I absolutely require an entire GB200 cluster melting down trying to serve my extremely convoluted +100k token fetish prompt reroll spam because getting any tiny detail slightly wrong would break my immersion. You wouldn't understand.
>>
The worst thing about this AI chatting 'hobby' as a whole is that we still don't have some proper blobber game out of it.
>AI Roguelite
Even more primitive than AI Dungeon.
>>
>>741886018
What's the point of using this over SillyTavern with some other model, either from OpenRouter or local?
>>
>>741893517
i forget how much things have evolved, but there's a point with a massive context size where it stops being, like, actual easily accessible memory. sure, you can pull things out if you specifically mention keywords from like a hundred conversations ago, but chances are it won't organically remember shit
that's why I stick to ~50k to save tokens and money, especially cause I'm a claude user
>>
File: 1774605447569150.jpg (26 KB, 375x411)
26 KB JPG
>>741886018
>2011+15
>AI Dungeon
Just lurk one of the AI roleplay generals for a few hours and you'll figure out something better
>>
File: china won.png (39 KB, 772x765)
39 KB PNG
>>741893517
Yes. It is also extremely cheap and will write anything, including NSFL.
>>
File: swapmode.png (790 KB, 1215x3390)
790 KB PNG
>>741886018
just install sillytavern instead of this goofy mormon shit
>>
>>741892661
Sorry, it's literally impossible for me, but I'm happy and envious that your standards are that low. More intelligence = more natural conversation, understanding of fetishes, etc etc. It's super obvious why it helps
>>
>>741886018
just use this
https://perchance.org/ai-rpg
>>
>>741894457
perchance is dogwater for poor kids and thirdworlders
>>
>>741886018
GTA 5 modders actually made a good implementation of AI tech:
https://youtu.be/3_Eng-F4akI
>>
>>741894132
It's true that most models' effective context size is smaller than their stated context size: past a certain point, output quality degrades to the point of uselessness. However, it's also true that the effective context size keeps increasing as new models get released. You can definitely push Deepseek v4 quite a bit longer than you could Deepseek v3. I tend to not push it too far myself, but anecdotally I've seen anons say they've gotten to over 200k context of writing smut before it started shitting itself.
>>
>>741894643
i love NPCs that talk to me like i'm their manager
>>
>>741888027
I run gemma-4-26B-A4B-it-qat-UD-Q4_K_XL on a 5070 ti with 100k context window through Koboldcpp. It works perfectly fine for this, and generates at like 60-90 tokens per second.
Try Marinara Engine if you want a simplified setup where you just load it up and go, and you're looking for more of an RPG experience with a battle system and inventory.
>>
>>741894280
try a model with a narrower training set. the problem that you are experiencing is the problem that was written about in the tinystories paper. a small model trained on a narrow domain dataset that still reaches language fluency is equal to and sometimes even more capable than a large broadly trained model at a target domain. it's all about optimising the target knowledge to parameter ratio. more people should read the tinystories paper.
>>
>>741895079
local is just ass bro, you can't magic it into being good
maybe in a few years when hardware changes but there's just no way to make it feasible on a hobbyist level current-day
>>
>>741894608
so is AI dungeon lol
>>
>>741895079
Mate if you want to have extremely basic roleplays where you sex someone up in 5000 words or less that's completely fine. I'm sure any dumbass model is fine for that. But I'm doing extremely long stories with lore and character relationships etc etc out the wazoo and only an intelligent model can handle that without the need for extreme tard wrangling. I think the issue is that we're not talking about the same use cases.
>>
>>741893549
sounds like a cuck behavior, no thanks
>>
>>741895079
there's no "narrower" domain for creative writing, it encompasses everything. if all you do is generic fantasy adventures in generic fantasy land, you might see success with a small model, but for someone like me who likes really "out there" fiction it doesn't work in the slightest.
>>
>>741895826
bro extremely long stories is what the RAG is for. unless you're doing something like having the llm compare two editions of entire encyclopedias and then point out what was missing from the older edition then RAG is enough and you don't actually need ultra long context. long context is just a huge processing bloat with every query. let the RAG periodically index your conversation history and you are good.
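A minimal sketch of the idea in the post above: index old conversation chunks, then pull only the relevant ones back into the prompt alongside the recent chat. This toy version scores chunks with bag-of-words cosine similarity, which is a stand-in for the embedding model and vector store a real RAG setup would use; `HistoryIndex` and `build_prompt` are hypothetical names.

```python
import math
import re
from collections import Counter


def vectorize(text: str) -> Counter:
    """Lowercased bag-of-words vector (toy replacement for an embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class HistoryIndex:
    """Stores conversation chunks and retrieves the most similar ones."""

    def __init__(self):
        self.chunks = []  # list of (text, vector) pairs

    def add(self, text: str) -> None:
        self.chunks.append((text, vectorize(text)))

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        q = vectorize(query)
        ranked = sorted(self.chunks, key=lambda c: cosine(q, c[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(index: HistoryIndex, recent: list[str], user_input: str, k: int = 2) -> str:
    """Short prompt: retrieved notes + recent turns + the new user message."""
    notes = index.retrieve(user_input, k)
    return "\n".join(["[Relevant notes]", *notes, "[Recent chat]", *recent, "[User]", user_input])


index = HistoryIndex()
index.add("Mira the elf owns a silver dagger.")
index.add("The tavern in Oakvale burned down.")
index.add("The king demands taxes from the northern villages.")
print(build_prompt(index, ["You return to camp."], "Where is Mira's dagger?", k=1))
```

The prompt stays small no matter how long the story gets, at the cost of only "remembering" whatever the retrieval step manages to match.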
>>
>>741896683
>"out there" fiction
bro just look up urban fantasy
add 10k urban fantasy stories to your dataset instead of 10k fantasyland stories, done
>>
>>741886018
Dudes.

I want you to give it to me straight: am I retarded? I've been using CharacterAI for a few years, right? Well it's beyond dogshit now, and about 2 months ago I switched to using the literal Google Gemini AI app for roleplay.

My mind has been blown. It is virtually 2x better than CharacterAI in every major aspect. I honestly am not sure if there's any other AI models available that are better than Gemini at roleplaying, I am in fucking heaven.

Are LLMs really going to only get better over time? FUCK fanfiction, man! We've got AI now, we're in the goddamn future!
>>
>>741896683
creative writing does not need to know every language on the planet and technical specs about frigidaire and other junk that's in all those large general datasets. it actually gets in the way. like holy shit, have you never had shit like chatgpt making a character pull out a cellphone in a medieval fantasy roleplay? that shit happened regularly. had to constantly tard wrangle to keep it on setting. narrow models do not have this issue.
>>
What model do you recommend for running locally if I only have 6GB of VRAM?
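A rough way to answer this kind of question yourself, as a back-of-the-envelope sketch: weight memory is about parameter count times bits per weight divided by 8. The `overhead_gb` figure for KV cache and runtime buffers is an assumed placeholder, and real quantization formats usually spend somewhat more than their nominal bit width, so treat the result as a lower bound.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         vram_gb: float, overhead_gb: float = 1.5) -> bool:
    """True if weights plus an assumed fixed overhead fit in VRAM.

    overhead_gb is a hypothetical allowance for context (KV cache) and
    buffers; it grows with context length in practice.
    """
    return weight_memory_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb


for size in (4, 8, 12):
    need = weight_memory_gb(size, 4)
    verdict = "fits" if fits(size, 4, vram_gb=6) else "does not fit"
    print(f"{size}B at 4-bit: ~{need:.1f} GB of weights, {verdict} in 6 GB")
```

Anything that does not fit can still run with layers offloaded to system RAM, just much slower.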
>>
>>741898159
look into gemma or kimi
>>
>>741898159
something like Phi with conversation history and setting materials put into RAG
>>
>>741897665
switch to sillytavern, you're still playing with baby toys
>>
I want to use NIM but I'm too much of a pussy to do anything risky since my phone number is linked to that shit
>>
>>741897665
Gemma4 31B is widely considered to be one of the best medium sized models to run locally right now.
I switched from GLM and was impressed by its roleplaying capabilities. Gemma very rarely says something that's illogical in the situation.
>>
>>741886018
There is way better shit than AI Dungeon now. Just browse /LMG/ on /g/
>>
>>741898960
>going to /g/ for llm advice
that board is the only toilet jeets know how to use
>>
I checked out AI Dungeon. What a joke.
They have 2k-4k context size in their free tier. Even their $15/month tier is only 4k context size for most models. If you want the same 32k context size that you can get on a 5070 Ti with a decent model, that's $100/month.
>>
>>741897682
right, narrow models have the opposite issue where you try to do something outside their little box and they just shit the bed.
>>
>>741886018
I wish someone would set up a locally run version of infinite worlds. Silly Tavern just doesn’t seem capable of that.
>>
>>741899125
jeets do not touch local llms. they api call to corporate service llms and then shill for the cloud services model.
>>
>>741899515
expand the dataset then
skill issue
>>
>>741899531
Explain Mutahar
>>
>>741899671
>expand the dataset
>encounter another issue
>expand the dataset again
>encounter another issue
>repeat ??? times
>end up at a bloated cloud model
hmm
>>
>>741899697
Mutahar is a brown white man
>>
>>741899745
You aren't white, Muhammad.
Fuck off back to brownsville
>>
File: 1635888463044.gif (2.85 MB, 200x234)
2.85 MB GIF
How about you first get the text generation to not be the hallucinatory, alzheimer, no object permanence bullshit that AI in general is, before you start adding other shit on top of it?
>>
>>741893058
A good rule to remember is that a token = 0.75 words, or a word consists of 1.33 tokens. So 10,000 tokens = a text consisting of 7500 words. It won't be exactly that, but it'll be close to it 99% of the time.
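The rule of thumb above as a quick sketch, so you can convert any context size yourself. The 0.75 ratio is the approximation from the post for English text; actual counts depend on the tokenizer and the language.

```python
def tokens_to_words(tokens: int) -> int:
    """Approximate English word count: 1 token ~ 0.75 words."""
    return round(tokens * 3 / 4)


def words_to_tokens(words: int) -> int:
    """Approximate token count: 1 word ~ 1.33 tokens."""
    return round(words * 4 / 3)


for budget in (4_000, 16_000, 32_000, 100_000):
    print(f"{budget:>7} tokens ~ {tokens_to_words(budget):>6} words")
```

So a 16k context is roughly 12,000 words, and that budget has to hold the system prompt, character notes, and chat history together, not just the story text.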
>>
>>741894256
Don't mention Taiwan though
>>
>>741903198
All of those come from having bigger model size and context length
>>
People still use AI Dungeon? Isn't it a thing that got left behind in like 2022?? Wtf people still use it? It was so ass back then
>>
what is this, 2019?
>>
>>741904201
AID was only relevant from December 2019 to December 2020 when they introduced those shards.
>>
>>741903198
RAG was one of the tools developed to literally prevent this
>>
>>741905485
RAG is a meme technology that never works right.
>>
>>741903198
long context is one of the contributing factors that develops into hallucinations. it becomes a needle in a haystack problem.
>>
>>741905808
skill issue. every modern model setup has a RAG on the back end. Agentic stuff is all a modified RAG.


