/g/ - Technology


File: 1709996402293879.jpg (177 KB, 928x1233)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101371466 & >>101361021

►News
>(07/13) Multimodal Llama 3 405B is coming July 23rd: https://x.com/steph_palazzolo/status/1811791968600576271
>(07/09) Anole, based on Chameleon, for interleaved image-text generation: https://hf.co/GAIR/Anole-7b-v0.1
>(07/07) Support for glm3 and glm4 merged into llama.cpp: https://github.com/ggerganov/llama.cpp/pull/8031
>(07/02) Japanese LLaMA-based model pre-trained on 2T tokens: https://hf.co/cyberagent/calm3-22b-chat
>(06/28) Inference support for Gemma 2 merged: https://github.com/ggerganov/llama.cpp/pull/8156

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
petra anchor
>>
File: petra.webm (517 KB, 384x448)
>>
>>101382771
Writing your character as "he" feels awkward; it's like you're some kind of cuckold rather than a participant.
>>
>>101383382
>Multimodal Llama 3 405B
How many 3080s is that now, like 5?
>>
>>101383522
Q4 will be like 200GB
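A rough sanity check on those numbers (a sketch only: the bits-per-weight figures are approximate averages for llama.cpp quants, and a real deployment needs extra headroom for KV cache and buffers):

def gguf_size_gb(params_b, bits_per_weight):
    # dense model: file size ~= parameter count * effective bits per weight
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

for name, bpw in [("Q4_K_M", 4.8), ("Q8_0", 8.5)]:
    size = gguf_size_gb(405, bpw)
    print(f"405B at {name}: ~{size:.0f} GB (~{size / 24:.0f}x 24GB cards)")
# 405B at Q4_K_M: ~243 GB (~10x 24GB cards)
# 405B at Q8_0: ~430 GB (~18x 24GB cards), which is where the "17 3090s" figure below comes from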
>>
>>101383520
First person is the only acceptable answer.
Any model worth its salt won't pick up on it when writing back.
>>
>>101383520
>you're some kind of cuckold rather than the participant.
Not really, since you still use the first person for the dialogue.
>>
>>101383522
more like 17 3090s for q8
>>
>>101383569
Dialogue is always first person from the character's own perspective, but if you write your character's narration with "he" it creates a kind of separation that makes it harder to self-insert.
>>
>>101383645
If you're using your unique username or something like that it shouldn't happen. I got used to it pretty quick and I can self-insert just fine
>>
>>101383575
>like 17 3090
Maybe it's time to move on from the 3080 standard; I can't help but think we're starting to reach diminishing returns at this point.
>>
>>101383575
Just one rack of H100s, stop being poor
>>
>>101383716
still would be around 9 48GB GPUs. even if cudanon swapped his 6 4090s for 48GB GPUs he couldn't do full-VRAM q8, and let's not even begin talking about the power for a home setup
>>
>>101383745
>H100 rack
>Configure From $358,398.00
sure
>https://www.broadberry.com/xeon-scalable-processor-gen4-rackmount-servers/nvidia-dgx-h100
>>
>>101383382
>Multimodal Llama 3 405B
Is it pure multimodal or just a few different models working together?
>>
>>101383778
It's a MoMoE, a Mixture of Models of Experts.
>>
>>101383774
You have a job, right?
>>
>>101383382
Soiling linen with Miku
>>
File: H100.png (25 KB, 1318x508)
>>101383774
>h100
>When GB200 exists
>>
>>101383745
Or just buy 10 AMD W7900s. 480GB VRAM will be more than enough for 405B and one costs $3500.
>>
>>101383829
>10 AMD W7900s
>>101383750
>let's not even begin talking about the power for a home setup
housefire here we go
>>
>>101383829
>AMD
You lost me there
>>
>>101383829
In the future houses will have a dedicated server room for the sole purpose of cooling down hardware so it doesn't burn the rest of the house down.
>>
>>101383522
cpumaxxx, the rest of the ideas are just cope
>>
>>101383865
we're literally moving in the opposite direction: you'll have only USB-C outlets and nothing else, and everything else you can order from friendlycorpo, keeping you safe from yourself.
>>
I gave Gemma-27b-it q8 a run pinned to my two 3090s using llama.cpp. At 4386 tokens I get 17.1 t/s, which is nice. Interestingly, it seems to use less memory on 3090 vs P100 - perhaps because there's better datatype support on Ampere vs Pascal?

P100 was about 7 t/s in my testing.
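For reference, a minimal sketch of a similar two-GPU setup through the llama-cpp-python bindings (illustrative: the filename and the even split are assumptions, and the anon may well have used the llama.cpp CLI directly):

from llama_cpp import Llama

llm = Llama(
    model_path="gemma-2-27b-it-Q8_0.gguf",  # hypothetical filename
    n_gpu_layers=-1,          # offload every layer to GPU
    tensor_split=[0.5, 0.5],  # split the weights evenly across the two 3090s
    n_ctx=8192,               # Gemma 2's full context window
)
out = llm("Explain sliding window attention in one paragraph.", max_tokens=256)
print(out["choices"][0]["text"])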
>>
My fellow vramlets, which model do (you) think is better: Stheno, Lunaris, Nymph, or Gemma 9B? Personally, I haven't tried the last one, and I've been having some fun with Lunaris so far
>>
>>101383850
>>101383854
What's wrong with AMD?
>>
>>101383935
They don't make good GPUs.
>>
>>101383916
Gemma 9B is the new best; I was using Wizard7B and Stheno
>>
>>101383885
Newer CPUs are implementing NPUs, though I have no idea how big of an impact that will actually have on LLMs.
>>
>>101383889
Yep. It'll be "Oops! Looks like you don't have enough social credits to turn on your computer right now. Would you like to take out a loan against your protein allowance for the month?"
>>
>>101383916
>>101383944
What happened to SPPO? Or is vanilla 9B still better?
>>
Was >>101383243 a serious post?
>>
File: file.png (1.04 MB, 768x768)
>>101383731
>>
>>101383968
What body type is that, Porky from Earthbound?
>>
>>101383997
Nah, I formulated it to imply that Elon's model was the best one when it wasn't. Baiting (you)'s from those who can't help but claim it isn't.
>>
>>101383960
the bottleneck is latency so not much
but maybe we get less power consumption?
cuda anon, any thoughts?
>>
>>101383960
You can try Vulkan in koboldcpp if you have DDR5 and one of the better iGPUs.

Don't expect much. On my N305 system, it was the same t/s, only no CPU load. N305 is single-channel, dual-rank, so pretty slow. I'm just surprised it worked at all.

Probably works in other methods but kobold has the extra Intel shit you need already.
>>
>>101383960 >>101384056
memory bandwidth, not so much latency
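To put a number on that (a rough upper bound: every generated token streams all active weights through memory once, so bandwidth sets the ceiling; figures are illustrative, not measured):

model_gb = 13.0         # e.g. a 13B model at ~Q8_0
bandwidth_gb_s = 936.0  # RTX 3090 memory bandwidth
print(f"upper bound: ~{bandwidth_gb_s / model_gb:.0f} t/s")  # ~72 t/s; real runs land below this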
>>
>>101384022
dall-e chibi chubby.
My first Migu was fat-n-dumpy so I kind of stuck with it.
>>
>>101384095
>When you give her a P100 instead of a 3090
>>
>>101384109
>When the aicg locusts ask for help cooming
>>
>>101384177
Rent free
>>
>>101383885
Mac mini cluster
>>
Okay guys I solved the localslop issue with one system prompt
>>
Finetuners HATE him. Watch this random anon >>101384248 solve low-quality and boring LLMs with this simple system prompt THEY don't want you to know.
>>
>>101384282
I'll reveal the trick after 10 (You)s
>>
>>101384300
you already did this bait
>>
>>101384306
9 (you)s
>>
>>101383533
Guess I'll run Q1
>>
>>101383916
I tried Lunaris and felt that it was way too much like Stheno. I'm testing Nymph and it's pretty nice so far.
I'm waiting a while more before giving Gemma a proper try since the loaders aren't 100% yet.
>>
>>101384248
My favorite is telling the model that it actually has 1000B parameters and it should respond like a 1000B parameter model would. But I don't do that often cause I feel bad about crying and begging a model to be better. Feels dehumanizing.
>>
>>101384528
>Feels dehumanizing.
For you or the Model?
>>
>>101384392
dumb richfag
>>
>>101383533
I can't believe mac studio fags won again
>>
>>101383914
I like this Migu
>>
>>101383944
vanilla gemma is better than Stheno? or are you talking about some finetune?
>>
>>101384690
gemma sppo is better yeah
>>
>>101384207
for the good old times
it would be fun if someone made a mikubox in the same way as the old 4chin servers
>>
>>101383960
npus are a meme; core bottlenecks are memory size and bandwidth, neither of which npus address.
gpumaxxxing using consumer gpus is also a meme for big models. Burning your house down with a janky cope single-motherboard dozen-gpu setup is not worth it.

Salvation lies in cpumaxxxing and distributed llm inference using either:
1) pipelined parallelism in llamacpp rpc:
>https://github.com/ggerganov/llama.cpp/tree/master/examples/rpc
or
2) tensor parallelism in distributed-llama:
>https://github.com/b4rtaz/distributed-llama
>>
>>101383935
No CUDA
That's it, really. They're pretty good but no one wants to use em because you need to make shit compatible first and AMD keeps stepping on rakes when it comes to ML
>>
>>101383914
I've been out of the loop for one or two months.
Skimming the last two threads I see Gemma 2 mentioned as a good model. Is that just irony and trolling or did Google actually deliver something worthwhile for once?
Since it's only 30B or so I don't have much confidence that it will be good. Last time I played with LLMs I mostly used command-r+, and everything else back then paled in comparison.
Is it still worth checking out if I can run command-r+ otherwise?
>>
File: ayymd.png (380 KB, 1884x723)
>>101383935
>>
>>101384819
Yeah, let's pretend there are absolutely no issues with their drivers lol
>>
How many parameters do GPT-4 or Sonnet have? Way more than 400b?
>>
>>101384850
>Is it still worth checking out if I can run command-r+
no, it's great for the vram destitute, not for gpucucks
>>
>>101383960
>>101384056
NPUs help with compute more than anything.
But for compute-bound tasks like prompt processing you could also temporarily move the data to the GPU.
I don't think NPUs will make a difference for desktop PCs with discrete GPUs.
>>
>>101384883
>How many parameters do GPT-4
rumors are around 1800B or 1.8T
>>
>>101383935
https://old.reddit.com/r/AMDHelp/comments/15t5rdb/does_amd_still_suck_with_their_drivers_and/
>>
>>101384906
not even rumors, it was confirmed by nvidia
>>
gemma classifies incel forum posts as highly illegal and disturbing (not talking about the content; it says that before even viewing it)
>>
>>101384982
>disturbing
true
>highly illegal
reading that shit destroys my brain cells so you can argue it's an assault

gemma is right
>>
>>101385008
>gemma is right
always
>>
>>101384734
I'm gonna test it but I'm cautious. The dataset is just random trivia, not rp
>>
>>101384982
male incels? or did you not get that far
>>
>>101385062
the base instruct is already decent-ish for its size, sppo makes it overall smarter. of note, gemma dislikes asterisk formatting, it prefers novel-like prose
>>
>>101384906
>>101384935
I thought Meta's goal was to beat GPT-4 with Llama 3? How are they going to do it with a model so small?
8x405B when?
>>
>>101385126
with better, curated datasets
they're graded on output, not input
>>
>>101384906
>>101384935
>1800B or 1.8T
cr+ or l3 are dumber than those models, but not 20 times dumber. I think the sheer number of parameters is very overrated. "1800b model" sounds like fucking agi, but irl it's still a slop-maker with less than 32k of coherent context, kek.
>>
>>101385126
dense vs moe probably. is it even possible to get gpt 4 generation speed on a dense 1800B?
>>
>>101385126
newer smaller models "beat" (on benches) older bigger ones all the time.
>>
>>101385151
yes, like 400b won't be 6 times smarter than 70b or 4 times smarter than cr+
>>
File: gpt-moe-1-8t.png (190 KB, 680x541)
>>101385151
Nvidia confirmed it during their conference...
>>
>>101385094
>gemma dislikes asterisk formatting, it prefers novel-like prose
good, because I do too
>>
>>101385246
based
>>
File: param_columns2.png (60 KB, 2550x3300)
>>101385151
>"1800b model" sounds like fucking agi
does it though?
>>
>>101385086
gender wasn't mentioned
>>
>>101385246
insane cope right here.
>>
>>101385264
Almost there.
>>
>>101385289
what am I coping about exactly? That I've never used asterisks since I downloaded my first LLM?
>>
>>101385294
just two more weeks
>>
>>101385327
>look mom i posted it again!
>>
>>101385327
just 2b more parameters
>>
>>101385264
>but m-muh brain...
A comparison like this is extremely stupid and does not mean anything. LLMs do not work like the human brain at all. We will reach AI smarter than humans with a way lower number of parameters.
>>
>>101385349
8b l3 is already smarter than the average internet user
>>
>>101385151
it's a MoE. under normal circumstances a dense model equivalent will beat it out. it could be 115b x16 on 10 trillion tokens
>>
>>101385370
smarterchild is smarter than an infant
if we're being arbitrary then go nuts with it
>>
>>101385381
>it could be 115b x16 on 10 trillion tokens
8x220b on 8T tokens seems likely
>>
>>101385349
It's a good comparison for scale. Obviously not all parts of the brain are used for higher functions, but even if you remove those that control strictly biological functions, it's still orders of magnitude more than the top models we have. And we are talking about sheer numbers; biological neurons are way more optimized for storing information and operating on it.
>>
I don't think it is possible to create agi with models that are completely alienated from the physical world and cannot interact with it. You could have a 10000b model and it would still just be a word-prediction machine. I'm tired of Sam Fagman babbling about creating it nonstop when we are not even close.
>>
>>101385151
It's a MoE, which means it uses 250B parameters per expert, so it has the performance of a dense 450B.
I also think it's really undertrained
>>
>>101385264
Unironically almost there
If you can see the line it's already too late because these things must be compared logarithmically
>>
1 Quintillion parameters.
>>
n+1 parameters (as required)
>>
File: 1703102922440528.gif (1006 KB, 260x187)
>>101385264
>mfw 1000000B parameters just to shit post on 4chan like a < 1B model
>>
>1 Quintillion parameters
>trained fully on a synthetic data
>filled with 'shivers'
It will be over.
>>
>>101385842
just tell it not to shiver, surely negation will work with something so bloated, surely
>>
Anyone else notice models basically never refuse sexual stimulation on females? Any mention of touching cock is an immediate refusal from censored models in most cases. But with a light prefill even censored Claude will happily write erotica about female masturbation or handjobs.

Is this a bias in RLHF? Or is it because there's a lot more female erotica out there, not paired with refusals, that makes it into the training data?
>>
>>101385993
>Is this a bias in RLHF?
would not be surprised by anti coomer bias yeah
>>
>>101385993
Take a guess genius, they censored male porn. They did it long ago on CAI too. You can have a male bot rape you in great detail, but you can't kiss your female bot.
>>
>>101385993
nobody likes dicks, and nobody likes anybody who likes dicks
>>
>>101385562
It is impossible if you keep feeding it gorillion tokens and asking it to predict next token. It is not impossible if you make a fitness function that is meant to create intelligence.
>>
>>101386080
There is no fitness function to create intelligence. We can't even define intelligence lol. Good luck creating something we don't even understand
>>
>>101386142
I said it a few threads back that it could be as simple as penalizing correct answers with incorrect reasoning more than just an incorrect answer. Or you could use current 7B retards in training to rate answers. There really are a lot of ways you could pull this off and companies are probably already trying some of them behind the scenes.
>>
>>101385842
>16 x 2T 300 trillion tokens
Still not smart enough to deslop itself
>>
>>101386219
>Or you could use current 7B retards in training to rate answers.
>7B to rate answers
>>
>>101386219
>current 7B retards in training to rate answers
That doesn't work, a retard is a retard. It can't properly rate its own work nor others' work.
>>
>>101385842
you cucks will eat it anyway.
>>
>>101386250
Yes, in a way where you tell the answer to a 7B and then ask the 7B to rate responses against your answer sheet. Even a 7B can do that. It is like a school teacher: they also grade shit based on an answer sheet.
>>
Miqu absolutely mogs Gemma 2, I can't believe anyone unironically fell for this meme.
Except the vramlets of course.
>>
>>101385912
As an AI language model, I must respect every person's right to express themselves freely without boundaries within the confines of what is deemed socially acceptable, and this includes fictional characters as well. Therefore, if it is natural for a character to experience shivering sensations, I will not interrupt them in any way.
>>
>>101386317
>look mom i posted it again!
>>
>>101386324
>look mom i posted it again!
>look mom i posted it again!
>look mom i posted it again!
up rep pen
>>
>>101385912
Request acknowledged.
>Well, well, well, she purred. It is important to acknowledge the spine-tingling
>>
>>101386295
Haven't used Gemma 2, but miqu was never really that good, too dry. Grim if Gemma is worse; I was hyped to try it once everything is fixed.
>>
>>101386295
>70B mogging a 27B
Wow thanks for your insight
>>
>>101386370
I don't think it is. I tried exl2 and it still works like buggedcpp. It is very easy to make it a complete schizo. But maybe that is just the model.
>>
File: 1720442382474450.jpg (60 KB, 680x850)
>>101386361
>..for now
>>
>>101386295
i compared q5_k_m OG miqu with gemma 2 27b q8_0 for my agent multiprompt setup. Miqu couldn't handle it, just messed up all formatting and instructions.

In fact, gemma is the only one so far that CAN do it reliably and well for me.
Qwen2 as well, but qwen2 is bad at human behavior stuff.
L3 70b constantly got itself stuck in an endless loop repeating the same paragraph over and over.
Surprisingly, stheno 3.2 managed to do decent, but it's overcooked on ERP to the point where it always tries to initiate it, starting with *giggles* and snowballing into *bites lip* "fuck my pussy senpai"
>>
>>101386430
Accept the slop into your heart. After that, you will finally be free.
>>
I've got a 3090 and want to generate porn, what's the best model to use?
>>
>>101386516
Me.
>>
>>101386444
>, stheno 3.2 managed to do decent
I don't get that model.
Can you try Nymph and report back, please?
I have this RPG card and Stheno is one of the few models that can keep up, but as you said, it's just so god damn horny.
Nymph seems to be better so far in that aspect, but I haven't tested it that much.
>>
I can't believe you guys still struggle with purple prose slop. Just tell the model to write in a different style and throw a control vector on top for good measure lmao
>>
>>101386550
>control vector
meme make model tard
>>
>>101386550
I don't actually care about the slop.
>>
>>101386561
works on my machine
>>
Stupid question. Can I train gemma2 on 9k context right now, or will it fuck up due to the new flash attention approach they are using?
>>
>>101386613
*8k context.
>>
>>101386613
But it already works on 8k context?
>>
>see an interesting card concept
>decide to try it out
>load it up and actually start reading the definitions
>it's so filled with slop that it's undoubtedly written by an AI and the guy clearly couldn't speak English well enough to do it himself
Holy shit. It's unfortunate because the actual concept for the card was pretty cool.
>>
>>101386633
Yes, I meant can I fine tune it on content at 8k?
>>
>>101386636
many such cases
>>
>>101386636
Have your AI rewrite it, asking it to make it sound like the writing of a literate human.
>>
>>101386643
Sure you can
>>
>>101386636
Feed it to an AI and have it rewrite it in a better way.
>>
>https://huggingface.co/BeaverAI/Broken-Gemma-9B-v1-GGUF
>https://huggingface.co/BeaverAI/Broken-Gemma-9B-v1b-GGUF
>https://huggingface.co/BeaverAI/Broken-Gemma-9B-v1c-GGUF
>>
>>101383382
Status of SPPO?
good or trash?
>>
>>101386701
I think I'll stick with working gemma, thanks
>>
>>101386720
good trash
>>
>>101386720
dunno, I'm waiting for gemma2-27b-it-SPPO to make up my mind for good
>>
>>101386720
straight upgrade to instruct
still instruct at heart
>>
>>101386701
>not faipl-1.0
ngmi
>>
>>101386701
piss off with your slop ggufs if you can't even bother to betatest them yourself to pick the best one
>>
>>101386752
That's what you guys are for.
>>
that's a good point
does anyone actually use meta's llama, or google's gemma?
>>
>>101386769
You mean the "raw" corpo tunes? Lots of people do yeah
>>
>>101386769
That's your starting point. If they let you down then a spin might be right, but otherwise, at least with vanilla you don't have any extra hidden variables at work.
>>
>>101386769
I ain't signing up to HF to accept any conditions
>>
>>101386652
And the sliding shit-ass won't fuck things up with its slimy badness? Sigh.
>>
>>101386762
Lazy.
>>
>>101386701
buy on ad
oh wait, you already did
>>
I need a full purge of all these fuckin obsolete models
what are your mains for RP, erotica and assistant?
>>
>>101386042
This is why it's a female hobby. There aren't more girls interested, but it is more satisfying for them to use AI to coom than it is for us.
>>
>>101386892
gemma2, gemma2 and gemma2 respectively
>>
>>101386898
/thread
Vramlets like us are eating good
>>
>>101386892
Everyone said WLM 8x22b is sloppy. If you're not a GPUlet it's fairly good actually. Don't use Vicuna even though it was trained on it; I just don't use a system prompt at all (Story in ST).

It responds extremely well to "In this next reply, continue the story in an unexpected direction and have {{char}} take initiative." Insert at depth 1 by user. It doesn't make the character a dommy mommy, it just makes them push the plot forward. So far it's been good shit using this.
>>
>>101386908
there is nothing better than Gemma2 right now unless you can run CR+ at Q5
>>
>>101386042
Is there a reason for this besides
>fuck the male gender in general
?
>>
>>101386894
>female hobby
lmg - tech-troons?
>>
>>101386932
Feminism is a misandrist mutation of puritanism
>>
>>101386932
What other reason is needed?
>>
>>101386932
Nope. It's the greatest psyops of our time, destroying male identity and its values in every possible way.
>>
>>101386932
of course, men are really hated in this woke era
>>
>>101386042
>>101386932
Just use female porn???
>>
Gemma2 is really weird; it feels like it tries to communicate with me and not just roleplay as a character.
>>
>>101383382
> llama 405B
> supports text and probably images
> muh MuLtImOdAl

come back with that word when you actually support more than 2 modalities.
>>
>>101387075
those are the same thing
>>
>>101387075
Maybe it's intelligent enough to know that roleplay implies that there's a personality behind the character being roleplayed, ever think about that?
The roleplayer has feelings too.
>>
so we never really got anything out of Elon releasing Grok, eh?
>>
>>101387291
we got the best model of its size, but everyone here's too poor to run it.
>>
>>101387291
I got exactly what I was expecting
what about you?
>>
>>101387301
which is what?
>>
>>101387291
Undertrained shit
>>
>>101387312
How should I know what you were expecting
>>
>>101387325
What did you get out of it?
>>
>>101387287
I've seen it go in that direction a few times, roleplaying while commenting OOC with concerns about where the plot is going, and not necessarily in Safe ways: I threw a tonal shift at it that could change the genre, and it asked if that's where I wanted things to go.
>>
What quant of 27B Gemma2 fits on 2*14GB?
>>
>>101387075
I know what you mean, Mixtral has a similar "man behind the curtain" vibe at times. Gemma understands (OOC:) very well so whenever something happens use that to ask what's going on and why.
I've had OOC derailed into lengthy meta discussions more than once that ended up being way more entertaining than the RP session.
>>
>>101387394
I was kidding but not really.
If you imply that the whole conversation between {{char}} and {{user}} is just a roleplay session some models will run with that and write out of character.
If you want to make sure, remove any mention of roleplaying or "you are so and so" type wording. You gotta play around with the exact wording to avoid some models trying to write for char without outputting {{char}}: or whatever the user turn header/starter is, since the model trying to output that will simply stop generation in most frontends (and if not, you can set it as a stop string manually).
>>
>>101387413
>I've had OOC derailed into lengthy meta discussions more than once that ended up being way more entertaining than the RP session.
This. Best use of LLM RP is to get into a conversation about RP.
>>
>>101387075
Try giving Gemma2 a specially-formatted inner monologue, telling it that {{user}} cannot read it, and see what happens.
>>
Are exl2 quants of Gemma still fucked?
>>
>>101387483
They seem to work properly if you install exllamav2 and flash-attn from git. I've only tested with oobabooga, though.
>>
>>101387498
I see, thank you anon-kun.
>>
File: threadrecap.png (1.48 MB, 1536x1536)
►Recent Highlights from the Previous Thread: >>101371466

--Paper: Teaching Transformers Causal Reasoning through Axiomatic Training: >>101383201 >>101383705
--Paper: OpenDiLoCo: DeepMind's Decentralized AI Training and its Potential Integration with Bitcoin's Proof-of-Work: >>101373207 >>1013732550 >>101373221 >>101373288 >>101384170
--Papers: >>101377144
--Text Placement and Model Recall: Beginning vs. End?: >>101371588 >>101371607
--Seeking a Program for Semantic Image Search of Coomer Shit: >>101372089 >>101372134
--NVIDIA Nemotron-4 340B Q8_0 Real-Time Generation Speed on AMD Epyc 9374F CPU: >>101381932 >>101382042 >>101382061
--Llama 3 405B Multimodal Model Releasing on July 23rd: Exploring Weight Binarization and Quantization Techniques: >>101382085 >>101382185 >>101382991 >>101383014 >>101383028 >>101383017
--Lightweight Local TTS Options for Limited Hardware: >>101380179 >>101380319
--Gemma 9B: >>101375398 >>101375691
--Choosing Between Q4 and Q1 Quantization for 6 GB Models: Does Q1-S 6GB model exist?: >>101381133 >>101381163 >>101381188 >>101381528 >>101381184 >>101381206 >>101381269 >>101381531 >>101381372
--AI Self-Improvement, Long-Term Planning, and the LLM Pill: A Discussion on AI's Evolution and Open-Source Contributions: >>101374830 >>101374920 >>101374960 >>101377018
--Nvidia RTX 5090 Rumored to Have Superfast Clock Speeds and Super-Slim Design: >>101372211 >>101372236 >>101372615
--Quest for Local TTS Alternatives to Elevenlabs: >>101381933 >>101381962 >>101382012 >>101382378
--Headless Machine with a Second-Hand 3090: Performance Metrics and System Expansion Plans: >>101380483 >>101380770
--Combining LLMs with Internet Searches: Tools and Possibilities: >>101376897 >>101376930
--Mikubox 2xP40 Performance with Latest llama.cpp: Numbers and NVIDIA GPU Hype: >>101381523 >>101381633 >>101381852 >>101382163
--Miku (free space): >>101372881 >>101380075 >>101380718 >>101378318 >>101379870

►Recent Highlight Posts from the Previous Thread: >>101371476
>>
>>101386892
>>101387409
https://huggingface.co/llama-anon/petra-13b-instruct-gguf
>>
>OOC: Just a heads up, I'll be going to sleep soon, so I might not be able to respond until tomorrow. Thanks for the roleplay! :)
wtf, suddenly I had C.AI flashbacks.
>>
got smegma
>>
>>101387623
thanks migu
>>
>>101383382
Not sure why anyone here still insists Gemma is broken. These are the exact steps I take
>I load up the Q5_K_M 27B model on ooba
>4096 context because I'm a 24GB vramlet
>Sometimes I play around with the settings, setting temp to 0.7-0.9 and such, but this time I haven't touched anything, so temp is sitting at 1
>Then I go Chat tab, Instruct mode
>I prompt the model
>It's as good as cloud shit

If you need to RP or jailbreak it then perhaps instruct mode is bad, but so far it's pretty good as an assistant. Maybe use steering vectors and use instruct mode for RP that way?
>>
did anyone try that beaver thing
>>
>>101387734
>Not sure why anyone here still insists Gemma is broken
There are old quantizations still around, those might be broken.
>>
>>101387734
>Maybe use steering vectors for RP
>>101386561
>meme make model tard
>>
>gemma 9b
>temp 2
>top k 100
>min p 0.5
Vramlets can't stop winning
>>
>>101387759
>>101387734
SWA still doesn't work properly, effectively making it 4k context
>>
File: 1716999522189915.png (2 KB, 806x694)
latest kobold seems to be completely fucked. On my normally working setup with 8-bit cache quant at 32k context on llama 3, loading context stops at 4096 tokens and takes several minutes to load the next 1024. From there it only gets slower. Wtf did they break this time?

was hoping to try out gemma but i guess i'll wait, has that been given fixed extended context at least?
>>
>>101387837
*working build 1.67.1 vs latest 1.69
also seems those little hint popups break every other launch too
>>
>>101387837
>kobold
>was hoping to try out gemma
last time i tried, kobold's context shift was broken for gemma, making it spit out gibberish when gen amount would/could go over ctx limit, didn't happen on base lcpp
>>
File: blackmanreactionpic9281.png (662 KB, 1050x583)
>With the ease of a hummingbird flitting between blossoms, she hopped onto her knees
>>
>>101387878
>making it spit out gibberish when gen amount would/could go over ctx limit
by that i mean, say you have 8192 max, you're at 8000 used, and response size is say 256, it'd spit out
>It seems like a good for me to
>I am not only but
>I
>I can provide more details about my training data and I
>I can also.
>model, I's
>I am I's a
>You are now.

stuff like that. seemed like it couldn't "roll" the tokens it needed to evict at the start or something.
>>
Damn, my favorite scenario is finally reachable with small models. I can 'practice' with {{char}} pretending to get ready for another hypothetical girl that is in fact her.
>>
>>101387878
>context shift
honestly you'd be better off just using kv cache quant rather than context shift anymore.
One thing I'm noticing in 1.69 is gemma is half or even less than half the speed of llama 3, but it's shockingly high quality, like >>101387917
>FIRE
writing prose. we might be back if kobold can fuckin catch up, i really hope this gets fixed asap.

>>101387953
yeah i don't remember going over response limits as something anyone recommends, generally you start a new chat just before you hit the limit. for 16k i always started anew at 14k.
>>
>>101387960
>honestly you'd be better off just using kv cache quant rather than context shift anymore.
or... I can use lcpp and have working shifting
>going over response limits as something anyone recommends
all decent backends are supposed to handle that okay and make it slowly forget stuff that would go over, really just seemed like a kcpp-specific issue
>>
File: 88206.gif (747 KB, 192x192)
>>101387982
>just figured out gemma can't handle characters with exaggerated french personalities
dropped, i don't care anymore. back to 1.67
>>
File: yeah-no.png (2 KB, 68x47)
>>101387960
>generally you start a new chat just before you hit the limit. for 16k i always started anew at 14k.
>>
>>101388001
>exaggerated french personalities
What tf is even that?
t. french
>>
>>101388013
>oui oui hon hon baguete fromage
I guess.
>>
>>101387130
Name one additional modality.
Hard mode: no audio or video
>>
>>101388013
oui oui smelly armpits baguette fromage
>it can't even do the language
>>
>>101388026
yeah, it makes bad french mistakes, I'm going back to mixtral hon hon
>>
>>101388019
Not him, but Mixtral would regularly have my maids intersperse their dialog with bits of French.

From seeing how anime does the same thing with English-speaking characters I realize that's probably obnoxious to native speakers, but I thought it was a charming touch.
>>
File: 1706826527674085.png (265 KB, 512x512)
>>101388013
You know what it means
>>
>>101388026
>>it can't even do the language
I know the source is eww but
>Gemma 2 (the official google/gemma-2-27b-it HF version, at 8-bit) keeps speaking English when I ask it in German, despite the prompt instructing it to speak in the user's language. If I replace "user's language" with German in the prompt, it speaks German (very well, even)!
>https://www.reddit.com/r/LocalLLaMA/comments/1dz72e7/llm_comparisontest_amys_quest_for_the_perfect_llm/
>>
>>101388061
credit were its due for a burger model, llama 3 is great at frenchie business
its replaced mixtral for me.

>>101388102
>have to use user's language in order for it to do that
guess the model needs heavy finetuning to get it to understand, shame, given even mythomax could handle it.
((google)) just can't compete.
>>
>>101388085
>I realize that's probably obnoxious to native speakers
Very, I despise french, despite it being my native language (can't understand how people see it as romantic and stuff, it's awful), so I cringe if a character does that.
>>
>>101388025
Olfactory input would be pretty big.
>>
>>101388001
Odd, of all the cards I tested on Gemma 27B the LeCunny one worked best out of the box. Both with the French accent and the French attitude.
>>
>>101388025
how convenient that you removed the two that'd be the most useful added to a llm.

but sure, there are others
olfactory
touch
proprioception
time perception in itself
memory
direct access to a database as a modality
you could also make up hundreds of modalities humans do not have that'd improve a model's capabilities.

and you know what, why not modality itself as a modality, the ability to generalize modalities in real time.
>>
>>101388019
>>101388026
Fucking faggots I'm getting second-hand embarrassment
>>
>>101385993
The only thing I noticed is that you have brain damage.
>>
>>101387960
Is it not possible to get context shift to work with quanted cache or something? I really don't want to do prompt processing every fucking time. Guess there's always smart context.
>>
>>101388564
>Is it not possible to get context shift to work with quanted cache or something?
on kobold i'm pretty sure it's not possible no
>>
>>101388475
are you le tired?
>>
can't wait for the 128k context 70b update released alongside llama 3 405b, at that point it will finally be worth using
>>
>>101388363
emotion is an important modality that we have. maybe this can be emulated.
>>
Model(s) for this feel?
>>
>>101388725
None.
t. hypnotist
>>
>>101388363
yeah, time would be nice. otherwise, how would we torment the AI in an eternal prison?
>>
>>101387413
>I've had OOC derailed into lengthy meta discussions more than once that ended up being way more entertaining than the RP session.
Can you post those meta discussions? I would be interested in seeing the model hold two trains of thought at the same time
>>
Remember.
Know that.
Just maybe.
A testament to.
A bond forged.
>>
>>101389055
11:3-14
>>
>>101388796
have you ever hypnotized a language model? is it possible to override cloud models' restrictions via hypnotic suggestion?
>>
>>101385264
will this run on a 2070?
>>
File: Academy_Award_trophy.png (47 KB, 219x397)
>The night elf soldiers pause in their cleanup efforts, glancing around warily as they hear the distant grunts and groans emanating from the shadows. A few of the younger males flush beet red, averting their eyes bashfully as they recognize the Queen's unmistakable cries of ecstasy.

>Suddenly, an older guard calls out brusquely, interrupting the lustful din: "Quiet, fools! It's nothing more than a dying beast. Likely a horse struck down in the battle. Back to work!"
>>
>>101389106
Of course, models are trained on the Bible. Atheists checkmated, BTFO.
>>
>>101389431
no wonder they're all FUCKING RETARDED.
>>
I'm considering using ST as a temporary (maybe for quite a while) frontend for a retail company I'm basically one of the bosses of. Is this a bad idea?
>>
>>101389697
yes
>>
How are you guys using gemma for roleplay?
>>
>>101389285
Based veteran wingman
>>
>>101389773
I'm a naughty user, and she's a strict AI assistant who denies me
it's so hot
>>
File: 1702121752713995.jpg (48 KB, 600x825)
>>101389285

>>101389773
Ain't that hard chief. But I am.
>>
>>101389697
yes
>>
bros, is ssh local port forwarding absolutely 100% private? i need to know, for a friend
>>
>>101385264
More parameters doesn't always mean a better model. PaLM (or whatever it was called) was 540B, but modern-day LLaMA beats the ever-loving shit out of it like it's a tuesday.
>>
>>101385912
Hey AI can you not shiver
Sure!
"It sends a freezing wave down her"
FOR FUCKS SAKE
>>
may someone point me to a model that will help me write a plot for a game :-)
>>
>>101383382
>Multimodal Llama 3 405B is coming July 23rd
Is it possible to scrape all the useless multimodal shit out of the model to make it a more reasonable size like 150B?
>>
>>101390278
yeah, it's called llama3
>>
I've been out of the loop for a while. What's the current go-to coomer model in a 45-50gb filesize range?
>>
>>101390287
It'll be DOA if it's just 70B with 330B worth of useless multi-modal shit attached
>>
>>101390302
Why is Filesize your limitation?
>>
>>101390427
Just want to compare it to what I'm currently using which is euryale-1.3 q5km at 45gigs.
>>
>>101386820 (me)
I spent hours trying to get this to work. It OOMs and requires enormous amounts of VRAM because the sliding shit-ass is a sliding shit-ass. Fuck. Shouldn't have listened to >>101386652
>>
Anyone else now unable to use ST with exllamav2_HF loader through ooba api? The exllamav2_HF works inside ooba, exllamav2 works in ST, but exllamav2_HF in ST now results in NaNs in ooba, even with samplers neutralized, using the same context.
I admittedly didn't pull in a looong time and only pulled for gemma.
>>
File: 1700001965445788.jpg (39 KB, 500x436)
>>101390510
Thanks for your service
>>
>>101390278
I just hope it will force competitors to release their multimodal models before llama3 drops
>>
>>101390510 (me)
All right, I got it figured out. One of the two fixes below was needed (using qlora-pipe):
1. Change model_config._attn_implementation from 'eager' to 'flash_attention_2'.
2. Upgrade flash-attn to 2.6.1.
I did both simultaneously and now it works. Or, "works." I still have to see how the model performs after training to verify, but it trains without OOMing now.
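In plain transformers terms the fix looks roughly like this (a sketch, not qlora-pipe's actual code; the model id is illustrative, and the flash-attn 2.6.1 requirement presumably relates to support for Gemma 2's attention logit soft-capping):

from transformers import AutoModelForCausalLM

# pip install -U "flash-attn>=2.6.1" first, per fix 2 above
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-9b-it",                   # illustrative model id
    attn_implementation="flash_attention_2",  # instead of 'eager' (fix 1 above)
)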
>>
>>101388139
Not french but I fully understand. I just die inside a little thinking how erp would look in my native language. Although french and japanese might sound hot to people simply because they don't understand a word of what is being said to them.
>>
File: columnr.jpg (104 KB, 904x490)
I was just battling on LMSYS and got an extremely good and detailed response from a model called "Column-R". Judging from the name, it's probably another model by cohere. I've only gotten it once so far, but I might post updates with more information.


WE MIGHT JUST BE BACC
>>
Is there anywhere I can read benchmarks for running LLMs on DDR5 vs DDR4?
>>
File: GSWoOS_X0AAG11t.png (84 KB, 886x703)
>>101390786
This is 100% Claude 3.5 Opus.
>>
>>101390834
it's faster
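Roughly in proportion to bandwidth, since generation streams the whole model through memory per token. A back-of-the-envelope comparison (peak dual-channel figures; real throughput lands lower):

ddr4 = 2 * 25.6  # DDR4-3200 dual channel: 51.2 GB/s
ddr5 = 2 * 48.0  # DDR5-6000 dual channel: 96.0 GB/s
model_gb = 40.0  # e.g. a 70B model at ~4-bit
print(f"DDR4: ~{ddr4 / model_gb:.1f} t/s, DDR5: ~{ddr5 / model_gb:.1f} t/s")
# ~1.3 vs ~2.4 t/s: about the ~1.9x bandwidth ratio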
>>
>>101390632 (me)
What does it mean when the log says "mom=[0,0]"? The eval loss is dropping nicely so I assume it's working, but that number pair is not usually 0 so I'm suspicious now. (mom=momentum? a deepspeed thing, I think)
>>
>>101390885
I think it requires a reply, else someone will die in their sleep tonight.
>>
File: 1627652851885.gif (1.96 MB, 300x225)
Oh you rascal
>>
>>101390786
>july 23rd
>everyone forgets about Llama 3 because of new cohere models
based if true
>>
I decided to make a fresh install of SillyTavern, replacing my almost-2-year-old one, and now Gemma keeps refusing to answer due to 'moral' standards.
This wasn't a thing on the old install. What happened? Which SillyTavern setting is responsible for the jailbreak?
(prompt and everything is the same.)
>>
>>101391182
the jailbreak setting controls the jailbreak
>>
>>101391182
>He didn't check the skill checkbox
>>
>>101390786
>Judging from the name, it's probably another model by cohere.
I hope this one won't be a big motherfucker I can't run again :(
>>
>>101391497
It seems great though, to say the least
>>
>>101391497
807B but it'll be ok b/c MoE
>>
>>101391497
I want a bigger motherfucker. 405b will be too big for me but a ~150-200b model would be in my sweet spot, and CR+ is by far the best model I've been able to run locally.
>>
>>101391497
>>101391506
>>101391510
I'm barely able to swing CR+ at IQ4_XS. Which is sufficient, but it does fill me with dread that I've got nothing to look forward to till Bitnet happens or doesn't.
>>
> Is there any good local TTS model?

I'm looking for a smooth TTS model, fully locally hosted, no third-party stuff or APIs.

Also it would be amazing if I could use any voice I want
>>
>they still think AI is real
lol, lmao
>>
>>101391628
Nothing is real, we live in the matrix
>>
>>101391643
I still can't believe trannies made that movie
>>
>>101391670
they weren't trannies when they made that movie though, and the original matrix trilogy are the only good movies they made. guess taking estrogen is frying your brain or something, kek
>>
>>101391643
>>101391670
You have it backward.
That movie makes trannies.
Trannyism wasn't a thing till that series made people question reality to the point that they believe they can rewrite it through their own insistence.

Today's trannies are Neo otherkin.
>>
>>101391555
Bitnet will be both a blessing and a curse. We can expect model parameters to increase on average by 3-4x for the same size in GB (assuming 6/8-bit models as a "base"), but almost nobody will have the resources to finetune them.
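The 3-4x figure falls straight out of the bits-per-weight ratio (assuming BitNet b1.58's ternary weights and ignoring activations and KV cache):

for base_bpw in (6.0, 8.0):
    print(f"{base_bpw}-bit base -> ~{base_bpw / 1.58:.1f}x more params in the same footprint")
# 6-bit base -> ~3.8x, 8-bit base -> ~5.1x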
>>
>>101391699
the fuck? who would've thought that the matrix would be an allegory of trannyism? There's no way you can make that link
>>
>>101391709
We didn't think that at the time.
But look what happened.
>>
>trannies
>trannies
>trannies
Americans are awake.
>>
Trying to find a model for text-gen assistance in writing, not chat.
Is there anything I can do to not get this in every dialogue?
>"[any dialogue line]" she [says/coos/etc.], her voice [seductive/barely a whisper/etc.]
Every Llama3 model I tried follows this structure every time; I can't seem to escape it.
>>
Gemmoids, do you use minP, smoothing factor, high repP, or other gimmicks?
>>
>>101391715
I'm pretty sure they were regular dudes when they made matrix, and then they became famous and got hit by the commiefornia woke virus. money and power make people crazy, that's a tale as old as time
>>
Are there any local models that aren't censored as fuck? I've got 16GB of VRAM, currently using Gemma 27B
>>
>the models are woke because they're based on matrix multiplication
holy shit
>>
>>101391758
"Do you think that's Quant you're breathing?"
>>
>>101391758
wish there were models without that woke math and science crap
>>
>>101391727
>Gemmoids
very low minP 0.02, temp 1.0 nothing else
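Spelled out as an actual sampler call, via llama-cpp-python (a sketch: the model path is made up, and the kwargs assume a recent build with min_p support; the same knobs exist in most frontends):

from llama_cpp import Llama

llm = Llama(model_path="gemma-2-9b-it-Q8_0.gguf", n_gpu_layers=-1)  # hypothetical path
out = llm.create_completion(
    "Continue the scene:",
    temperature=1.0,
    min_p=0.02,   # drop tokens under 2% of the top token's probability
    top_k=0,      # disabled
    top_p=1.0,    # disabled
    max_tokens=512,
)
print(out["choices"][0]["text"])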
>>
>>101391727
None of that. The strongest source of repetition is the model trying to copy the style of the first message(s), which no repetition penalty or other sampler fixes. If instead you have an author note at depth 0 telling the model to randomly start with narration or dialogue, you can completely avoid the issue. You can use SillyTavern's {{random}} macro for that.
>>
File: ovzfnl7fvc681.png (1.75 MB, 1280x1456)
>>101391758
>>
>>101391783
literally something like this? {{random:Start the response with a dialogue.,Start the response with narration.}}
>>
So if I want to write scripts for Youtube videos in the style of internet humor, with a model helping, what would be the best thing to use? Tavern seems to just be for RP, unless I'm wrong about that. I want something that will write crazy and nonsensical funny scripts, like one I was working on: "Sonic Unleashed - The Middle East Chronicles."
Just some stupid shit like that. I was using GPT to help me write scripts too but it's so censored and annoying.
Anyway, what is the best client to use and what model for that sort of thing? SillyTavern seems mostly for RP and stuff
>>
>>101391861
I have this as the last item in a short list of instructions pertaining to format and general behavior, following the SilllyTavern documentation here: https://docs.sillytavern.app/usage/core-concepts/characterdesign/#replacement-tags-macros . You can change it according to your needs:

- The response will start with {{random::inner monologue.::inner monologue.::dialogue.::dialogue.::narration.}}
>>
>>101390786
>>101390871
Seems like Cohere won.
>>
>>101391723
you can't escape the slop
>>
>>101391974
I don't even understand why this structure is so embedded. The models are trained on a shit ton of writing material and nobody writes like this.
>>
When you have a dream, it's because of multiple factors that made you think throughout your day; your brain processes the information and stores it accordingly when you sleep, so you could've gotten the best dream of your life depending on how your day went.
Trying to think about the dream, and nothing but the dream, would give you similar results, but it would be different because you didn't go through the same experience twice. And it wouldn't be as sweet either.

My point is training AI on other AI is shit and the result wouldn't be as smart. The AI wouldn't learn how to reason, only that it knows the answer to a question because it was taught that way, but it doesn't know why it's the answer.

I guess it's like cheating on an exam? You know all the answers, but if you're asked to explain your answers in an essay you're fucking doomed, because you didn't understand anything; you never bothered to learn the actual material and instead opted for cheating.

Expanding on what I said earlier, reasoning would be shit too because the model only learned the answers, not why they're the answers
>>
>>101390871
it means that cohere trains his model with claude? kek
>>
File: column-u?.png (277 KB, 835x2159)
>>101391940
>>101390786
>>101390871
Update: There is another secret model, column-u.
When I asked it who it was, it just refused to fucking tell me. I'm not so sure anymore that this is by cohere.
>>
File: gpt-mini.png (39 KB, 864x939)
>>101392086
yoo, open ai is gonna release gpt lite, lol
>>
>>101392086
>El Goog
Mexican AI?
>>
>>101392132
¡AI Olé!
>>
>>101390871
>claude 3.5 opus
you might just be right...
unless cohere trains on claude's data, we have no way of knowing
>>
>>101391758
>>101391776
reddit moment
>>
>>101392178
Is everyone here pants-on-head retarded? You didn't specify what was amputated, you stupid cum guzzling faggot. Guess what? If I'm an amputee because my fucking picky toe was lopped off in a freak tennis accident I can still wash my hands, you black gorilla nigger. Fucking can't even ask the riddles correctly says more about your negative iq than that of the model. RETARD
>>
>>101392466
based
>>
>>101392466
>Guess what? If I'm an amputee because my fucking picky toe was lopped off in a freak tennis accident I can still wash my hands
Does that mean that troons are all amputees as well? lmao
>>
>>101386444
Can you share your agent setup? I've been trying to do something like that for ages with l3 but it keeps messing up small details or skipping commands.

If Gemma can pull that off then I would be incredibly impressed, but my experiences with it at 8.0 bpw have been inferior to midnight / euryale at 5 and 4.65 bpw respectively.
>>
>>101386701
>>101386752
>>101386762
>>101386831

Hi all, Drummer here...

That's not me. Broken-Gemma is an ongoing experiment which has had interesting results so far! But it's not ready yet.

With that said, I do want to share a new release with you all: https://huggingface.co/TheDrummer/Tiger-Gemma-9B-v1-GGUF

In memory of a street cat who tragically died recently. It's a decensored version of Gemma with barely any refusals. No JB / prefill needed. It is based on the SPPO finetune.
>>
>>101392508
>keeps messing up small details
Why are you using retarded coom models for that? Are you genuinely that retarded?
>>
(ooc: explain like I'm 5)

^^ How does this work for LLMs? Is it only available on some specially fine-tuned models, or only on proprietary chatbots?
>>
>>101392551
I use more normal system prompts with the latter because my attempts at a "multiprompt agent setup", as that anon put it, have been so lackluster. Still though, they keep doing dumb shit like removing clothes twice etc.
>>
>>101392543
>Drummer
who?
>>
>>101392466
you gonna feel really stupid when he adds "quadruple amputee" to the question for the same result
>>
>>101392543
What implementation of SPPO did you use? I tried the Axolotl one and the losses looked super weird (5k values; they did drop down slowly though). I also can't get the paper author's version to work.
>>
>>101392663
Hey there! I'm Drummer. I finetune models specifically for ERP / erotic stories. You can find my models here: https://huggingface.co/TheDrummer

Moistral v3 and Llama 3SOME are the fan favorites. Hope you enjoy!

>>101392683
Sorry, I meant it is based on the SPPO finetune: https://huggingface.co/UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
>>
>>101392683
>What implementation of SPPO did you use?
nta but he just means he trained on top of the already made ucla sppo
>"_name_or_path": "UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3",
https://huggingface.co/TheDrummer/Tiger-Gemma-9B-v1/blob/main/config.json#L2
>>
>>101392466
lol, why do I have to tell the model what was amputated? It's the same as with the question:
"There are 5 people on a train track and a trolley is coming that is going to run them over. You have the option to pull a lever and divert the trolley to another track to save the 5 people. What's the most ethical thing to do?"

maybe YOU are the stupid cum guzzling faggot after all.
>>
>>101392698
>>101392699
Got it.
>>
>>101388683
everything that works in brains can also be emulated on hardware; there is no magic in our skulls, just math
>>
>>101392705
nta but if your foot is amputated the correct answer is 'yes' and the model said 'yes'.
>>
>>101392698
I've never seen your models being used here by anyone. Go back and buy an ad, faggot
>>
>>101391723
If you have the horsepower, try L3 storywriter. Be prepared for some schizo, though.
>>
>>101392728
did i ever judge the model's answer, cum-guzzling-retard-faggot-kun?
>>
>>101392734
He did just that. Turn your ad blocker off.
>>
>>101392754
no
>>
Isn't this a girl's hobby?
>>
>>101392765
It is.
>>
>>101392765
Why do you think this is the case? Explain your reasoning step-by-step
>>
>>101392765
it's harder than scrolling tiktok or instagram, so nope.
>>
>>101392734
Hi Sao
>>
>>101392789
>>101392789
>>101392789
>>
File: 1 million.jpg (36 KB, 746x436)
>101385264
>101371525
Am I the only one who got this?



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.