Is there a way to do the sampling externally, not inside llama.cpp itself? I want to play around with stupid sampling strategies, but the code below results in very low generation speed.
import asyncio
import math

import httpx

client_main = httpx.AsyncClient()
client_unslop = httpx.AsyncClient()
last_response = None
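# NOTE: the original snippet calls softmax() without defining it; this is a
# minimal stand-in that turns a {token: logprob} dict into {token: probability},
# preserving input order (the greedy pick further down relies on the server
# returning top_logprobs sorted by logprob).
def softmax(logprobs):
    m = max(logprobs.values())
    exps = {tok: math.exp(lp - m) for tok, lp in logprobs.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}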
async def get_logits(prompt, client, num_logits=100, tokens=1, endpoint="http://localhost:8080/completion"):
    # Ask the llama.cpp server for one greedy token plus the top-n logprobs.
    data = {
        "prompt": prompt,
        "max_tokens": tokens,
        "temperature": 0,
        "n_probs": num_logits,
        "min_keep": num_logits,
    }
    response = await client.post(endpoint, json=data)
    response = response.json()
    global last_response
    last_response = response  # keep the raw response around for debugging
    text, probs = response["content"], response["completion_probabilities"]
    return text, probs
async def sample_sequence(prompt="Once upon a time", num_tokens=10, top_logits=100, endpoint="http://localhost:8080/completion"):
    # Re-query the server once per generated token and do the sampling here.
    for _ in range(num_tokens):
        _, probs = await get_logits(prompt, client_main, num_logits=top_logits, endpoint=endpoint)
        # completion_probabilities[0].top_logprobs holds the top-n candidates for the first generated token.
        probs = softmax({cand["token"]: cand["logprob"] for cand in probs[0]["top_logprobs"]})
        sampled = list(probs.keys())[0]  # greedy: first key is the highest-probability candidate
        prompt += sampled
        yield sampled
async def main():
    async for result in sample_sequence(prompt="Here is a proof that", endpoint="http://localhost:8080/completion", num_tokens=500):
        print(result, end="", flush=True)

asyncio.run(main())
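For context on where a custom strategy would plug in: the only line doing any "sampling" above is the greedy pick of the first key. A sketch of swapping it for something else, e.g. drawing from the returned distribution at an arbitrary temperature (sample_with_temperature and the temperature value here are my own illustration, not part of the original code):

import random

def sample_with_temperature(probs, temperature=1.5):
    # probs: {token: probability} as produced by softmax() above.
    # Re-weight and draw one token; temperature > 1 flattens the distribution.
    weights = [p ** (1.0 / temperature) for p in probs.values()]
    return random.choices(list(probs.keys()), weights=weights, k=1)[0]

# Inside sample_sequence(), replace
#     sampled = list(probs.keys())[0]
# with
#     sampled = sample_with_temperature(probs)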