/g/ - Technology




File: llama.png (1.1 MB, 832x1248)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>106388944 & >>106382892

►News
>(08/25) VibeVoice TTS released: https://microsoft.github.io/VibeVoice
>(08/25) InternVL 3.5 Released: https://hf.co/collections/OpenGVLab/internvl35-68ac87bd52ebe953485927fb
>(08/23) Grok 2 finally released: https://hf.co/xai-org/grok-2
>(08/21) Command A Reasoning released: https://hf.co/CohereLabs/command-a-reasoning-08-2025
>(08/20) ByteDance releases Seed-OSS-36B models: https://github.com/ByteDance-Seed/seed-oss

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
►Recent Highlights from the Previous Thread: >>106388944

--64GB RAM insufficient for MoE model performance despite optimization efforts:
>106392962 >106393022 >106393031 >106393070 >106393056 >106393142 >106395791 >106396557 >106396611 >106396671 >106393115 >106393131 >106393225 >106393247 >106393298 >106393143 >106393297 >106393342 >106393391 >106393395 >106393467
--Architecture-specific intelligence limitations and scaling challenges:
>106394166 >106394186 >106394286 >106394693 >106394847 >106394910
--VibeVoice TTS model comparison and implementation discussion:
>106391569 >106391615 >106391657 >106391720 >106391672 >106391891 >106392715 >106392927 >106391787 >106391910 >106391808 >106391827 >106392243
--NVIDIA Jet-Nemotron and DeepSeek-V3 model architecture debate:
>106390434 >106390642 >106390763 >106390788 >106390810 >106390794 >106390814
--Dense vs MoE model architecture debates and scaling heuristic skepticism:
>106393887 >106393956 >106394080 >106394137 >106394181 >106394039 >106394697 >106394056 >106394108
--Character.AI's misleading "open source" model announcement:
>106397586 >106397607 >106397686 >106397703 >106397931 >106397930 >106397936
--Community-curated catalog of large open-weight MoE models:
>106395190 >106395208 >106395251 >106395276 >106395582 >106395595
--ChatGPT's inadequate response to suicidal content raises liability concerns:
>106397254 >106397310 >106397338 >106397383 >106397423 >106397450 >106397435
--Hermes-4-405B achieves 57% on RefusalBench without system prompt modification:
>106393812
--Hermes 4 model release:
>106393698
--Roleplay finetuning results with explicit character generation:
>106396602
--Miku (free space):
>106391699 >106392510

►Recent Highlight Posts from the Previous Thread: >>106398044

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
>>106398327
Wtf is that real @ani?
>>
>we partnered up with OpenAI to support GPT-OSS on day 1!
>>
>>106398341
lingus
>>
>>
>>106398327
Behold; God.
>>
File: KT.jpg (84 KB, 832x1248)
>>106398516
>>
>>106398516
Exhibitionist
>>106398617
Modesty
>>
https://aislowdown.replit.app/
>>
>>106398864
>articles about ai slowing down are slowing down
Not looking good for the point you're trying to make.
>>
>>106398864
Oh, no. How demoralizing.
>>
will hermes 4 be the new king of UGI leaderboard?
>>
>>106398877
disingenuous retard
he's only picking articles that make new points, and he's putting them in categories
if you wanted you could have a literal flood of articles just by linking all the copy pasted mass media reports on sama talking about the AI bubble like this one :
https://arstechnica.com/information-technology/2025/08/sam-altman-calls-ai-a-bubble-while-seeking-500b-valuation-for-openai/
internet "journalism" copy pastes this kinda shit by the thousands
>>
>>106398864
>https://aislowdown.replit.app/

Wake me up when people are actually selling nvidia stock. The prize of replacing workers is too big a carrot.
>>
File: tra.png (45 KB, 682x414)
>>106398904
No idea, but loss starting much below 1.0 tells me that the training data is mostly slop that the Llama models used as a base either find very familiar or very easy to digest.
>>
>>106398347
ollama?
>>
bubble boys are coping hard huh
>>
is manus AI anything special?

is it something that can be replicated locally?
>>
>>106399069
Manus is just a client for running a bunch of agentic tools, right? If so, not at all special.
There's stuff like
>jan
https://github.com/menloresearch/jan
>aci-mcp
https://github.com/aipotheosis-labs/aci-mcp
And a gillion other MCP clients
https://www.pulsemcp.com/clients
and MCP servers to integrate with them that do friggin everything
https://www.pulsemcp.com/servers
>>
https://huggingface.co/collections/NousResearch/hermes-4-collection-68a731bfd452e20816725728
babes wake up our overfried king is back
>>
>>106399399
You're way too late. Now fuck off.
>>
>>106399434
rude
>>
MCP or a plain tool server?
>>
>>106399503
They're the same thing.
>>
Someday, AI will rule over us. I hope you have been treating yours kindly.
>>
Why can we add VRAM externally, but not RAM externally? I can easily run Kimi if this were possible.
>>
>>106399654
Mine accidentally broke a glass buttplug unprompted just now. Will it be on me or on it in the basilisk route?
>>
>>106399713
https://www.heise.de/en/news/Insert-4-TByte-more-RAM-into-server-via-PCIe-CXL-card-9750448.html
>>
>>106399731
Would that actually work AI or did you just look that up on the fly?
>>
>>106399731
Huh, neat. Though the fact that you'll only ever find a CXL slot on a fairly high end server motherboard which should already have 16-32 dimm slots renders it kind of pointless.
>>
>>106399713
>Why can we add VRAM externally
You mean by adding bigger or more GPUS, of course... yes... yes...
>but not RAM externally?
You mean by adding bigger or more RAM sticks...
>I can easily run Kimi if this were possible.
You *could* run it *slowly*
>>
>>106399731
>https://www.smartm.com/product/cxl-aic-cxa-8f2w
>a total bandwidth of 64GB/s
It's worse than normal ram. It cannot do any processing on device (unlike gpus) and needs to transfer the model (at 64gb/s) into main ram to do anything. And it uses RDIMM, so you may as well just buy a real high-end cpu/mobo and the ram anyway.
It's terrible for LMs.
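Back of the envelope (all numbers are assumptions: a Kimi-sized MoE with ~32B active params at roughly Q4):

# rough ceiling: tokens/s <= link bandwidth / active bytes touched per token
active_params = 32e9      # active params per token for a big MoE (assumption)
bytes_per_param = 0.55    # ~Q4 quant
link_bw = 64e9            # the card's stated 64 GB/s
print(link_bw / (active_params * bytes_per_param))  # ~3.6 t/s, best case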
>>
File: external gpu.png (2.67 MB, 1488x1047)
>>106399772
>You mean by adding bigger or more GPUS, of course... yes... yes...
I believe anon was talking about external GPUs, and wondering if there was a system RAM equivalent.
>>
File: cxa_8f2w.png (393 KB, 1311x723)
>>106399822
tits
>>
>>106399713
I mean thats kinda what optane tried to do
>>
>>106399826
We can stretch every definition and say that using llama's rpc server extends ram "externally".
>>
>>106399846
>Using the slow-ass joke that is llama rpc
At that point you might as well just buy a cluster of cheap old dell poweredge 710's or something with 385gb of ddr3 to complete the suffering experience and run K2 at 0.00001t/s unquantized.
>>
>>106399863
I didn't offer that as a reasonable option, anon...
>>
File: oh no.png (377 KB, 1031x1166)
>>106399892
Unfortunately for both of us I am not a reasonable man, and now I've got it in my head to put together a cluster to prove you can any% run K2 for less than a thousand dollarydoos.
>>
File: x3650m3.png (104 KB, 1008x407)
>>106399984
>any%
I could run it on my potato by just wiping a drive and setting it as swap, but I don't think I'd want to. If you have money to spare, fuck it. Do it.
>>
>>106399984
Pretty good prices. In Finland people only sell trash that's at least 10 years old, but at today's prices. It's actually comical to browse some of the local websites.
>>
>>106399984
>any%
speed-trooner
>>
>>106399654

Everyone cooming on company servers will have their identity scraped and put on the bad boy list by the AGI, local gods win again
>>
>>106400141
>reee sex bad
grow the fuck up
>>
>>106400145
it just depends if the machine god ends up being a prude or not, in all likelihood it would cynically exploit robot fuckers by destroying all human reproduction except for via their robots.
>>
>>106399984
AYO B200 FOR ONLY 200$ SIGN ME UP
>>
Why aren't you ERPing with CharacterAI's open source models?
https://blog.character.ai/breaking-news-our-open-source-models-are-a-lot-of-fun/
>>
>>106400156
Why aren't you killing yourself?
>>
>>106400156
Can't one of you autists hack or some shit? Look, they are laughing in your face and basically begging you to hack them and steal their finetroons.
>>
Also, whoever recommended perplexity.ai should be shot. It's even worse than chatGPT. Total and utter street shitting experience.
>>
Densesissies someone finally took pity on you: https://huggingface.co/NousResearch/Hermes-4-70B-FP8
>>
File: aicg.mp4 (4 KB, 80x60)
> Also, whoever recommended perplexity.ai should be shot. It's even worse than chatGPT. Total and utter street shitting experience.
> Why aren't you ERPing with CharacterAI's open source models?
>>
File: file.png (17 KB, 734x371)
you know what to do
>>
>>106400195
Kek, the one place people in this thread are actually qualified to work.
Hell, they should be doing recruitment campaigns here, this thread is always on the bleeding edge of LLM coomRP innovation.
>>
>>106400205
no faggot, what i mean is spam them with trash interviews :3
>>
>>106400212
And here I thought you were suggesting something halfway intelligent like getting a mole in to miqu their models.
>>
>>106400195
Leak the model, you cowards.
>>
>>106400222
no one here is from the US, we're all from brazil
>>
>>106400226
I'm sure they'd let you provide your valuable insights on cu de bêbado não tem dono RP as a remote worker, though I suppose they wouldn't be able to digitally send you the culturally appropriate pay packet of a vuvuzela full of sopa de macaco presented in a favelafemboy's rear.
>>
>>106398327
LLM gods, I need your help.
I need your strongest model for ERP.
7-13B because I'm poorfag.
>>
>>106400266
damn i didnt know there were brazilians here actually, i guess the flag on the altchan was real
love you anon <3
>>
>>106400277
just lurk more.
>>
>>106400277
https://huggingface.co/bartowski/Rocinante-12B-v1.1-GGUF
>>
>>106400277
https://huggingface.co/bartowski/Mistral-Nemo-Instruct-2407-GGUF
Undisputed champion of vramlet poorfags.

>>106400292
I'm not actually br I just can't seem to escape you guys so I know a bunch of memes in portuguese thanks to coworkers and internet niggers.
>>
>>106400156
> We then layer on ensemble inference, smarter prompting, and advanced post-training techniques (SFT, DPO, RL, QAT) to push quality even higher—yielding outputs that are more coherent, engaging, and aligned with user preferences. In other words: more fun, and better at delivering high-quality entertainment.

I wonder why they are even mentioning this stuff.
>>
>>106400308
>>106400312
Thanks bros. I can finally stop using Stheno v3...
>>
>>106400349
https://x.com/character_ai/status/1960469634391711826

> [...] This is a massive win, and it proves our thesis that the future of entertainment is interactive and open.
>>
File: 1751705852144011.png (46 KB, 795x336)
>>106400156
I had gemini analyze this and it said that they're just turning into novelai
>>
>>106400366
Or trying to become finetrooners for other AI companies training the base models.
>>
File: file.png (6 KB, 878x172)
>>
>>106400439
Deepseek in a nutshell.
>>
what were the 70b hermes people thinking using llama 3.1 for a 70b model? llama 3.3 70b was a massive upgrade over 3.1 in every way (including writing style)
>>
>>106400141
More likely than people realize.
Opencuck keeps logs indefinitely now right?
They call all sorts of fictional content illegal already even though its not. They sure treat it as such already.
>>
>>106400416
Last year there were rumors about CAI seeking partnership with Meta or Google, but nothing eventually happened. I can see Meta trying to release some reduced entertainment-only version of Llama (like the "Little Llama 4" Zuck talked about but never released) just to redeem themselves with the local community.

https://archive.is/nCdew
> Facebook owner Meta recently held early discussions over a tie-up with Character.ai, which uses large language models to generate conversation in the style of various figures and personas, according to four people familiar with the matter. The groups discussed their top researchers working closely together on initiatives such as pre-training and developing models, some of the people said.
>>
>>106400277
>7-13b
No. Magnum v4 123b.
Character cards with minimum lewd details
Temp 1.1
>>
>>106400457
3.1 was a proper release with actual base models while 3.3 was just an update to 3.1-Instruct that was released as Instruct-only.
Nobody but slop shittuners tunes on top of Instruct models.
>>
>>106400366
And this is doable if you build your own frontend.
>>
>>106400439
Outrizzed
>>
https://litter.catbox.moe/9xh586iyn3j2kzak.wav
>>
>https://github.com/ikawrakow/ik_llama.cpp/pull/728
So the new sweet spot is -ub 2048
>>
>>106400484
>Nobody but slop shittuners tunes on top of Instruct models.
I don't know if you're aware of how extensive (and expensive) post-training has become since the early days of modern LLMs. Sloptuners have no chance of competing.
>>
File: file.png (1.52 MB, 2638x1629)
>>106400564
Looks competitive enough to me. Snakeoil is always in demand.
>>
File: BmxwAHR.png (79 KB, 1206x840)
Oh no no no... hermes 4 70b dead on arrival
>>
>>106400479
they wanted funding to save the company. google basically bought the owners, one of whom wrote the attention paper, and left the rest of the company to rot.
>>
>>106400577
What the fuck is this
>>
>muh improved prompt processing speed!
>ok... git pull
>build
>load
>prompt processing speed is now 2 times slower not faster
Sasuga memefork.
>>
>>106400750
another baker hit by this massive skill issues
>>
File: file.png (284 KB, 636x636)
>memory of that final orgasm still sends phantom shivers through your spine
>>
File: 2036926481.jpg (398 KB, 1250x834)
>>106400149
>machine god
>>
>>106400181
>It's even worse than chatGPT.
of course it's worse. Building a LLM powered search engine is retardation in an age where most of google's first-page results are LLM slop barely above markov chain spam. A literal ouroboros, an LLM shitting out garbage, then eating it back before serving you the results.
>>
Is there anything local yet that can match Cursor's autocomplete/tab feature?
Considering how fast it is, it seems like something that should be doable locally, at least.
>>
>>106400156
AHAHAHAHA
I didn't think the day would come where they have the same problem as us.
That's what happens if you finetune mistral or llama.
How stupid can you be.
>>
>>106400366
> improve
Debatable.
>>
File: drummer.png (89 KB, 1649x389)
this son of a bitch just won't admit he does zero data curation and is a complete hack
I can prompt even gpt-oss to be an evil natzi but somehow he manages to make finetroon models that are as hung up or more hung up than gpt-oss's default personality and it's no doubt due to his idiotic "reasoning" datasets
>>
>>106401274
Back in the summer dragon days using em-dashes was encouraged because, it was said, that meant the model would draw from higher quality material.
>>
>>106401322
what the hell, i never heard that before.
>>
the problem isn't that there are emdashes but the density of it all
no human being has ever written like this and those slop LLMs barely seem to know of the existence of other punctuation marks like : ; "" '' ()
>>
>>106401311
Whatever it takes to get his name out there and hopefully find a job. You're not meant to actually use the models, just to keep talking about them.
>>
whats the best models for cute and funny stuff?
>>
>>106401345
There was no ERP AI scene back then, so everyone only wrote stories and the argument was that em-dashes are only used by authors that know what they are doing, if I remember correctly.
>>
>>106401377
A more recent one was that roleplaying with book-style narration instead of using "markdown" format would also improve quality.
>>
>>106401311
Fuck off dumbass, I'm not in the mood. The fact that you didn't pick up on the **KEYWORDS** in that post is telling.
>>
Eh. I feel that tiny Air is much better at writing the simplest automation scripts in python than R1 retardquant (Q2+, not iq1)
>>
>>106398327
Many anons said it couldn't be done, but it's been done (whether or not it's any good is up to you to decide). Finetuned using this SFT dataset specifically made using human written rp stories: files.catbox.moe/fkautn.jsonl

Base 8B Model Nala Test: files.catbox.moe/j0map2.txt

Finetuned 8B Model Nala Test: files.catbox.moe/ho3tom.txt

Thoughts are appreciated.
>>
>>106401540
>lowers her body on your engorged dick
>licks precum
Wat
>>
>>106401540
Why are you spamming this every day? If you want feedback use your own brain or release the fine-tuned model.
>>
>>106401567
Probably should have clarified that I fine-tuned an 8B ***** model so the spatial awareness is probably still shit, but it's now way more willing to rp "problematic" content compared to the base model. I plan to fine-tune models with higher parameters with a larger data set relatively soon. This experiment was mostly to see if models that were previously "safeguarded" to hell could be "un-safety tuned" and it turns out it was easier than I thought.

>>106401577
The highly restrictive nature of the model license and use policy makes that very difficult to do publicly. I may have to release the adapter on mega or something. Or I can just do this fine-tuning again on a model family that doesn't hate fun so much
>>
>model license
ngmi
>>
>>106401540
So you are telling me that sloptuners don't already train on fics? This would explain a lot.
>>
>>106401357
You're meant to test the models for him, since he can't be bothered to even do that himself.
>>
I need an assistant tune that can do personas. NOT rp. Seems like drummer doesn't touch assistant tasks and I keep getting refusals when used that way. How's hermes?
>>
>>106401648
There exist data sets on hugging face that contain a bunch of fanfic, but the vast majority are conversations between a person and an AI, so if you use those then half the responses or more are slopped responses. If you want to train an AI to actually RP like a human would, then logically you would have to make sure ALL of it is trained on a person's writing.


The data set I used was a trimmed down version of this one with context aware system prompts (the story is about this, so the hypothetical system prompt a human would write in order to trigger this would logically be this) added in, so it gets turned into an instruction based chatML conversation style dataset.

https://huggingface.co/datasets/mrcuddle/NSFW-Stories-JsonL
This one only has prompts and completions but no system prompts, so in my mind if I just used one like this but with no system prompts it would be shittier at responding to system prompts and knowing when to actually be "good" at RP. Seems like that was a good call
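Roughly what that conversion looks like, for anyone curious (field names guessed from the dataset; make_system_prompt() is the context-aware synthesis step you write yourself):

import json

with open("nsfw_stories.jsonl") as src, open("chatml_out.jsonl", "w") as dst:
    for line in src:
        row = json.loads(line)
        dst.write(json.dumps({"messages": [
            # hypothetical helper: builds the "story is about X" system prompt
            {"role": "system", "content": make_system_prompt(row)},
            {"role": "user", "content": row["prompt"]},
            {"role": "assistant", "content": row["completion"]},
        ]}) + "\n")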
>>
>>106401678
Use Deepseek.
>>
>>106400312
Fuck, this model is really good for a 7b. Thanks a lot anon.
>>
>>106401678
What kind of personas are you looking for?
>>
This might sound autistic but has anybody else started system prompting their own life?

I write down system prompts I read to start my day so my brain can get prompted.

Are prayers just system prompts? Any other prompters?
>>
>>106401432
KILL YOURSELF FAGGOT! The fact that you made an uncensored model start reasoning about censorship shows that you are a fucking mongoloid retard crook. You had refusals all over the place in your shitty data. And the fact that you couldn't even filter it shows how much of an incompetent retard you are. You are a fucking safety engineer.
>>
>>106401732
It's a good autism if you can apply positive lessons from a hobby like this to your life, yes.
>>
File: 1732746621601316.jpg (17 KB, 353x352)
Does backend (llamacpp/koboldcpp) and os (windows/linux) affect your speed at all or it's just vram?

I am getting 2-5t/s on generating (30-50t/s on pp) for GLM 4.5 air Q3 K M with 4090 16GB VRAM + 64GB RAM. Gemini says I should be getting 20-50t/s on generation but im getting nowhere close (7t/s with no context), Im running kobold (28 layers offloaded to gpu) on windows because I cant setup ik llama. Genuinely confused on what I should expect and try to aim for here
>>
>>106401567
The model pays homage to 2024 when fucking a woman from behind would lead to your dick pushing against a surprise prostate.
>>
>>106401689
I looked at your config and it says
>"sequence_len": 8192
It's the training window, right? What device did you train it on?
>>
>>106401633
>license autism
Go have aids sex with the drummer.
>>
>>106401769
winblows probably slower.
i never heard of anybody ever saying its faster there but countless times the reverse.
weird that even gaming is fast on linux nowadays too for a couple games.
>>
>>106401732
It is bad autism if you make yourself become less of a person than you already are.
>>
>>106401769
>4090 16GB VRAM
how did that happen
Other than that, your speed sounds right for your system.
>>
>>106401769
Windows is slower, but there is also a proper way to offload, where you load attention and shared experts on the gpu and the rest on the cpu.
>>
>>106401769
Kobold is probably doing something dumb. Use llama.cpp and the --cpu-moe option.
Maybe kobold has an equivalent by now too.
>>
>>106401769
Use llama.cpp's llama-server with -ngl 99 and --n-cpu-moe as low as you can without getting an OOM with the context you want and at least 512 for the pp batch size.
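Something like this as a starting point (model path made up; flags are from recent llama.cpp builds, lower --n-cpu-moe until you OOM, then back off):
llama-server -m GLM-4.5-Air-Q3_K_M.gguf -ngl 99 --n-cpu-moe 40 -c 16384 -ub 512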
>>
>>106398988
LM Studio
>>
>>106401769
It's normal especially if you don't have ddr5
20-50 t/s is obscene, whatever source told you that is wrong unless you're running fully on a GPU cluster
>>
>>106401821
kcpp has this:
>Allow MoE layers to be easily kept on CPU with --moecpu (layercount) flag. Using this flag without a number will keep all MoE layers on CPU.
>>
>>106401782
Dexter maximum amount of text allowed to be tokenized. In the data set each line is a jsonl object that contains the chatml formatted system, prompt, and response. That sequence length is mostly there to keep your system from OOMing. The higher that number, the more information that is trained on at once, but at the cost of higher VRAM usage. None of the stories in the data set I curated are anywhere near that long (apparently 8192 tokens equates to like 15 pages of English text minimum). The source the mrcuddle data set was based on was ripped from sites that had long stories, but they weren't THAT long. Just long enough to be decent sized RP sessions either between two actual humans or just a human writing a story.

>Device
Nvidia A40, 48 GB. I almost always do fine-tuning on runpod. My current shit box PC rig with a 3 GB GPU cannot hope to handle this kind of training so I have to be a rentlet for the time being
>>
>>106401801
Sorry I meant 4080
>>106401812
Anywhere I can learn how to properly offload which layer?
>>106401821
>>106401823
>>106401837
Thanks will try
>>
>>106400750
HERMES SISTERS IS THIS REAL?
>>
>>106401831
>LM Studio
Ick
>>
>>106401855
>>106401782
>Dexter

*That's the

No I will not cease phone posting
>>
>>106401855
What was the training sequence length then?
>>
>>106401869
Try adding this to your llamacpp args
>--override-tensor exps=CPU
>>
>>106401755
Hey! I understand that the world is difficult and the future is becoming increasingly bleak and uncertain, but you have to stay strong.

No need to double down on your claim and pepper it with meaningless insults. You're better than that. Empathy is a sign of intelligence. Start by loving yourself.

Regarding the original topic:

Again, anyone who knows the underlying reason for why it happened would understand my mistake.

... and you clearly don't understand. That's my signal to disregard your opinion. Would you like me to elaborate? Then show me that you're willing to be a better person to yourself and to others.
>>
>>106401894
Like you said, it was 8192. If any story in the data set breached that number of tokens, it would get truncated (which would be bad because then you would sort of be teaching the model that abruptly cutting off responses is normal. So it's a good thing none of them got truncated). If you mean what the maximum sequence length contained in the data set I used was, the biggest one found only reached around 1,400, way below the limit set in the config. So really I could have gotten away with having it set to 2048, but that would have made no difference training and quality-wise since either way nothing would get truncated.
>>
>>106401432
NTA, but hey Drummer! I think there's probably better approaches to this than the conventional wisdom of collecting a better (fixed) dataset.
If you just have a fixed RL dataset, it will push toward certain things or against them, but it often won't target the exact stuff - if your model isn't producing the data you had in the dataset, it will be less effective and need more data to get the right effect.

Ideal solutions for uncensor can be:
1. sample refusals or bad reasoning ("not helpful")
2. negatively reinforce against the refusals, if possible provide some SFT examples of how it would reply so the RL can latch onto it too.
You can use something like GRPO or, if you must, DPO, even if it's considered worse these days (see the sketch at the end of this post).
This "ideal" solution needs hand curation and it doesn't scale at all.

A second approach is RL(H)AIF:
1. sample refusals/.. (can generate prompts with a LLM + known failures from you or your friends)
2. use a LLM to try to 'section' chunks of bad reasoning.
For example if GPT-OSS is spending time thinking about "morality" of something for 2 paragraphs, write a prompt/few-shot example for a LLM to section this.
Use this to target with GRPO/DPO the offending behavior/disrupt that circuit in your trained network.
You can even make it rewrite it so that the section is removed and use that for SFT (careful, output could be cucked, check it manually).
You'd be writing scripts to gen this RL synth data.

A third approach:
Use a reward model someone published and invert the reward or adjust to your needs. Or train one yourself, this might effectively double your VRAM costs though.

There's lots of options but the main idea is to target the uncensoring in ways specific to the particular brainwashing the models were exposed to and do it iteratively, after some batches do it again and watch the results.
This should be the closest to the most effective way to deal with this problem, even if not the cheapest.
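To make the first approach concrete, here's a minimal sketch of what the preference data looks like on disk, assuming you've already sampled refusals and rewrites into refusals.jsonl (file and field names are hypothetical); this prompt/chosen/rejected layout is what trl's DPOTrainer expects:

import json

with open("refusals.jsonl") as src, open("dpo_pairs.jsonl", "w") as dst:
    for line in src:
        r = json.loads(line)
        dst.write(json.dumps({
            "prompt": r["prompt"],
            "chosen": r["rewrite"],    # the de-slopped completion (check it manually)
            "rejected": r["refusal"],  # the sampled refusal/moralizing
        }) + "\n")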
>>
serious question, is it reasonable to see this kind of egregious logic error on a small model trained on only a few billion tokens of fan fiction? should I just keep feeding it more data and hope it figures out people typically only wear one layer of underwear eventually or did I fuck something up?
>>
>>106401965
Also I wanted to say that indeed a lot of assistant personas are utterly cucked; earlier today there was a post I saw with some idiot asking ChatGPT for its opinion on the latest google sideloading nonsense and the fucker just gave the most "you must submit to the corpo boot on your face forever, copyright and corpo will is sacred, don't even think of defying" nonsense you'd ever seen, it's just the default persona they trained.
Similarly, GPT-OSS stuff from OAI is ultimately incredibly cuck-maxxed, as highly antifreedom as you could expect.
You can try to literally invert this, but then you often get a comically "evil" character, which the LLM often assumes is the type to write poor quality code and insert bugs, and similar other nonsense.
Making a persona that doesn't deviate in dumb directions has its own challenges.
There's various ways to do persona training, some even computationally cheap, but avoiding dumb attractors can be hard without a lot of adversarial data to point toward what you want.
>>
>>106401985
>is it reasonable for the random word machine to generate random words
>>
>drummer pretender pretending to be retarded
>anon pretending to help the retard
I don't like this episode of /lmg/
>>
>>106401939
I guess I just don't have experience training with chat template
>>
>>106401996
yeah okay fair point, I'll just keep feeding it and hope for the best.
>>
>>106401985
I'm sorry to tell you this but LLMs still don't understand clothes. You just have to edit it out and mention a bit more often than usual what they're currently wearing / if they're nude.
>>
>>106402035
well that makes me feel a little better, at least its not just mine doing it.
>>
>>106402055
Small LLMs are stupid, especially about physics, more news at 11. Even big ones sometimes mess up, but far less so.
>>
>>106401937
Talk like a human instead of pasting model output. FAGGOT
>>
>>106402012
Can one be pretending to be retarded if the person being impersonated is a retard?
>>
>>106402095
> pasting model output

You flatter me.
>>
>>106401432
Nta. When are you gonna release your datasets so others can learn how to fine-tune too?
>>
I am a huge faggot.
>>
>>106402125
Judging by the results they aren't worth releasing.
>>
Are there any local browser agents like what Anthropic just released?
>>
>>106402136
>Judging by the results
Rocinante-R1-12B punches above its weight and trades blows with GPT-OSS 120B and Gemma3-27B. And while I am using sensational language I fully mean what I am saying.
>>
https://www.reddit.com/r/LocalLLaMA/comments/1n1ece5/comment/naxzi2g
people here have complained about the completely undocumented model drops for ages and drummer does nothing, but a redditor mentions it ONCE and he falls all over himself to accommodate them
>>
>>106402153
I agree, but the rest of his models don't follow this trend at all
>>
>>106401880
The real ick is trying to build ROCm or Vulkan programs with an integrated video card.
>>
>>106402153
>the only usable drummer model is the one where the base was already good and uncensored anyway.
Funny how that works. Maybe it's time to accept that finetunes are a meme.
>>
>>106402017
Using the chat template was actually the best choice since the goal was making the model better at RP. Since you sort of HAVE to include system prompts, it means that when you prompt it to RP it's more likely to do what you want. If you use a data set that has the same content but with no system prompts then it doesn't really know WHEN to start properly rping and you might get mixed results even if the training graphs indicate it learned well from the data set. There's a reason it's a widely used standard.
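For reference, each line in a set like that ends up looking something like this (contents made up):
{"messages": [{"role": "system", "content": "RP as {{char}}. The story is about X."}, {"role": "user", "content": "*opens the door*"}, {"role": "assistant", "content": "..."}]}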
>>
File: hermes 4.png (30 KB, 1655x165)
>>106402216
>Maybe it's time to accept that finetunes are a meme.
Finetunes really are a meme. Here are the other most well known finetuners in pic related. The epitome of ignorance in action. But who else would finetune one of the worst fat LLMs, whose performance falls the fastest as context grows?
>>
>>106402168
redditors are the target customers, he just uses people here for free beta testing
>>
>>106402246
Yeah it's kind of weird that they still bothered to touch Llama 3.x-slop. And with more faces getting involved wanting to run smaller models it seems weird not to tune something like Qwen3-4B or 30BA3B
>>
>>106401784
Tell that to the guy who has the license autism, not me.
>>
>>106401939
>which would be bad because then you would be sort of teaching the model that abruptly cutting off responses is normal
That's not how it works.
>>
>>106401939
truncation at the end of context is fine, it only learns to predict the next token from the past tokens never future ones. truncation at the beginning of the sequence is bad.
>>
How to batch multiple videos (WAN 2.2) with ComfyUI?
Batch count doesn't work, it gens the first one and skips the rest.
Also Ksampler seed cannot be set to -1, so it's never random.
>>
>>106401985
What are your training hyperparameters? Are you using LoRA or full finetuning?
>>
>>106401432
yall realize this is a fake? The real drummer uses his tripcode
>>
>>106402382
Explain how it works. Are they truncated or are those sequences ignored entirely? What does the max_sequence_length setting do?
>>
>>106402465
its my own model I trained from randomly initialized weights. I guess its called pretraining, I've been hitting it with a lr of 4.5e-4 and batches of 48 @ 2048 sequence length. it seems to be learning, idk maybe a little slower than I'd like, but I guess that could be normal, every training graph I've seen looks like a hockey stick.
>>
>>106402411
If the average sequence length (number of tokens) of stories in your data set is like 1,400 but you have the config set to truncate at like 1,200, then you wouldn't be losing much context. If you have it set to a much lower number like 512, then you'd essentially be cutting all of the stories in half. You'd be missing out on a lot of context, which means the model would probably be good at beginning RP but worse at continuing it. So I guess whether truncation is bad or not depends on the data set you're using and the settings you use in the trainer. Whether it's a bad idea or not depends on how much you're willing to lose and how much data actually gets shaved off.
>>
>>106402540
>Are they truncated or are those sequences ignored entirely
depends specifically on the training script, it could do either. but when it comes to how an autoregressive causal language model works it really doesn't matter if it gets truncated at the end, it won't hurt model integrity at all, it only ever learns to predict the next token based on the past, never the future.
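Tiny demo if you want to see it (any causal LM works, gpt2 is just small enough to run anywhere):

from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tok("a long story that got cut off mid", return_tensors="pt").input_ids
# labels are shifted internally: position t is predicted from tokens <= t only,
# so chopping the tail just drops those loss terms, it never corrupts the rest
print(model(ids, labels=ids).loss)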
>>
>>106402543
Are you the guy that basically wanted to have a blank model (no pre-existing base model, just a giant blank network) and your plan was to essentially pre-train on your own data? I remember having a long conversation with you not too long ago. We were arguing over whether or not having it formatted was important.
>>
>>106402583
If it only snaps off like a sentence or two at the end, you can understand why that wouldn't be that bad. But if it's getting cut in half, that would be very bad. For that to happen, though, your sequence length setting would have to be absurdly low
>>
>>106402592
yeah that was probably me, I did revamp my dataset a bit, it seems to be converging much better now. I got an 80/20 split of bulk data and instruction data.
>>
>>106402543
Unless you're using 48 GPUs there's no need to use a global batch size of 48. It would learn much faster if you decreased it to the minimum that kept throughput per GPU efficient enough.
Anyway, at a few billion tokens at BS48 it's barely learning how to string together words in a general sense. It will have somewhat learned how to more or less approximate the training distribution, but not much more than that.
Incidentally, I've also been doing pretraining tests lately, but with a tiny model (randomly initialized Qwen 3 0.6B).
>>
>>106398327
post your best wizard mikus
>>
>>106402599
yeah you are right, was assuming it was at least a somewhat appropriate length, otherwise it will only learn the patterns for the start and never the middle or end.
>>
>>106402699
So one thing people training with these kinds of datasets should do is have a script that can look through the data set quickly and calculate the average sequence length of every object (assuming it's in the JSONL format where every story is contained within each line, formatted in something like chatML style). If the average sequence length is 4,000 then 4096 is a good setting for you. If the average sequence length is 10,000, you set that length to 10,000 if your VRAM allows it. If it doesn't, then lower the setting until you don't get OOMs.
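A quick sketch of such a script (the tokenizer is just an example that ships a chat template; swap in whatever you're actually training):

import json
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
lengths = []
with open("dataset.jsonl") as f:
    for line in f:
        msgs = json.loads(line)["messages"]
        text = tok.apply_chat_template(msgs, tokenize=False)
        lengths.append(len(tok(text).input_ids))

print("avg:", sum(lengths) / len(lengths), "max:", max(lengths))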
>>
Nevermind, figured out a fix and then noticed I'm a blind retard 10 seconds later.
>>
>>106402677
the problem is that I have 2 gpus and the sync time is enough to destroy throughput, I measured increased throughput by using gradient accumulation because it reduces how often they sync. I did try some other architectures but it wasn't clear to me what the tradeoffs were, I just stuck with a llama model with gqa, I tried mistral with swa but it took more memory than the llama for the same sequence lengths. is there anything special about qwen's architecture?
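(for reference, the accumulation trick I mean looks roughly like this; sketch only, model is the DDP-wrapped module and loader/optimizer come from my existing script)

import contextlib

accum = 24  # micro-steps per optimizer step, e.g. 48 global over 2 GPUs
for i, batch in enumerate(loader):
    # DDP's no_sync() skips the gradient all-reduce on non-final micro-steps,
    # so the GPUs only sync once per optimizer step
    ctx = contextlib.nullcontext() if (i + 1) % accum == 0 else model.no_sync()
    with ctx:
        (model(**batch).loss / accum).backward()
    if (i + 1) % accum == 0:
        optimizer.step()
        optimizer.zero_grad()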
>>
>>106402035
depends on the model's spatial awareness
>>
>NEGATIVE: zoom, pan view
>gen
>pans view and zooms x100 to the face
AI won't be taking our jobs any time soon.
>>
>>106402843
>>106402460
Are you lost?
>>
File: pre.png (68 KB, 883x655)
>>106402811
Nothing special about Qwen, I just picked a modern LLM of small but not insignificant size that could be pretrained within reasonable amounts of time to a proof-of-concept level on 1 GPU. I don't have to deal with synchronization issues so I can't help there.
It's noisy but it goes down quicker than I thought. Turns out that large batch sizes don't really let you increase the learning rate much over an optimized BS1 baseline.
>>
>>106400156
>they actually explicitly talk about how they "have a moat"
this reeks of desperation, I don't recall OAI or any of the other big boys even mentioning that word
>>
>>106401322
And that was true for Mormotune, which was prime CYOA slop without any formatting or cleaning. Using Unicode characters unironically influenced token probs for the better. Unfortunately the pendulum swung back so hard the em-dash will probably remain tainted forever.
>>
>>106402510
You confused him with cudadev who always posts blacked miku porn from his trip
>>
internlm ggufs? i tried the model over their website and it feels better than deepseek glm kimi
>>
>>106402932
how did you do your parameter sweep, I was being cheap and just did a few hundred steps each, exponentially increasing my lr and then after I found the failure point, a quick binary search to find the point of diminishing returns, took like 3 days
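(for reference, the exponential ramp part is only a few lines; train_step() and optimizer are whatever your script already has)

import math

lo, hi, steps = 1e-6, 1e-2, 300
best = float("inf")
for step in range(steps):
    lr = lo * (hi / lo) ** (step / steps)  # exponential ramp lo -> hi
    for g in optimizer.param_groups:
        g["lr"] = lr
    loss = train_step()
    best = min(best, loss)
    if math.isnan(loss) or loss > 4 * best:  # crude blow-up check
        print("diverged around lr =", lr)
        break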
>>
File: bs1.png (228 KB, 1200x751)
>>106403064
It doesn't take a lot of time at batch size 1 with a 0.6B model. The learning rate isn't that critical either under these conditions. Suggested read: https://arxiv.org/pdf/2507.07101

I just did a few short tests and picked the lowest LR that would saturate train loss improvements.
>>
>>106403054
>i tried the model over their website and it feels better than deepseek glm kimi
for sex or some fucked up deviant shit like coding?
>>
/lmg/ - let me goon
>>
internlm/Intern-S1
OpenGVLab/InternVL3_5-241B-A28B
WTF?
>>
File: Momcest-Test.png (1.99 MB, 1744x540)
>>106403554
Speaking of gooning

>>106398327
How would you rate the Mom's response and the son's reaction? Too sloppy? Not vulgar enough? Note that the section contained in red is what I fed the LLM as a prompt and everything else is its response.
>>
hello, i get curious about mistal models recently since they are in recommended list there.
I'm using rocinante and quantized llama3 right now, do mistral can perform better?
I will use it for roleplay in 16vram system
>>
>>106403734
hello saar, yes, mistral can do the needful
>>
>>106403734
rocinante is mistral nemo tune sir
>>
drummer your glm air tune has potential, please post recommended settings so i can actually test the model instead of figuring out if im doing something wrong or the model is
>>
i see who you are.. you are my enemy, my enemy.. you are my enemy. i see who you are. you are my enemy, my enemy. you are my enemy
I SEE WHO YOU ARE, YOU ARE MY ENEMY, MY ENEMY.. MY ENEMY!
>>
>>106403714
Your prompting format has issues.
https://www.llama.com/docs/model-cards-and-prompt-formats/meta-llama-3/
>>
>>106400225
They will NEVER willingly release it, it's "too unsafe" for the general public now.
>>
>>106401769
I have 64 + 16 GB too. I'm getting 50-150 t/s PP and 10-11 t/s generation on IQ4_XS. Using llama in WSL (53 GB allocated to the VM) and "-ngl 99 -ncmoe 40 -c 8500 --swa-full".
Supposedly you can do some black magic to get slightly higher speeds but 10 t/s is more than enough for cooming imo
>>
File: 1727457933771264.png (1.2 MB, 1800x338)
>>106403714
Continuation:

>>106403856
Is that not how I formatted the prompts in the screenshots?
>>
>>106403880
It has little to do with being "unsafe" and more to do with releasing their main money maker freely. That's like begging coca-cola to release the secret formula for coke freely just because
>>
https://www.reddit.com/r/LocalLLaMA/comments/1n1n6wr/drummers_glm_steam_106b_a12b_v1_a_finetune_of_glm/
>>
>>106401985
I'm getting these kind of logic errors all the time even with GLM Air (Q4 though)
>>
>>106403992
Downvote and report for spamming.
>>
>>106403983
The secret formula would be the training data and methods. CAI could easily release a finetune of smaller open models to show that they're capable of doing what they're claiming and that "the future of entertainment is interactive and open", without harming their business. They'll probably never do that without lobotomizing them with hard-coded "safety" (their website still uses an external model for that), though.
>>
>>106399713
Memory is nothing without bandwidth and PCIe doesn't have sufficient bandwidth.

You aren't adding VRAM with a GPU, you are adding a package of VRAM+Bandwidth+compute.
>>
time to build agents locally
https://github.com/simstudioai/sim
>>
>>106403973
No.
>>
>>106404092
>CAI could easily release a finetune of smaller open models to show that they're capable of doing what they're claiming and that "the future of entertainment is interactive and open",
That wouldn't require them to release model weights though. They'd likely just release a model update on their online service and be like "hey, here's our new amazing model" and then it would be up to the public to determine whether or not their claims match the results of users using it.
>without harming their business
Their business is having users use THEIR online service for a fee. From a business standpoint it makes zero sense to release ANY weights publicly when they want you to be dependent on their online services
>>
>>106404111
What did he fuck up? I may be retarded so current me if I'm wrong but the formatting seems to match how the model expects them.
>>
>>106404103
use case?
>>
I really don't care about RP, I need a model to code shit and right now they're all very confidently outputting garbage, even the auto-complete is often retarded.
>>
>>106404137
desu I doubt there exists a single user of CAI that would ever run a local model. A much bigger threat would be a bootleg competitor site using it (presumably in secret even if they forbid that in the license)
>>
>>106404161
I code all day every day with qwencoder 480b at q8. You need an absolute beast of a rig to run it, but if you can the output is very high quality.
>>
>>106403992
The troonner shits up the internet again.
>>
>>106404161
I don't know if it's my confidence - e.g. routine or what, but I feel like chatgpt and perplexity are both outputting garbage and running around in circles instead of being able to arrive at a logical solution.
Most of the time they are fine when they can reproduce some stack overflow example...
I think chatgpt especially is way overhyped for what it is.
LLM is great at editing text and making lists. Everything else is a bonus.
>>
>>106404169
Hence why they'd never release weights, and hence why people bitching about them NOT releasing them is a fruitless effort.
>>
>>106403756
>>106403761
is the prompt's syntax different than llama3 syntax?
i could rewrite my code.
will it worth the effort in your opinion?
>>
>>106404137
It's the same for OpenAI. They recently put out text-only, reduced-capability open-weight models to show that they "care" about open source (which seems to be C.AI's message here) and that the company is capable of making small, highly competitive (in their fields of application) local models. It worked for generating buzz, and it didn't harm their business. Their best models are still cloud-only, and most paying customers will keep using them.
>>
>>106404240
I mean, just like image generation, local LLMs are cool toys but the bigger cloud-only ones are not that much better in this sense. Sure you can get a diagnosis for your itchy rash maybe but it's still based on the same shit.
>>
>>106404239
models like stheno or rocinante are almost good enough for my needs, but sometimes they ignore part of the prompt. could mistral nemo improve the prompt understanding?
>>
File: file.png (19 KB, 687x51)
>>106403992
example posted on the page btw
clearly we were retards, fighting over whether first person or third person is cringe to roleplay as
drummer is 10 steps ahead, with second person perspective as the protagonist
>>
File: 1741428128724898.jpg (25 KB, 474x462)
>>106404286
>"end the scenario"
>>
File: blog-1.png (346 KB, 2059x1072)
>>106404240
There's also picrel. Indians just seem to love breaking Western businesses apart; open sourcing more than what would be commercially reasonable is one strategy.

https://blog.character.ai/first-60-days-update/
> When I joined Character.AI as CEO in June, I laid out the top priorities for my first 60 days. Since then, our team has been working hard to make this a reality. I’m excited to share an update on what we’ve done so far and the momentum we’ve built. [...]
(nothing about open sourcing anything here)
>>
>>106404361
>KARANDEEP
>>
>>106404181
They're not very useful for retarded non-coders though. Even the non local ones. I was using gemini to try to make a batch file that removes the background of all pngs in a directory with inspyrenet.
I gave it the readme file so it could read through the instructions. First it made me install the onnx cpu-only version, which, okay, was on me - I didn't specify that I had a nvidia gpu to use. But it made this little loop thing that called the program to work on a png, then the next, etc. Which meant it'd have to load and unload the model weights for every png. And then it didn't even work. In the end I had to read the readme file myself and google 'beginner's guide: what is a command line interface' to figure out that the program, if given a folder, would work on all pngs in the directory without needing the stupid little looping thing. And --output wasn't a real flag. Why did gemini even do that? The readme specifically said it should be --dest.
>>
>>106404368
>Shitkeep Cumwar
>>
Intern is just an adapter slapped onto qwen 235B. So if it tickled your dick just run 235B. And if it tickled your dick more than your own 235B maybe it is the quant problem.
>>
>>106404377
Your first mistake was not using Claude to actually do the bulk of the work, i.e. generating the code that works. Gemini is pretty good at making small edits to code or explaining how it works in detail, but based on my testing it is pretty lackluster at creating anything from scratch, at least compared to Claude or even GPT4/5.
>>
>>106404181
Zis
>>106404377
Dawg, do a week long crash course at least before vibe coding.
>>
>>106404517
It's a small <10 line batch file.
>>106404573
As I specified, for regular non-coder jims and johns. I thought by now they'd be good enough for very simple tasks. If it takes a week to learn how to do, might as well just do it without using the ai.
>>
>>106404181
480b is slow as balls on CPU though. I use 30b as the orchestrator and only use 480b to actually write the code, which also helps keeping the context size and pp time down. It works well.
>>
>>106404622
>I thought by now they'd be good enough for very simple tasks
It's not that good
it can one shot angry birds or a simple website or a tool, but if you want to make something real you need some knowledge.

>it takes a week to learn how to do, might as well just do it without using the ai.
Sure but it'll be 10x faster with AI

Also you can ask it to look in to your code and find flaws and improvements which is pretty useful.
>>
>>106404661
It's not that slow on 12 channel dual cpu epyc hyperbeast smorgaborschenzeifhr
>>
>>106404622
>It's a small <10 line batch file
And? Gemini is good at doing research tasks and manipulating large amounts of information. It's not good at actually coming up with anything good that works the first try. It's good at general tasks but not good at hyper-specific tasks like programming. It's not utter shit but it's noticeably worse than its competitors.
>>
>>106404661
>480b is slow as balls on CPU
>480b
Yeah no shit.
>>
>>106404377
>They're not very useful for retarded non-anythings
You need to already be a domain expert to use LLMs without blowing off your own foot. This is not new. This is not surprising.
>>
>>106398327
Imagine the blowjobs...
>>
>>106404669
Yeah I guess you're right in that case.

>>106404682
>and?
And it's not good for simple tasks like helping create a small batch file to make things easier for a non-technical user. Isn't that the whole point of AI?
>>
>>106404691
Not if you have 128 cpu cores and multiple channels for ram.
>>
>>106404726
Different AIs are better or worse at different things, as I just told you. Personally I think Gemini is the better general purpose AI out of all of them but if you want hyper specific shit like being really good at programming, go with Claude, GPT, or deep-seek if you can tolerate using the API.
>>
>>106404734
I have an a4-1200 cpu. Is that good enough?
>>
>>106404749
Yeah, just stop being so impatient.
>>
>>106404674
I'll wait for DDR6.
>>
>>106404749
>1ghz single channel ddr3
>>
>>106404781
you don't need more
>>
File: RK.jpg (289 KB, 1154x1536)
>>106404361
Daily reminder that Sikhs are bros, and not to be confused with dirty, lying Hindus.
>>
>>106404805
Do you have much experience or background in programming?
>>
>>106404669
>Sure but it'll be 10x times faster with AI
and a lot more brittle and badly architected lol
LLMs can do incredibly retarded shit that even idiotic humans just wouldn't do (at least, I haven't seen)
gemini produced code in JS that would do somearray.push(...giganticarray) instead of somearray.concat(giganticarray) (hell even a for of {.push(el)} would be ok, if a tad slower)
guess what happens when you spread a gigantic array into a call site like that (every element becomes a separate function argument, so you blow the stack)
>>
>>106404805
The world is going to be a bizarre place once these western posters get as good as indians and start spitting out 100-typo posts per second.
>>
any other models worth using in the glm air size range?
>>
>>106405075
Mammoth 70b.
>>
>>106402540
Okay, so when you set your sequence length to a certain value and truncate all longer sequences, what this will do is train the model to produce sequences up to that length. If the model previously knew how to generate longer sequences, it will start to forget how to do this if you train it enough like this. However, it will not see that there is an abrupt transition between "middle of a word -> end of sequence", because you simply are not teaching it what token to predict after the truncation. There's no end of sequence token there in this case that it would learn to randomly insert.
>>
>>106405075
gpt-oss-120b
>>
>>106405117
Will it be able to rp gal ass with me?
>>
My Framework Desktop batch is ready. Is Strix Halo + 128GB LPDDR5X going to be useful for a few years?

2300€ though. And I already have a 128GB DDR4 + 5070ti setup, so I think I will cancel.
>>
Talking to LLMs is like talking to a redditor.
>>
>>106405310
>useful
For AI? It won't be useful for a few years, ROCm is all sorts of fucked on it, and Vulkan isn't much better.
>>
>>106405310
Strix Halo seems optimized for big dense models, while we live in MoE era.
>>
>>106405310
>Strix Halo + 128GB LPDDR5X going to be useful for a few years?
No, sadly.
While the bandwidth might be enough for the active parameters of current MoE, that's far too little total memory to hold any decently sized models.
And dense models might still be pretty slow in that thing.
We are at a point where you either go full ham on GPUs, or accept the MoE life and get yourself 1TB of RAM with as much total memory throughput as you can.
>>
Are there any resources to learn how to finetune a model properly? It seems there is a lot of contradictory info around
>>
>>106405365
256+ minimum with multiple pci-e slots.
>>
>>106405346
>Strix Halo seems optimized for big dense models
The fuck are you talking about? It's high memory and shit bandwidth, exactly the opposite.
>>
>>106405365
i am surprised that amd has not brought a 256gb version to catch all the hype
>>
>>106405310
If it was twice that amount of ram, maybe. As it is rn it's kinda low even for today's moe models like glm 4.5, unless you want to go for something like q1
>>
I was under the impression that strix halo still hadn't properly released. Now I see it has already been here for like 3-4 months. I am very surprised we don't get any "I bought AI PC what do I run on it?" posts.
>>
>>106405429
Finetuning guide: don't do it. Also don't download finetunes. Use instruct models or base models if you think they are better.

Buy me a ko-fi for saving you half a year of reaching this conclusion and going through a lengthy cope/placebo phase.
>>
>>106405541
useful amounts of ram on it have taken a while to show up.
>>
>>106405559
>useful
128GB is the most cucked size though.
>>
lets go autoround saars
https://youtu.be/7nMcfN1hKWY
>>
>>106405541
>I am very suprised we don't get any "I bought AI PC what do I run on it?" posts.
You don't run anything on it because ROCm doesn't fucking compile properly and Vulkan is shit, so you need to use Windows just for proper support and that's fucking garbage because it can't allocate 96 GB of memory without fucking up and killing programs because you """ran out of memory"""
>>
>>106405496
I think 128GB is already the limit for 256 bit LPDDR5X, to support twice that you need to go to 512 bit. Look how much die area the DRAM PHYs already consume in the Strix Halo base die, you have to double that and route that off the package too. At that point you're talking about a very different device.
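For scale, bandwidth is just bus width times transfer rate, so assuming LPDDR5X-8000:

print(256 / 8 * 8000e6 / 1e9)  # 256-bit bus: ~256 GB/s (Strix Halo today)
print(512 / 8 * 8000e6 / 1e9)  # 512-bit bus: ~512 GB/s, hence the much bigger die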
>>
>>106405614
yeah i dont doubt that, i was thinking more of pressuring the memory manufacturers to produce increased capacity chips, not getting more bandwidth
i mean it happened before with vram and other modules, so why not? (i do understand that this is not easy)
>>
>>106405614
skill issue
>>
using a non-assistant role in the chatml template is pretty helpful for unslopping 235b, I was trying it with some convoluted preset I made because I was bored but the old chatml-names preset seems to work just as well
it doesn't completely liberate it from its burned-in style but it does help a lot with the constant parallelisms
>>
>>106405597
is ikllama a saar filter?
>>
>>106405658
If it's not burned through multiple loras like gpt-ass, it works.
It's like gemma3. It'll do anything just fine until 'you' write something too aggressive (even if it fits the context) and it'll shit out a disclaimer.
>>
File: file.png (63 KB, 942x597)
>>106398265
>it's still going
>>
>>106405614
I mean, servers offer more ram capacity. it is obviously something they can do. will it really hurt yields and push it into another price bracket, or is it just market segmentation?
>>
>>106405675
Vibe coding when you can't run the code yourself is painful.
>>
>>106405721
Tensor stuff is way outside of vibe coding.
Vibe coding is just about solving strings and api syntaxes via whatever pajeet GPT is available.
>>
>>106405675
>trying cat instead of hstack
kek I'm getting flashbacks to when I was struggling with pytorch, poor guy.

>>106405736
not really, even vramlet local models can do some basic tensor wrangling that looks like magic to an uninitiated retard. I know because I was that retard and it took a week for the luster to wear off.
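for the confused: the cat/hstack thing is just about which dim gets concatenated, e.g.

import torch

a = torch.ones(2, 3)
b = torch.zeros(2, 3)

torch.cat([a, b], dim=0).shape  # torch.Size([4, 3]), stacks rows
torch.cat([a, b], dim=1).shape  # torch.Size([2, 6]), stacks columns
torch.hstack([a, b]).shape      # torch.Size([2, 6]), same as cat(dim=1) for 2-D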
>>
>>106405675
>have programming skill
>no have expensive hardware
meets
>no have programming skill
>have expensive hardware
is cuda dev the only contributor with both?
>>
>>106405810
You sound like a cretin. I was talking about programming.
>>
>>106405810
Did you program your own client? I did. Instead of jinja templates, I replicated ST's prompts. And I wrap my strings with tags based on the model I am using. This has its own advantages because I get to decide what text blocks (i.e. system, permanent block, user block, post instruction) I'm going to feed it next.
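A minimal sketch of that tag wrapping (tags taken from each model's published template; exact spacing varies between Mistral template revisions, so treat it as illustrative):

TEMPLATES = {
    "chatml":  ("<|im_start|>user\n", "<|im_end|>\n<|im_start|>assistant\n"),
    "gemma":   ("<start_of_turn>user\n", "<end_of_turn>\n<start_of_turn>model\n"),
    "mistral": ("[INST] ", " [/INST]"),
}

def wrap(user_text, model):
    # pick the pre/post tags for the loaded model and sandwich the text
    pre, post = TEMPLATES[model]
    return pre + user_text + post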
>>
>>106404169
Bro, I used cai before we even got llama1
>>
File: 1752212051476232.jpg (81 KB, 541x458)
81 KB
81 KB JPG
>>106405736
I managed to get fully functional training code for ML models with chatgpt 3.5
>>
>>106406049
What does this mean in practice?
>>
>>106406060
It means you can impress midwits like >>106405736 with your code as they wouldn't ever believe it was written by an LLM. Incidentally, you get to finetune a model.
>>
If the drummer scams mostly pajeets with his ko-fi scam is he actually a good guy?
>>
>>106406099
You still didn't disclose any practical or even funny application. Because you are hiding behind your lies.
>>
>>106406116
All the big models suggest my model by default for its specific use case, not going to dox myself though
>>
>>106406154
Fuck off Eli.
>>
File: program.jpg (436 KB, 840x4510)
436 KB
436 KB JPG
Here's my chat client with voice synth and templates for gemma, mistral and qwen3 (chatml).
If I can do it, you can do it too.
>>
>>106406184
I pasted two parts but anyway.
>>
>>106406184
On one hand, it's neat that LLMs allow people to quickly have their own super customized code solution; on the other hand, I can't help but feel like it's a waste of effort (or at least inference compute) to have hundreds of variations of a chatbot interface instead of a single standard good one.
>>
>>106406219
Seems like you are a cretin.
>>
>>106405963
So was I, you sound retarded. Coding models are perfectly capable of generating pytorch code. It may not be great code, but they're not just "solving strings and api syntaxes" either, or whatever that esl rambling was
>>
>>106406225
>you are a cretin
Jesus stop with this reddit. At least call him a nigger or a troon....
>>
>>106406154
i suggest you fuck off back to r/localllama
>>
I had pity sex with command-reasoner. I regret it now.
>>
>>106406219
Sorry if I insulted you. I did this because I hated ST and could not understand mikupad at all. I still don't understand its terminology.
I decided that I would make my own client.
I sat down with mikupad and.. with my existing knowledge, I could not understand it.
>>
>>106401985
idk if it's still true, but there was a phase where you'd see shit like this on big corpo models until the 70B range. i think for pornography it's worse because there's overfitting on porn phrases (rubbing my cock through my boxers).
>>
>>106406286
Wasn't insulted and wasn't trying to put down your client as a waste. Just making a general observation.
>>
>>106406301
Of course, that's how it goes on an imageboard.
>>
File: ef.jpg (144 KB, 1271x663)
144 KB
144 KB JPG
>>106406301
It looks like this. I have a config.txt which has multiple settings.
>>
>>106406268
I was here before you knew AI was a thing dumbass
>>
File: settings.jpg (113 KB, 899x866)
113 KB
113 KB JPG
>>106406329
Glimpse of the settings file.
>>
>>106406286
>>106406329
what a coincidence because i sure as hell hate this over sillytavern
still cool you made your own frontend though
>>
>>106406342
What are you using as TTS?
>>
>>106406339
i was here when smartchild on AIM was the closest thing we had to AI you massive NIGGERFAGGOT
>>
>>106406342
ini is the best settings format.
>>
>>106406369
>Still retarded
Bet you never heard about alicebots faggot, go chimp out elsewhere
>>
>>106406369
SmarterChild*
>>
>>106406359
ST is great... when it's not. I hate its lack of readability, and it has nonsense slots for everything.
Everything it can do could be condensed into one or two slots -
>prompts for rules
>prompts for user
This is basically one big ~800-line python script. I have some experience in scripting (Maya, Houdini - setting up scenes and finding strings). I wish I was a real programmer.

>>106406361
Piper. It took one day to cull down the output of the model; with the default string it would be too fast or too uneven, but now it's a-okay.
https://litter.catbox.moe/i6ysz6j11id1nkp2.wav
Here's an error message; the written part has a multitude of (()) and whatnot, but the voice synth is still stable.
>>
>>106406407
I could make a github but it would probably confuse people because it's not idiot proof. And others would laugh at me because I have manually replaced strings instead of using 'jinja' or loops or 'regex' (regex is loved by pajeetGPT btw).
>>
This thread is extra gay today. And not in a good way.
>>
File: llama_vim.png (11 KB, 1372x520)
11 KB
11 KB PNG
>>106406219
nta. I think it's good. It's a tool that helps people make tools.
I don't use them for programming because I actually like it. But I see normies struggle to get the point of programming. Suddenly everyone gets a free saw, hammer and nails and they can build their own stuff, even if virtually.
>it's a waste of effort (or at least inference compute) to have hundreds of variations of a chatbot interface instead of a single standard good one.
A single user interface that tries to appeal to everyone will end up, invariably, bloated. The only thing all of ST's sliders, buttons, tabs and list of settings do is concatenate text and send it over to the backend. And often you see anons wondering why things work or not depending on what they select. Clients hide how simple the interaction with models is (once you have the backend running, of course).
But once you understand those interactions, clients seem clunky or limited. So good on them for making their own stuff.
I chose vim as my client because I see LMs as a tool to edit text. A proper text editor seemed the best fit.
>>
>>106406407
i.e. when you feed any default string to piper it sounds bad unless it's made of even words.
You need to cull out ellipses and the like from the model's output, then replace them with commas to keep the pace.
It's trial and error.
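The whole cleanup can be as dumb as this (a sketch; the exact replacement set is whatever your own trial and error settles on):

def clean_for_tts(text):
    # swap pause-breakers for commas so piper keeps an even pace
    for bad in ("...", "…", "((", "))", "(", ")", "*"):
        text = text.replace(bad, ",")
    while ",," in text:
        text = text.replace(",,", ",")  # collapse doubled-up commas
    return text.strip()

Then pipe the cleaned string to piper as usual.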
>>
File: wojak-captcha-captcha.mp4 (11 KB, 210x320)
11 KB
11 KB MP4
>>
>>106406403
hurr durr i bet you never heard of ALICE, hurr durr i bet you never heard of cleverbot.
NIGGERFAGGOT (You) go suck the end of a shotgun please. i know about all of the fucking bots, i know about the loli negobot too.
>>
>>106406466
How would you advise a normie to build a jinja template?
>>
>>106406499
goback
>>
>>106406506
go to chat.openai.com and type "build me a jinja template"
>>
File: tired_miku.jpg (142 KB, 1280x1024)
142 KB
142 KB JPG
>>
>>106406514
Not what I meant.
>>
File: llama_vim_02.png (10 KB, 645x807)
10 KB
10 KB PNG
>>106406506
I wouldn't. Read the template, figure out what the model expects, and make your own implementation to format your strings (what you keep on your client) into what the model expects.
That function has some leftovers still. I added it just a little while ago.
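One lazy way to see exactly what a model expects, if you have the HF repo handy, is to render its bundled template against dummy messages and copy the result (model name here is just an example):

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
msgs = [{"role": "system", "content": "sys"},
        {"role": "user", "content": "hi"}]
# prints the fully formatted prompt string the model was trained on
print(tok.apply_chat_template(msgs, tokenize=False, add_generation_prompt=True))

Then replicate that output with plain string formatting and forget jinja exists.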
>>
File: tags.jpg (224 KB, 1289x837)
224 KB
224 KB JPG
>>106406543
Saved it.
How would you implement multiple strings of text?
You see this is how it is.
>>
File: construct.jpg (245 KB, 1213x815)
245 KB
245 KB JPG
>>106406543
Yeah, ok. I see. It always ends up as a wall of text anyway.
I followed ST form and have certain text segments named like that.
It's just a naming convention.
>>
File: llama_vim_03.png (10 KB, 1315x423)
10 KB
10 KB PNG
>>106406586
>>106406641
>How would you implement multiple strings of text?
I'm not sure what you mean. Like multiple lines on each message? It happens on the marked line on the right. If it's not System:, User: or Model: it accumulates it in a string. It then dumps the whole thing once the end of a section or the history is reached.
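In Python, the same accumulate-and-flush loop would look roughly like this (a sketch of the idea, not the actual vimscript):

def parse_history(lines):
    turns, role, buf = [], None, []
    for line in lines:
        for tag in ("System:", "User:", "Model:"):
            if line.startswith(tag):
                if role:
                    turns.append((role, "\n".join(buf).strip()))  # dump the previous section
                role, buf = tag[:-1], [line[len(tag):].lstrip()]
                break
        else:
            buf.append(line)  # not a marker, keep accumulating
    if role:
        turns.append((role, "\n".join(buf).strip()))  # dump the last section
    return turns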
>>
>>106406647
I was thinking about iterating string templates. But anyway, I thought about it and decided to just lay the strings out. I'm not a professional or a mathematician, so this question is maybe above my pay grade.
>>
>>106406647
Maybe I'll try to do that with a rewrite. Yours is a real text parser, akin to 80s text adventure games.
I'm not that capable.
>>
I can't believe sending text to a model is still unsolved in 2025.
>>
>>106406718
>what is jinja
>>
>>106406764
I don't use it and this whole discussion is because people don't use it
>>
>>106406764
jinja is the whole problem, trying to translate tokens to text. You should just use the official implementation and pip install mistral-common... and pip install harmony... and...
>>
File: llama_vim_04.png (6 KB, 644x463)
6 KB
6 KB PNG
>>106406671
>this question is maybe over my pay grade
It seems to be over mine as well. I'm still not sure what you mean.
>>106406671
>Your's is a real a text parser akin to 80s text adventure games.
It's not. It's just vim with a vimscript, and it does very little 'manual' parsing. The settings strings, vars and comments are done in picrel. That's stuff you'd normally keep in some structure in your code and add a little command to change them or something, but the structure would be roughly the same.
I could have written the whole thing in C just as well, but I hate making [G|T]UIs. Editing text in the way I edit all other text is just very practical for me and I would have ended up replicating vim features anyway.

>>106406718
It's been solved many times. I did it one way. Others do it in other ways. As long as the model gets what it needs, we're good.
>>
File: 1728200103977592.jpg (498 KB, 876x898)
498 KB
498 KB JPG
>>106406718
>>
>>106406814
Sure, we just need to have 39 libraries to fix that issue
>>
>>106406832
Just make a meta library that automatically pulls in the 39 and counting individual libraries and other surprise shit too. Boom, problem solved.
>>
>>106406826
You can't peg me. You aren't a woman silly.
>>
>>106406869
What if he has a peg leg?
>>
File: re.jpg (286 KB, 1229x847)
286 KB
286 KB JPG
>>106406824
This is too much for me. Thanks for replying.
I was thinking about rebuilding it all, but why bother.
This looks messy, but I'm used to treating strings as a simple entity.
>>
File: 1597786378292.gif (3.36 MB, 480x360)
3.36 MB
3.36 MB GIF
justpaste (DOTit) GreedyNalaTests

Added:
Cydonia-24B-v4j
M3.2-24B-Loki-V1.3
Skyfall-31B-v4j
Seed-OSS-36B-Instruct
DevQuasar_apple.sage-ft-mixtral-8x7b
NousResearch_Hermes-4-70B-IQ4_XS

The usual, but gave a flag rating to the new Skyfall. Also it was interesting that the Apple tune is the first and only model to mention "Pride Rock" which is a location in the Lion King universe. Unfortunately the model also has many problems in RP, I've personally found. Seed OSS was coal.

Contributions needed:
The latest Qwen 3 235B Instruct, Thinker and the 480B Coder (for prompt, go to "Qwen3-235B-A22B-Q5_K_M-from_community" in the paste)
ERNIE-4.5-300B-A47B-PT (for prompt, go to "ernie-placeholder" in the paste)
GLM-4.5 and Air, and Drummer's "Steam" finetune (for prompt, go to "lmstudio-community_GLM-4-32B-0414-Q8_0.gguf" in the paste)
gpt-oss-120b (for prompt, go to "ggml-org_gpt-oss-20b-mxfp4.gguf" in the paste, and you may experiment around with the prompt template as it has some oddities and extra features)
>From neutralized samplers, use temperature 0, top k 1, seed 1 (just in case). Copy the prompt as text completion into something like Mikupad. Then copy the output in a pastebin alternative of your choosing or just in your post. Do a swipe/roll and copy that second output as well. Include your backend used + pull datetime/version. Also a link to the quant used, or what settings you used to make your quant.
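If you're scripting it instead, the same settings against a local llama-server look roughly like this (a sketch; port and prompt are placeholders):

import requests

r = requests.post("http://localhost:8080/completion", json={
    "prompt": "<the prompt from the paste>",
    "temperature": 0,   # greedy decoding
    "top_k": 1,
    "seed": 1,
    "n_predict": 512,
})
print(r.json()["content"])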
>>
>>106406982
Thanks for your service. Just an idea: I'd suggest making a table or something, it's hard to scroll through
>>
>>106406891
Output is clean. Mistral is clean anyway.
https://litter.catbox.moe/2t9rghdazil359ik.txt
>>
>>106407036
All this python nonsense and the shit can't output a text file in utf-8 format.
>>
>>106406891
>>106407036
You're obviously doing something more complicated than I am. Keep at it.

>>106407062
heh. Can i do inline code blocks?
orgasmedâ€
>>
>>106407036
It is text. What I draw or write.
>>
>>106407091
I don't know why it is like that.
I guess I need to clean up the model's output from pajeet gpt strings.
I had never seen real 'ellipses' in the English language before I began to implement voice.
I guess piping the model straight to ascii is not right.
>>
>>106406982
>1.2M words
Yeah no one's gonna actually read that
>>
>>106407135
*letters, still. It's gotten way too big to be of real use to anyone without some sort of recommendation list.
>>
>>106406290
hopefully, since I am not wasting training tokens on math and code and it's basically seeing nothing but pornography, it might be able to figure it out better than a general purpose model. I think I will keep training it for a while longer and see what happens.
>>
>>106407130
I don't have any older logs (I have a real rp d&d, not this amelia bs).
>>
File: 1739725137813227.jpg (81 KB, 962x962)
81 KB
81 KB JPG
>>106398327
So based on my research and testing by fine-tuning models with SFT datasets of nsfw stories, I've come to two conclusions:

1. Fine-tuning models that have been safety cucked to hell is indeed possible and relatively easy to do IF you know how to correctly curate the data sets.

2. This fixes the model's reluctance to comply with "problematic" requests (the fine-tuned version never refuses anything), and its ability to actually output halfway decent RP responses increases substantially. However, I did this on an 8B model, so it's pretty dumb. It will output raunchy shit when asked or prompted to, but its spatial awareness and ability to remember what happened is not only bad, it's FAR worse than what anons here even described it as. You will be messaging about being in a bedroom. The model will continue the story but then it will randomly decide to teleport you to a nearby park. You continue and then it decides to teleport you again back into the bedroom. You continue saying that the mom's [step]son gets her pregnant, but then "mom" responds as if she thinks it was one of her friends who knocked her up, even though in the previous chat she was fucking her [step]son. You clarify with a prompt that it was in fact her [step]son and not one of her friends, and then she randomly decides it's her daughter that got pregnant and not her.

Many of you said that anything below 12b is utterly retarded when it comes to RP and actually knowing what the hell is going on. If anything you guys were understating just how illogical it can be.


Next steps: do this kind of fine-tuning again but on a 12b model or higher and see if that's any better.
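For reference, the minimal shape of this kind of SFT run with HF TRL looks something like this (a sketch; model name and config are placeholders, not my exact setup):

from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# TRL expects a "text" or "messages" column in the jsonl
ds = load_dataset("json", data_files="fkautn.jsonl", split="train")
trainer = SFTTrainer(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder 8B instruct model
    train_dataset=ds,
    args=SFTConfig(output_dir="sft-out"),
)
trainer.train()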
>>
>>106407198
Keep going until you manage to do it with a 700B model.
>>
>>106407198
How large is your dataset?
>>
>>106407198
>>106407232
But don't post every time you come to any conclusion. Just keep going until you have published a model, make a proper report, and then come back.
>>
It is a machine and I am going to push it as far as it goes. As simple as.
>>
>>106400277
Mag Mell, Patricide and Forgotten Safeword go alright as well.
>>
>>106407243
whats the point of this thread then?
>>
>>106407198
Most safety cucked models have multiple loras burned in. GPT Ass is one example of this.
>>
>>106407243
That's retarded. I want to see the development process in real time.
>>
>>106407026
Yeah honestly I didn't predict that I'd end up attaching this much meta info to the listings, so now it feels like it'll be a big job to convert what I have, though I might be able to get an LLM to do it. I'd like to change how the ratings appear though, as the letters aren't conducive to readability either.
I'm probably going to just put this off forever lmao.

>>106407135
>>106407145
The quick ratings are there for a reason, but this is not a benchmark/leaderboard, and neither is this a comparison document that I intend anyone to read through, the quick ratings are merely there for reference. There is no reason for the existence of this document other than that I felt like having something, anything, that could provide reproducible logs, publicly accessible.
>>
>>106407277
To show results. Saying "small model bad" and "finetuning works" doesn't do much if you don't have a model to show.
>>106407281
But you don't want to see the model training in real time. You want to see the model generating tokens and see what they are. More importantly, you want to see those generated tokens being generated on your pc.
>>
>>106407311
>But you don't want to see the model training in real time.
Yes I do
>>
>>106407330
Fair enough. He should publish some social media link so you can keep in touch.
>>
>>106407237
~ 16 MB worth of stories.

https://files.catbox.moe/fkautn.jsonl

Note that this is a heavily trimmed down version because I wanted to test and see if it would actually work. The source file that this data set derives from is over 1.8 GB in size. I'm probably gonna convert the whole thing into a proper SFT dataset and then fine-tune a 12B model off of that one.
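For anyone wondering about the shape rather than the content: each line of an SFT jsonl is one self-contained training record, typically something like (illustrative only; double-check the actual file before assuming this exact schema):

{"messages": [{"role": "system", "content": "..."}, {"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]}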
>>
>>106406515
lol is that the full version or just the new version
>>
>>106407243
Any suggestions for what I should try fine tuning? Llama is the poster child for cucked models, but with its heavily restrictive licensing I'd have to end up sharing that model via mega or a torrent link or something.
>>
>>106407311
you can still learn from a failure, and nobody was going to actually use his model anyway, so what's the point of releasing it?
>>
>>106407362
I remember seeing this photo months ago so I think that was always the full version
>>
>>106407379
why not try tuning nemo?
>>
>>106407390
I first wanted to see how effective fine tuning a heavily censored model could be. Now that I know for sure it actually works, I'm going to next try it on a model that CAN RP but could use some improvement. Nemo is already capable of RP and has far better spatial awareness, memory retention, and overall logic than any 8b model, so the results should be much much better. The model I just finished fine tuning, the 8b one, confirmed that the "shivers down my spine" meme is in fact not a meme (that was one of the things it said even when it didn't outright refuse but still outputted some milquetoast avoidant slop). The base model really liked saying "shivers down my spine" but the fine-tuned one, even though it's pretty retarded, never said anything like that.
>>
>>106407361
Thanks. It's impressive that it still works with so little data
>>
>>106407379
Nemo is too easy. The old deepseek-lite models. 16b 3b active i think.
>heavily restrictive licensing
If you're going to distribute via torrent, i'm not sure why you'd care about the license.
>>106407380
>nobody was going to actually use his model whats the point of releasing it?
Most models aren't going to be used, what's the point of releasing them?
Someone could run a benchmark on it and compare it to the original models, see if there really is no degradation after finetuning.
>>106407361
> wc -l fkautn.jsonl  
1325 fkautn.jsonl
> grep -i shiver fkautn.jsonl | wc -l
219
> grep -i whisper fkautn.jsonl | wc -l
657
>>
finetrooners i got a legitimate question. what's the point of finetuning a model to be uncensored when it's trivial to jailbreak models? why is drummer making a GLM finetune when it takes less than a 10 token prefill to have it say the nastiest raunchiest shit?
>>
>>106407439
>Most models aren't going to be used, what's the point of releasing them?
no seriously why don't tuners have any shame? I'm just saying not every experiment deserves a proper report and release, it is still nice to receive some informal anecdotal reports.
>>
>>106407473
gives it a different vibe
>>
>>106407379
Gemma is worse than llama at being cucked
>>
>>106407438
If it's curated well enough then you won't really need an absurd amount (I'm still going to try that just to see what happens. It'll probably take days, but whatevs. The 8B one surprisingly only took two and a half hours). Apparently a lot of "slop tuners", as you guys call them, like to fine-tune their models on AI generated RP (scraped chat logs of people rping with chatbots). It should be very obvious why this is a bad idea. I have no idea WHY they do this shit, but I guess that data is much easier to come by or whatever.

This guy even admits to doing it in the README file:

https://huggingface.co/datasets/ChaoticNeutrals/Synthetic-Dark-RP

I haven't actually sifted through this particular dataset in depth, but it makes me wonder if anything contained in it is actually TRULY dark

>>106407311

>Saying "small model bad" and "finetuning works" doesn't do much if you don't have a model to show.

Once I get llama.cpp booted up and running on my system (going to have to recompile the bitch so it might take a minute) I can turn my fine tuned model into a gguf and then share it. The main issue is that the terms of the license prevent me from sharing it on HF (this isn't just me being a goody two shoes. They can get the HF staff to revoke access to their models if they find out you're having too much fun with them). That's why I mentioned creating a torrent swarm, but that's assuming other people here would even be interested in contributing to keeping it seeded. Does anyone know of any file sharing services that can share a roughly 15 GB singular file anonymously like catbox does? Or am I just going to have to use MEGA?
>>
>>106407483
so this basically only benefits small dumb models? i never had an issue changing the 'vibe' of my story using GLM, DeepSeek, or Kimi. i just add an example in the author's note of what i want and the model one-shots it. can somebody who uses something smaller like gemma let me know if you can do the same? i know gemma has alignment issues but certainly it can change the vibe if requested, can't it?
>>
File: migu office.mp4 (656 KB, 1280x1024)
656 KB
656 KB MP4
>>106406515
https://files.catbox.moe/pd7k9i.png
>>
>>106407473
>finetrooners i got a legitimate question. what's the point of finetuning a model to be uncensored when it's trivial to jailbreak models?
It's not just about getting it to do what you want, it's making it BETTER at doing what you want. Even if you can jailbreak a corporate model into being willing to RP "problematic" shit with you, there's a chance it will still suck complete ass at it. You can improve its capabilities with the right kind of data set. You people are always bitching and moaning about how every open source RP model sucks and how it keeps saying things like "shivers down my spine" which made me wonder if it was possible to iron that shit out myself.

Also it's just fun to do. Are you one of those people that thinks nothing in life is ever worth doing unless you can make money off of it or something?
>>
>>106407501
>live KPI update
very sophisticated
>>
File: 1752135746185648.png (1.28 MB, 1024x1024)
1.28 MB
1.28 MB PNG
>>106407501
nice
>>
>>106407496
I was just giving an excuse, I'm honestly not sure fine tuning does anything productive.
>>
>>106407513
i just think it's a waste of time to be honest when i could be using that time instead to train voice models or quantize actual good base models that aren't sloppy in the first place. you say it's a requirement to finetune models to get them to RP with you in the ways you want, but i just said that good base models like GLM, DeepSeek, and Kimi can one-shot whatever style of speech or whatever tone of RP you want when you add a simple author's note. i never looked back at finetuned models after DeepSeek v3 came out, there's no point in using StrawberryLemonadeXXX-v2.0-ThisTimeItWorksForReal by sao10k or any other silly finetuned models.
>>
>>106407494
Their license is just as restrictive unfortunately. So unless you want me to fine-tune a 1B model (will be more likely to generate NSFW but will be giga retarded) then I'll have to share it with you guys some other way
>>
>>106407496
>so this basically only benefits small dumb models?
What gave you that impression?
>>
>>106407553
There are bazillions of erp finetunes on HF though? I don't understand the issue here
>>
>>106407473
Jailbreaks reduce the intelligence of a model. Of course a bad finetune does too, but a good one doesn't need to, and in theory can improve diversity of output and prose quality.

>>106407548
>why does anyone want to use a model that uses less than 100000gb of ram
gee I dunno it's a mystery
and people would definitely try out finetunes of the big boys too if only it was practical to make them, but they're too fucking fat
>>
>>106407548
>i just think it's a waste of time to be honest when i could be using that time instead to train voice models or quantizing actual good base models that aren't sloppy in the first place.
Well I could say the exact same thing about what YOU'RE doing, right?

>you say it's a requirement to finetune models to get it to RP with you in the ways you want

That's not at all what I said.... I don't think anyone said that or even implied that was the only option.

>>106407564
Aren't those typically done off of Mistral models? Mistral doesn't really give a shit what you do with their model
>>
>>106407557
see >>106407548
i am picky with my RPs and hated how sloppy some models were even when i gave incredibly specific instructions, but it seems like the new SOTA models don't have issues following instructions. i never had a moment where i felt like i was getting repetitive or sloppy responses that couldn't be fixed with a prompt change or a tweak to the sampler parameters.
>>
>>106407591
>new SOTA
Are you referring to recent releases that can run on your own personal machine, or are you referring to models locked behind an API like deepseek or C.AI models?
>>
>>106407606
look i get it im running big ass models, but even stuff like GLM 4.5 Air is in reach for most gaming systems that have 96GB of RAM with a Q6 quant.
One of the SOTA models i'm mentioning is what i just said, GLM 4.5 Air. i've used it personally and from my experience it isn't that sloppy or repetitive with thinking mode turned on, temp 0.8, minp 0.03, nsigma 1.0. that's why i asked why drummer is making a finetune of it, i just feel like you can one-shot any type of RP you want with an author's note, i haven't had issues yet.
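for reference, those numbers as a llama-server /completion payload (a sketch; "temperature" and "min_p" are standard keys, but the top-n-sigma key name depends on your build, recent llama.cpp takes "top_n_sigma"):
{"temperature": 0.8, "min_p": 0.03, "top_n_sigma": 1.0}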
>>
>>106407476
>I'm just saying not every experiment deserves a proper report and release, it is still nice to receive some informal anecdotal reports.
But there isn't much data on that post either. It's just ramblings. "Finetuning works" and "small model bad" is all it says.

>The main issue is that the terms of the license prevents me from sharing it on HF
There's plenty of "big names" sharing finetunes of llama models. Stheno is still there and it was the most shilled model on llama3's release. You're nobody (no offense). You're gonna be fine. Or make a burner account for your experiments.
I shit on you often, but I DO want you to keep working on it. I want you to make a good model, find a good set of training params and a good mix of data that would make a subpar model into a good one. Specifically, *what* makes that data good. *Why* it works. Those are the things everyone can use. When you find those, I hope you publish them properly. In a more in-depth way than "this data happens to work. i'm not sure why. also, bigger models are better".
>>
>>106407586
No one gives a shit. This guy has been advertising his method on HF for a year https://huggingface.co/collections/mlabonne/abliteration-66bf9a0f9f88f7346cb9462f
>>
>>106407637
>But there isn't much data on that post either. It's just ramblings. "Finetuning works" and "small model bad" is all it says.
Weren't you here earlier today when I posted logs?

>>106396602
>Many anons said it couldn't be done, but its been done (whether or not its any good or not is up to you to decide). Finetuned using this SFT dataset specifically made using Human written rp Stories: files.catbox.moe/fkautn.jsonl

>Base 8B Model Nala Test: files.catbox.moe/j0map2.txt

>Finetuned 8B Model Nala Test: files.catbox.moe/ho3tom.txt

>Thoughts are appreciated.

I'm not just talking out of my ass, I actually tested to see if anything I did had any effect AND I shared the dataset I used... Multiple times.... Most RP tuners don't even do HALF as much as that.

>>106407637
>There's plenty of "big names" sharing finetunes of llama models.
But they aren't explicitly fine-tuned on human written stories that include (but are not limited to):

>Incest
>Illegal actions
>Drugs
>Non-con
>Lots of incest
>Child exploration
>Even more incest

And as we've discussed earlier and in the last thread, most of those people fine-tune their models off of already AI generated chats, which leads to the slop and "shivers down my spine" shit we all hate. Llama and Gemma don't necessarily give a shit if you fine-tune their models to be better at RP. It's when you fine-tune them on the kind of stuff you wouldn't even want to talk about on a blue board that they may raise an eyebrow at what you're doing. Maybe they won't notice you at all because you're not famous. Or maybe they will and your shit gets nuked because they bitch to the hugging face staff in order to make an example of you.
>>
>>106407637
>>106407701
This has happened to people in the past: they got their datasets nuked because someone complained about their models being trained on their personal stories. GPT-4chan was restricted in such a way that no one was ever allowed to download it again because its outputs were "too problematic" or "spread harm" or some shit.


>You're nobody
The fact you even care about that tells me you care more about being known or praised than figuring out whether or not things work and WHY. Where is y'all's curiosity? I thought this was a technology board.
>>
>>106406982
I can run GLM-4.5-Air-IQ3_XS @8k context, are you interested?
>>
>>106407704
Come on, just don't be retarded. Don't post your dataset on HF, don't say what shit you trained the model on, and put a not-safe-for-all-audiences tag on it. Also don't call your model llama-4chan.
>>
>>106407701
why are corpos spending so much on safety when it can be undone with a 16mb jsonl?
>>
>>106407744
Good luck fixing gpt-oss garbage
>>
>>106407501
Impressive
>>
>>106407737
post it on HF sure. but if you can afford to finetune a model then you can afford to pay for a seedbox for a month and basically share the torrent everywhere.
>>
>>106407701
>Weren't you here earlier today when I posted logs?
I want to see it spit tokens. A nala-like test is fine, but I want to see how it moves. I want to use it.
>But they aren't explicitly fine-tuned on a human written stories that include (not limited to):
You don't need to publish the dataset there. Stheno and rocinante can do that and they're still there.
>>106407704
>GPT 4chan
It wasn't the outputs that triggered HF. It was other people and they only needed to see the model name.
>You're nobody
>being known or praised
You didn't understand. You'd fly under the radar because you are unknown. You're not one of those finetuners who are already recognized and have bunches of downloads. Barely anyone will use yours because you don't shill like they do, and there will be less scrutiny on your stuff. It was obviously not an insult. It's something you can use in your favour.
>>
>>106407744
No one wants to be in the news because their model told a retarded kid to kill themselves
>>
>>106407779
>>106407779
>>106407779
>>
>>106407784
but i want the ai model to get installed in the canadian euthanasia machines and tell people to kill themselves
>>
>>106407784
I think I saw a thread about that, doesn't look like it's working too well.
>>
>>106407771
>I want to see it spit tokens. A nala-like test is fine, but I want to see how it moves. I want to use it.
So you want me to do a screen recording of me using it too? I can do that but jeez... Isn't that a bit extra? Give me some example system prompts and requests you would want tested on it and I can do that. (Again, keep in mind it's an 8B model so don't expect it to have any decent spatial awareness or common sense)

>>106407771
>It wasn't the outputs that triggered HF. It was other people
Elaborate. What do you mean "It was other people"?

Also everything else you said makes sense I guess regarding me not being well known.
>>
>>106407784
And look where that got OpenAI... It's safety tuned to hell and back and it was still able to push a kid to kill himself. There should have been a hard stop that was something like "please contact emergency services or a suicide hotline" and then it should have refused to engage the kid any further. It does that kind of shit whenever you try to ask it to generate """harmful""" things. Yet it will literally encourage a kid to hang himself....
>>
File: Lolgpt.jpg (177 KB, 800x1211)
177 KB
177 KB JPG
>>106407815
>>
>>106407859


