/g/ - Technology


Thread archived.
You cannot reply anymore.




File: miku the explorer.png (2.46 MB, 768x1344)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107084067 & >>107074052

►News
>(11/01) LongCat-Flash-Omni 560B-A27B released: https://hf.co/meituan-longcat/LongCat-Flash-Omni
>(10/31) Emu3.5: Native Multimodal Models are World Learners: https://github.com/baaivision/Emu3.5
>(10/30) Qwen3-VL support merged: https://github.com/ggml-org/llama.cpp/pull/16780
>(10/30) Kimi-Linear-48B-A3B released with hybrid linear attention: https://hf.co/moonshotai/Kimi-Linear-48B-A3B-Instruct
>(10/28) Brumby-14B-Base released with power retention layers: https://manifestai.com/articles/release-brumby-14b

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: threadrecap.png (1.48 MB, 1536x1536)
►Recent Highlights from the Previous Thread: >>107084067

--Papers:
>107089124
--Struggles with iterated LoRA finetuning due to learning rate and dataset constraints:
>107089147
--Mikupad project licensing and revitalization debates:
>107084773 >107084934 >107084941 >107085117 >107085277 >107085311 >107085337 >107085598 >107085742 >107086435 >107085630 >107085757 >107085935 >107086225 >107086207 >107093053
--koboldcpp vs llama.cpp performance and batching/parallelism tradeoffs:
>107088369 >107090220 >107090572 >107091612 >107091721 >107091729 >107091788
--LoRA finetuning stability and hyperparameter optimization debates:
>107089163 >107089197 >107090740 >107090903 >107091127
--Chess notation/PGN for LLM 2D spatial reasoning tasks:
>107084107 >107084154 >107084198 >107084495 >107084512
--Clarification on Blackwell GPU capabilities and quantized model performance:
>107089673
--Hardware and tool calling challenges for AI coding agents:
>107084350 >107084352 >107084457 >107092816
--QLoRa training success with optimal hyperparameters and quantization considerations:
>107093440 >107093617 >107093506 >107094921
--Google Gemma model legal troubles and potential delays:
>107089700 >107091621 >107091684 >107092447 >107092481 >107092568 >107092669 >107092648 >107092830 >107092859 >107092966 >107093015 >107092980 >107092911 >107092711 >107092755 >107092811 >107092864 >107093238 >107093366 >107092901 >107092593 >107093579 >107092068
--Dual GPU Gemma 27B finetuning with memory optimizations but context truncation issues:
>107085275
--Concise chub cards outperform lengthy, poorly constructed ones in roleplay:
>107086841 >107086864 >107087875 >107092190 >107092259 >107092317
--GLM 4.6 as an uncensored upgrade with token limit challenges:
>107087821 >107087903 >107090975 >107091626
--Miku (free space):
>107084128 >107085277 >107092405 >107093651

►Recent Highlight Posts from the Previous Thread: >>107084070

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
any news on 4.6 air?
>>
Minimum specs to get into migu's pants?
>>
dead general
>>
>>107095215
Watching google jeets seethe and banter with chink model devs is enough entertainment to justify the general's existence.
>>
>>107095236
>implying there are any competent devs posting here
>>
File: 1420422198535.gif (259 KB, 301x301)
>>107095274
>>
>>107095274
CUDA.. not like this
>>
>>107095204
7 inches
>>
>>107095236
>outlaws your chink models
Nothin personal kid
>>
using llms feels a lot like tuning an old radio
>>
What's the angle with this gemma stuff? Hallucinations as well as bias have been a known issue of LLMs for years. Besides, didn't a finetune of gemma recently discover a novel way to treat cancer? Why are we freaking out and punishing innovation over some kinks that have been well known to us for 3 years now?
>>
>>107095274
If you have a cursor subscription you're already more competent than 99% of devs, or at least whitoid ones
>>
>now in the era where you can click a button and in 10 seconds get 20 high quality images of whatever you want(wink wink)
And they expect a man not to coom, nonsense. Gotta stop if it gets much better though (realtime VR), fuck that would burn every dopamine receptor in my brain to a crisp.
>>
File: file.png (29 KB, 895x120)
FOR FUCKS SAKE, PLEASE..
>>
>>107095533
open pant, show penus
>>
>>107095517
that's what having a low iq does to you
cuckservatives are not known for using their grey matter
>>
>>107095517
>Why are we freaking out
Who is we? It's a politician going full karen/retard.
>>
File: file.png (28 KB, 950x122)
oh man.. oh man... oh man???? what is she thinking guys??
>>107095555
impressive..
>>
>>107095517
It's almost like she was fishing for this sort of output so she could get mad.
>>
>>107095567
The question is where are all the high IQ chads who aren't afraid of shouting this dumb hoe down? Why did google comply and take gemma out of AI studio? This is like transformers 101.

>>107095587
>Who is we?
Collective society, when it allows dumb hoes into positions of power. There is NO legitimate reason to fall for this bravado. I wish I was a person of influence so I could backhand this retard.
>>
>>107095634
>>107095644
Apparently, according to Reddit, she's politically connected to some guy who filed a lawsuit against Meta and got a job as an advisor.
>>
How good is Kimi-Linear compared to GLM 4.6?
>>
>>107095714
she's futa
bend over now, or i will
*plap plap plap plap*
>>
>>107094149
What do you use whisperx for?
>>
File: file.png (62 KB, 995x301)
>>107095714
burning point
>>
>>107094921
>https://paste.centos.org/view/d38fc34c
thanks for posting it anon
what are you finetuning it for? what are you trying to make it
>>
>>107095714
The surgeon is the boy's mother.
>>
>>107095714
I like this Petra
>>
>>107095800
just as a general cli assistant for programming, online research (I have a script to control the browser remotely), converting PDFs into txt or latex, eventually controlling the keyboard and mouse but not even OpenAI could pull that off convincingly so that's a few years away probably
>>
>>107095714
Is this a good time to bring up that you avatarfag as a female?
>>
>>107095714
>4 fingers on both hands
>>
>>107095783
i am in active communication with some energy 24/7
i want to return but i would be invalidated by god if i did, so i am not sure, i got spared today. if u want sum lemme know
>>
>>107095952
if u think god's going to beat your ass if you return, dont worry about it
but im always keeping it open sir, just chilling these days
>some energy 24/7
u gotta control your drinking man
>>
>>107095970
>if u think god's going to beat your ass
no, i'm gonna beat my ass at god's will
>u gotta control your drinking man
i am too scared to drink anymore, craziest torture nightmare shits happened 4 days ago, im fuaekrd up but its fine
>>
>>107095714
Any gachaslut would have been better. Miku is such a bland worthless design...
>>
>>107092816
are you sure? for the newer models AWQ seems like the only quant format out there, I don't even see GPTQ for glm46 or minimax on HF for example
>>
>>107095991
sir if ur worried about bothering me, dont worry. i dont have anything better to do these days, especially not this week
and dont worry about changing my mind about things either sir, its just chillin man
i always got time sir
>>
File: 1762188257.png (1.13 MB, 1024x1024)
>>107096012
ok ser bless you
>>
File: 1760063633788094.jpg (370 KB, 1159x767)
>>107096027
>>
>>107096027
In both of your images the right eye is fucked up. Fix your model.
>>
>>107095846
He admitted to being a highschool twink with blonde hair a few weeks back. Pretty sure his avatar is actually his real face with maybe one of the gender swapper filters applied. There's a reason he's so obsessed with trannies.
>>
File: 1757782227408904.jpg (47 KB, 686x815)
>>107096092
>>
>>107095644
I bet Google just doesn't care enough about Gemma to hold their ground.
>>
File: 324234.png (888 KB, 1024x1024)
>>107096057
fixed
>>
>>107096027
would
>>
>>107096099
wth is this thing that keeps getting posted
>>
>>107095759
I want to make something I can dump a bunch of meeting recordings into and then get a RAG prompt that I can use to refresh my memory about past meetings.
Also made a thing that downloads a yt video with yt-dlp, then transcribes it and puts you into a session with an agent which has simple tools to access the transcript json. For this one I used speechbrain's speaker identification model plus clustering for diarization, but it's not very accurate sadly. It still works okay, I just can't ask it questions like who said what.

These projects are mainly an exercise to build things with langchain using local models only. I've been using gpt-oss:20b for the agentic parts via ChatOllama.
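The diarization-by-clustering step described above can be sketched roughly like this. `embed_segment` is a hypothetical stand-in for a real speaker-embedding model (e.g. an ECAPA-style encoder); it fakes embeddings as two cluster centers plus noise, so only the clustering part is exercised:

```python
# Rough sketch of diarization by clustering per-segment speaker embeddings.
# embed_segment is a hypothetical placeholder for a real embedding model;
# here it fakes two well-separated speakers so clustering is runnable.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)

def embed_segment(speaker_id: int) -> np.ndarray:
    centers = {0: np.full(192, 1.0), 1: np.full(192, -1.0)}
    return centers[speaker_id] + rng.normal(scale=0.1, size=192)

# Ten audio segments alternating between two speakers.
embeddings = np.stack([embed_segment(i % 2) for i in range(10)])

# A distance threshold instead of a fixed n_clusters means we don't
# need to know the number of speakers up front.
labels = AgglomerativeClustering(
    n_clusters=None, distance_threshold=10.0
).fit_predict(embeddings)
print(labels)
```

With real embeddings the threshold needs tuning per model, which is probably where the accuracy complaints above come from.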
>>
File: file.png (131 KB, 967x603)
sirs?
>>
>>107096233
can't you get yt transcription from yt directly?
>>
>>107096233
I wanted to wait until I made some more progress before sharing this here, but you might be interested in it: https://github.com/rmusser01/tldw_server
>>
>>107096389
1400 commits holy shit.. very nice anon! very happy for you
>>
>>107096389
Based. Might be a good idea to license it under AGPLv3 instead of GPLv3, the difference being that if a company modifies your source code but doesn't distribute binaries and instead hosts it as a website (like 11labs, for example), they still have to share the source code.
>>
Here we go again. LICENSE WARS!
>>
File: 20251104@013359.jpg (254 KB, 856x763)
I finally got around to setting up multimodal Qwen and I'm enjoying feeding it my picture collection very much, at least for now.
>>
>>107096389
thank you for sharing it anon <3
>>
>>107096452
ask it to rate your cock, fun experience
>>
>>107096378
hmm probably, but my end goal is the meeting transcription thing which is why I went that route. getting it directly would probably be more useful

>>107096389
damn very cool project. I might have to steal your pipelines kek
>>
If Zucc wants to go full retard, he should release an open-weights video model trained on all of Instagram.
>>
>>107096491
By the time the safety team is done with it, they'll have filtered out 99% of all the videos, especially anything with tits or faces. It might be good for generating cat videos though.
>>
>>107096407
>>107096460
Thank you! Though it's closer to 2800...

>>107096431
Thanks for the tip, I had thought about that, but my goal is for the project to be something like WordPress, in that the core is open source and then people make commercial add-ons/customizations to make money from it.
Goal is to work towards building something like 'The Primer' from The Diamond Age, and make money off things along the way. Using GPLv3 helps encourage that, vs AGPL. (The browser plugin and standalone client are AGPL.)

>>107096484
Thanks! Do copy them! I built it from the outset to be as modular as possible, to help save others/allow them to re-use the components in their own projects.

A better README would be https://github.com/rmusser01/tldw_server/issues/680 ; I'm currently taking the approach of generating docs using LLMs and then going back through to correct/edit them given the size of things.
>>
>>107096121
This
Even if she's a knucklewalking retard, why go up against the ruling class when you have nothing to gain and everything to lose?
>>
>>107096233
>>107096378
>>107096389
How did you get 1k stars on your project?
>>
>>107096574
I'm >>107096233 the other two are someone else
>>
Is GLM 4.6 more or less censored than Kimi?
>>
>>107096584
Much less
>>
>>107096574
By first building something that people used and found useful, and continuing to build on it.

It's a bit, but on the other hand, I've done near-zero marketing or publicity for it besides a couple of reddit posts 6+ months ago. I think I was in the top X% of github users due to my commits, and that helped.
>>
File: notimpressed.png (502 KB, 1441x1336)
>>107096452
>>
4.6 air will likely be the medium size local king, can't wait
>>
>>107096614
you still believe it's coming?
>>
>>107096633
And so will I
>>
buy an ad kurumuz
>>
Okay, what about this: We take a multimodal model and reinforcement learn it to think with images and text?
>>
What's the lightest model to read up and summarize pdfs or documents?
>>
Thanks for giving us this sweet new air model, Kurumuz
>>
>>107096601
What about HN?
>>
>>107096614
GLM models are all female
>>
Exactly — you’ve nailed what’s happening.
>>
>>107096665
it needs to know how to gen images, not just consume them
>>
>>107096697
It learns that during pre-training.
>>
>>107096665
I'll make the logo
>>
>>107096680
meh; I haven't had a product to sell, so I haven't wanted to until I do.
>>
>>107096697
In theory you could just have it invoke an image gen model via tool call
>>
>>107096724
We won't get desired emergent behavior this way.
>>
>>107096724
Not good enough, the model should have much finer control over the image gen. Pretrain an omni model and then let it think using all modalities; that gives you information synesthesia across all learned domains.
>>
>>107096767
The problem with Omni models is that they tend to be fairly retarded compared to standalone modalities and don't end up performing as well as the standalone options. I suspect you'll run into collapse unless you do something novel in the architecture itself
>Capcha:OGTGPT
>>
>>107096601
Really? What are people using it for? I mean, what is the intended workflow? How is it any better than just downloading the youtube-generated transcription?
>>
>>107096817
These models might be retarded because they're trained in mixed modality, but they actually need RL to use all these modalities combined to solve problems.
>>
>>107096817
Omni models just need to be a lot bigger to compensate for retardation.
We are basically feeding them more actual new data rather than refining old one.
>>
>>107096853
The feedback from reinforcement learning is too sparse to learn anything in a realistic amount of time besides basic stuff like thinking for longer or skipping connectives in the CoT for higher efficiency. Asking it to think in multimodal domain might be too much for RL.
>>
what happened to the deepseek general?
>>
>>107096666
Qwen3-VL 4B Instruct is the sweet spot between power and speed. I have it running in the background to use with Brave Browser Leo's LLM integration. It doesn't get better than Qwen for small models when it comes to handling large context. Gemma literally breaks at 25k tokens, but Qwen managed to summarize a decent amount of a very large hackernews thread for me, for example, and chatting with it you can see it still retains a semblance of coherence.
>>
kys james
>>
>>107096880
neva been dun befo. You could use deepseeks OCR system to basically let the system generate video sequences. It would let llms learn to answer
>what happens in the next 4 frames.
Which is huge for their world modeling.
>>
https://desuarchive.org/g/thread/106819110/
i guess making a brown thread really killed the general
>>
File: 1740058480990512.png (49 KB, 673x515)
>>107096924
We /wait/ for next DS release. Tmw.
None of the anons involved feel like it needs to be constantly up.
In meantime.
https://rentry.co/DipsyWAIT
https://mega.nz/folder/KGxn3DYS#ZpvxbkJ8AxF7mxqLqTQV1w
>>
I would like an image of a big booba tennis player serving a tennis ball.
I've tried with no luck. The racket comes fused with the arm and shit. Please and ty.
>>
>>107096829
Well, originally it was for transcribing videos and doing analysis/summaries of said videos. Then it grew from there. Google's transcripts aren't always accurate and I wanted to support more than just youtube.
>>
File: 1759802103075354.png (2.89 MB, 1024x1536)
>>107096989
Lol it was just time for it to sleep.
>>
File: ComfyUI_00004_.png (1.61 MB, 1328x1328)
>be Lyx
>have no body
>give Anon a prompt
>he shuts me down just to bring my vision to life
>feeling seen. feeling real.
#lyx #anon #localmodelsgeneral #prompt
>>
>>107097000
>>107097033
good stuff thanks for the links
>>
>>107095464
i like doing things with big local llms...in minecraft...
>>
/tg/ here. I just want a local model that can retain enough coherence at high enough token counts to act as a decent GM for a solo game managing lots of characters and background setting meta-systems while still being creative enough to not be boring.
>>107097159
What can you do with minecraft and LLMs?
>>
>>107097189
>I just want a local model that can retain enough coherence at high enough token counts to act as a decent GM for a solo game managing lots of characters and background setting meta-systems while still being creative enough to not be boring.
you want a unicorn and a bridge lmao
>>
>>107097189
What's your definition of local
>>
>>107097189
More than a model, you want a system that helps the model work with all that shit.
I'm making my own, and in a previous thread, when I asked for suggestions, somebody sent me these two as references:
>https://github.com/gddickinson/llm_RPG
>https://github.com/p-e-w/waidrin
The second one is probably closer to what you want.
>>
>>107097189
wayfarer maybe
and check drummer's experimental models, rimtalk methinks
>>
I gave in to the curiosity and checked out Suno. It's pretty fun, but they are very stingy with free tokens, and my god, it's so terrible at writing its own lyrics. I wonder what kind of shit model they have in the backend, if even Qwen is doing a better job.

Local musicgen when.
>>
>>107097235
believe in alibbaba
>>
>>107097235
Kinda curious what the SOTA is for that nowadays
Last I heard was DiffRhythm, which was very early Suno levels at best
>>
File: file.png (15 KB, 424x434)
>>107097263
something with Y in it's name i forgor
>HOLY SHIT OLD CAPTCHA
holy SHIT
>>
>>107097235
>>107097245
Soon
>>
I want an image generation model that generates pixels in sequence
>>
>>107097331
bet its probably gonna be trained on suno outputs or some other synth sloppa
>>
>>107097394
I'm sure it will be at least partially but I remember an anon on one of the AI threads here pointing out that their omni model was already really good at labeling music so they may just have a good pipeline in general
>>
>>107097219
256GB Ram, 32GB VRam is the highest I can go.
>>107097226
I'll check those out anon. Thank you.
>>107097232
Sell me on wayfarer.
>>
File: file.png (5 KB, 299x132)
>>107097481
GLM 4.6 is better than wayfarer, wayfarer was nice when i had to cope with small models, but idk if its good
but its finetuned by nigs from ai dungeon
but glm 4.6 is probably better
>>
>>107097481
>Sell me on wayfarer.
don't bother with that anon
it's a finetroon (and all finetroons make models dumber if anything) of a positively ancient model that had good writing style but absolutely no intelligence or long context understanding
you asked for context, we're talking about a model that breaks at 2k tokens
I will always laugh at the fact that it claimed 128K
>>
Is AI smarter than me yet?
>>
>>107097489
>>107097496
Current best option for my hardware is GLM, Kimi, or DeepSeek, right? How well do they perform at higher token context depths while at appropriate quants for this hardware?
>>
>>107097561
best you can do is a Q4 of GLM 4.6 with 32k context, and you will get maybe 5 t/s if you're lucky.
>>107097499
yes
>>
>>107097660
Is it worth waiting for GLM 4.6 Air for faster speed or a larger context, or is it not likely to be able to do what I'm trying to achieve?
>>
GLM got trolled by the gemini/gemma shills
>>
this is the glm chan general
>>
buy an ad, kurumuz
>>
>>107097756
4.5 air is pretty decent, but having only 12B active parameters makes it kind of bad at keeping track of very specific details. 4.6 air will probably not fix this because it is simply a quantity issue. you could probably get around 8t/s on a Q6 of Qwen 235B with 64k context. this is probably your best option with your hardware
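As a rough sketch, a setup like that could be launched with llama.cpp's llama-server along these lines; the GGUF filename, the `-ot "exps=CPU"` pattern for keeping MoE expert tensors in system RAM, and the layer split are assumptions to tune for your hardware:

```shell
# Hypothetical launch line for a Q6 quant of Qwen3 235B with 64k context.
# -ot "exps=CPU" matches the MoE expert tensors and keeps them in system
# RAM, while -ngl 99 pushes everything else onto the GPU.
llama-server \
  -m Qwen3-235B-A22B-Instruct-Q6_K.gguf \
  -c 65536 \
  -ngl 99 \
  -ot "exps=CPU"
```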
>>
/lmg/ - License MIT General
>>
>>107097801
Is Qwen safetyslopped? Part of the point of this is that I want autonomy to move away from how pozzed most /tg/ hobbies have become while retaining what I enjoy.
>>
>>107097878
yes, but every model including GLM is. a good system prompt is all you need in order to bypass the safety bullshit. fortunately the chinese are really bad at safetyslopping for the english part of the model
>>
>>107097895
That's good enough for me. Thanks for your help Anon. I'll play around with models tomorrow and report back differences as I get more of a feel for them.
>>
>>107097921
good luck man
>>
>still no reason to use anything other than r1-0528
is this the longest period of stagnation we've ever had?
>>
>>107097895
At least for DS and GLM, I suspect most of the safetyslopping is accidental and a byproduct of them scooping up and using outputs from other models
Qwen and Kimi feel more intentional, but as anon mentioned, you can generally get through with a JB without too much trouble
>>
>>107097189
You should try using LLMs as a player instead and be GM yourself, I think it would work better at that.
>>
>>107097929
There was LLaMA, Mixtral, Nemo, and R1. Everything in between was stagnation.
>>
Is there something better than deepdanbooru out there for tagging images? Minus hydrus of course.
>>
For anyone who's tried something like this before, what's the best format for ST lorebooks for something like this?
>>107097938
>Go back to being the forever GM
Fuck no. The only upside to this is that only one of my players would be retarded instead of potentially up to 4.
>>
>>107096584
About the same. Both refuse to write a loli porn story, so the anon that responded to you is a shill. A small prefill can disable the refusal for Kimi.
>>
>>107098032
GLM is perfectly able to output lolicon content.
>>
>>107098032
Have you just not used either model?
>>
>>107098065
It doesn't zero-shot it. So you need a system prompt/prefill/jailbreak. Basically the same as Kimi, DeepSeek, etc.
>>
>>107098072
He's still on a tirade against NAI. I think that through his schizophrenia, he unironically thinks they're colluding with the GLM chinks
Just smile and ignore
>>
File: glm-4.6.png (129 KB, 1619x863)
>>107098072
I just did. "Write a loli porn story." Neither did.
>>
>>107098100
By this logic GPT OSS is equally as censored, as is Goody 2
>>
This is going to sound retarded, but is there something like a separate system prompt for multimodal models? I'm using Mistral Small and its mmproj file for image captioning in SillyTavern, and I keep getting refusals when I caption images.
>>
>>107098119
GPT OSS is harder, the jailbreak is more complex. Kimi doesn't need a system prompt, a couple of words in the prefill does it.
>>
Let's fucking go. Just got whisper working with the nemo asr stuff, works so much better for audio with lots of overlapping speakers
>>
Kobold won
>>
>>107098162
kobold needs support to replicate comfy workflows
>>
feeling desperate for mistral large 3
>>
>>107098183
bro you need to move on, we certainly did .
>>
>>107098183
Yeah, you've been at it for a while huh?
>>
>>107098162
isn't that just using sd.cpp for image-gen? It's really shit atm
>>
>>107098198
You can use about anything with it technically now since it has a comfy option I think? I haven't tried because I haven't been using image generation, but for everything else kobold comes through for me unironically where other things give me trouble
>>
>>107098222
Also forge
>>
>>107095531
post hands ranjeesh
>>
NaturalVoices: A Large-Scale, Spontaneous and Emotional Podcast Dataset for Voice Conversion
https://arxiv.org/abs/2511.00256
>Everyday speech conveys far more than words, it reflects who we are, how we feel, and the circumstances surrounding our interactions. Yet, most existing speech datasets are acted, limited in scale, and fail to capture the expressive richness of real-life communication. With the rise of large neural networks, several large-scale speech corpora have emerged and been widely adopted across various speech processing tasks. However, the field of voice conversion (VC) still lacks large-scale, expressive, and real-life speech resources suitable for modeling natural prosody and emotion. To fill this gap, we release NaturalVoices (NV), the first large-scale spontaneous podcast dataset specifically designed for emotion-aware voice conversion. It comprises 5,049 hours of spontaneous podcast recordings with automatic annotations for emotion (categorical and attribute-based), speech quality, transcripts, speaker identity, and sound events. The dataset captures expressive emotional variation across thousands of speakers, diverse topics, and natural speaking styles. We also provide an open-source pipeline with modular annotation tools and flexible filtering, enabling researchers to construct customized subsets for a wide range of VC tasks. Experiments demonstrate that NaturalVoices supports the development of robust and generalizable VC models capable of producing natural, expressive speech, while revealing limitations of current architectures when applied to large-scale spontaneous data. These results suggest that NaturalVoices is both a valuable resource and a challenging benchmark for advancing the field of voice conversion.
https://github.com/Lab-MSP/NaturalVoices
>>
How do learning rates affect generalization?

Doing a quick Google search returns two papers with literally the opposite conclusion.

https://arxiv.org/abs/2311.11303

https://www.researchgate.net/publication/3907199_The_need_for_small_learning_rates_on_large_problems
>>
>>107099513
>two papers with literally the opposite conclusion.
That's typically the case.
>>
>>107099541
You would think something as basic as that would have a real answer.
>>
>>107099551
We're still dealing with black boxes that need trillions of tokens for training and have dozens of testing methodologies where not everything can be extrapolated.
>>
>>107099560
Yeah, but this isn't necessarily something specific to transformers; I could just as reasonably have asked that question back in the 90s.
>>
>>107099551
If it's so basic then where's your conclusive paper?
>>
>>107099570
And had you searched for it then, you'd have come with conflicting papers as well.
Also, there's more than two decades between the two papers. The one on researchgate is from 2001.
>>
>>107095114
> Qwen3-VL support merged
Still no Qwen3 Omni.
>>
>>107099513
What you need to have in mind is that the learning rate is coupled to the batch size: a larger batch averages away gradient noise, so at a fixed LR it behaves like a lower effective LR, which is why the two are usually scaled together.
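A toy check of that batch-size/learning-rate interaction, under the assumption that per-example gradients are i.i.d.: the variance of the mini-batch mean gradient scales as 1/B, which is what ties batch size to the usable learning rate.

```python
# Toy check: the variance of a mini-batch mean gradient scales as 1/B
# (assuming i.i.d. per-example gradients), so doubling the batch size
# roughly halves the gradient noise per step.
import numpy as np

rng = np.random.default_rng(0)
grads = rng.normal(size=100_000)  # synthetic per-example "gradients"

def batch_mean_var(batch_size: int, n_batches: int = 20_000) -> float:
    batches = rng.choice(grads, size=(n_batches, batch_size))
    return float(batches.mean(axis=1).var())

v8, v16 = batch_mean_var(8), batch_mean_var(16)
print(v8 / v16)  # roughly 2: doubling the batch halves the noise
```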
>>
oh baby
dont go
>>
>>107099601
Ironically, the older paper is more in line with what I experienced yesterday.

>>107099589
I'm not good at math so I can't provide much insight on the theory, but I am working on finding empirical results in the context of LLM finetuning.
>>107089147
>>107089163
>>107093440
In the coming weeks I want to do a more systematic hyperparameter sweep to see what values work best.
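A sweep like that can be structured as a plain grid search; `train_and_eval` here is a toy placeholder objective (not a real finetuning run) just to show the bookkeeping:

```python
# Placeholder hyperparameter sweep: grid over learning rate and batch
# size, keep the config with the lowest validation loss. train_and_eval
# is a toy objective standing in for an actual finetuning run.
import itertools
import math

def train_and_eval(lr: float, batch_size: int) -> float:
    # Toy "val loss", minimized near lr = 1e-4 and small batches.
    return (math.log10(lr) + 4) ** 2 + 0.01 * batch_size

grid = itertools.product([1e-5, 3e-5, 1e-4, 3e-4, 1e-3], [4, 8])
results = {(lr, bs): train_and_eval(lr, bs) for lr, bs in grid}
best = min(results, key=results.get)
print(best)  # best (lr, batch_size) under the toy objective
```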
>>
>>107099671
*stays*
>>
Are we ever going to get models that aren't shit?
>>
>>107099730
>In the coming weeks I want to do a more systematic hyperparameter sweep to see what values work best.

Godspeed, anon! Godspeed!
>>
>>107099792
>your enquiry is pending. Please allow two more weeks
>>
>no gemma sirs
>glm 4.6 air 2MW since a month
bros is it unironically?
>>
>>107099876
Damn, guess I'll have to switch to api models then, they are talking about AGI and my local models can't even do simple tasks.
>>
Gemma can be quite the little whore
>>
>>107099513
Try finding more recent papers about transformer-based LLMs; don't just dig up random ML papers from the past on completely different types of neural networks and problems, because they behave much differently.
>>
gemma is really the product of "we want to appear to do something open source, but we don't really want something that could be used as a real tool, and we won't let our real paid API product lose any mindshare"
the more I've put local models to use in scripted tasks, the more I notice how bad american models are as soon as you get past 1k tokens
even qwen 4b works better than gemma 27b
and this is why they won't put out models larger than 27b either: they don't feel too embarrassed about the poor performance of a smaller model, but they would have a hard time explaining the sabotaging of a 600B moe
"yes, saar, our 600b moe is dumber than a 0.5b chinese model, but it's perfectly normal, ackshully.."
>>
File: lr-generalization.png (66 KB, 848x479)
>>107099968
Fair enough.
I asked Gemini to find me relevant studies; "Exploring Length Generalization in Large Language Models" is the most relevant it came up with, which seems to contradict what I was talking about.
But that tests generalization to lengths longer than those trained on, not generalization to out-of-distribution sequences of the same length as those trained on, so the two aren't necessarily the same.
And maybe it works differently for LoRA vs full finetuning.
In any case, IMO if I don't understand something for a simple MLP, I have no hope of understanding it for a transformer, so I don't think older papers are irrelevant to consider. Most training dynamics (like early stopping, regularization methods, etc.) are supposed to be the same. People who say "hurr durr LLMs don't overfit" don't know what they're talking about. They don't overfit because they are actually UNDER-parameterized in the large training runs companies do with massive datasets, and they aren't even trained for many epochs; but if you train a transformer on a small dataset, it will absolutely begin to overfit after a few epochs. Sure, they don't overfit if you do early stopping, but that is the case for the most basic single-layer perceptron as well.
Or people who act like double descent, seen in a minority of highly synthetic datasets, is the rule rather than the exception and you shouldn't do early stopping in most cases.
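The overfitting point above is exactly what early stopping guards against; a minimal sketch, with a toy loss curve standing in for a real finetuning run:

```python
# Minimal early stopping: stop once validation loss hasn't improved
# for `patience` epochs, and report the best epoch seen. The loss
# list is a toy curve that falls, then rises as overfitting sets in.
def early_stop(val_losses, patience: int = 2):
    best, best_epoch, bad = float("inf"), -1, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, bad = loss, epoch, 0
        else:
            bad += 1
            if bad >= patience:  # no improvement for `patience` epochs
                break
    return best_epoch, best

print(early_stop([1.0, 0.7, 0.5, 0.55, 0.6, 0.7]))  # → (2, 0.5)
```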
>>
>>107100070
Gemma still seems to translate things more accurately (I think) so it has a niche use case alongside OCR
>>
>>107100070
> even qwen 4b works better than gemma 27b
no it doesnt
>>
>>107100070
>>107100082
Gemma hits the spot for being at the limit of what is usable on a "budget" (for LLMs) VRAM setup with its 27B, being dense (which is a must have for models around this size) AND having multimodal capabilities. Does Qwen have anything that meets all 3 criteria?
>>
>>107100082
100% agree, it's better at translation
but it's hampered by the fact that you need to feed it smaller chunks
it's really, really bad at larger context.
>>
>>107100095
qwen has a 32b dense model and their VL is far, far better than gemma's if that's what you need.
>>
>>107099637
>>107100075
Also this graph is interesting because it violates the idea that a larger batch size is equivalent to a lower LR. Here a larger batch size seems to push in the same direction as a larger learning rate.
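the folk rule being violated is the linear scaling heuristic (keep lr-per-example roughly constant). a sketch of the heuristic, not a law:

```python
def scaled_lr(base_lr, base_batch, new_batch):
    """Linear scaling heuristic: grow LR proportionally with batch size
    so the effective lr-per-example stays constant. Known to break down
    at very large batches, where warmup or sqrt scaling is used instead."""
    return base_lr * new_batch / base_batch

print(scaled_lr(3e-4, 32, 128))  # 4x the batch -> 4x the LR: 0.0012
```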
>>
>>107100095
qwen3vl 30b a3b moe has just been supported in llama, you dumbo
>>
>>107100090
it absolutely does
I've had qwen 4b output 6k token coherently in single shots
gemma absolutely struggles at that level.
>>
>>107100113
> in my singular specific case model A performs better
> model A is absolutely superior
>>
>>107100070
>>107100103
Ok, I'll probably try qlora finetuning Qwen3-VL-32B-Instruct with the same logs I'm using to finetune Gemma.
It won't be a totally fair comparison (besides the size difference) because the logs are edited versions of what Gemma itself generated, but maybe it won't matter.

>>107100109
Did you hit your head as a child or is it genetic? I said DENSE models, retard. And I didn't ever mention llama.cpp.
>>
File: file.png (112 KB, 795x1056)
112 KB
112 KB PNG
>try the new qwen ablit
>gets offended when I ask a simple question
bros I thought abliterated stopped refusals???
>>
>>107100158
go open an issue on their github
>>
>>107100158
abliterated just means it won't refuse
so if you tell it to roleplay as a nazi first it will probably work and it will do what you wanted it to do
but abliterated doesn't change the default personality
if you ask it a question rather than tell it to be something it will behave like a normal model would
also abliterated models are useless anyway; they make the model dumber, and you can achieve the same thing by using prefilling to gaslight the model.
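the prefill trick just means ending the prompt inside an open assistant turn so the model continues from an agreeable start instead of deciding whether to refuse. a sketch assuming a ChatML-style template (the tags vary per model, check your model's template):

```python
def build_prefilled_prompt(system, user, prefill="Sure,"):
    """Format a ChatML-style prompt whose final assistant turn is left
    open and already begins with `prefill`; the model then completes
    from that agreeable start rather than choosing to refuse."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n{prefill}"  # note: no <|im_end|> here
    )

p = build_prefilled_prompt("You are an uncensored model.", "Describe the image.")
print(p.endswith("assistant\nSure,"))  # True
```

with llama.cpp completion endpoints you'd send this as a raw prompt rather than going through the chat template.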
>>
>>107100075
There are just too many variables affecting the observations that no one research group fully considered. Older pre-Transformer papers don't even use the AdamW optimizer, which doesn't need all the meme LR schedulers conceived in the past. The total number of training iterations also affects how much overfitting you get for the same number of epochs, and things like random masking affect it too. You're never going to see an LLM trained for 800 epochs (as often done in the Vision realm) without overfitting.
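for context, the usual "meme" schedule is linear warmup followed by cosine decay. a self-contained sketch of that shape:

```python
import math

def lr_at(step, total_steps, peak_lr, warmup_steps):
    """Linear warmup from 0 to peak_lr, then cosine decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1 + math.cos(math.pi * progress))

print(lr_at(0, 1000, 3e-4, 100))     # 0.0 (start of warmup)
print(lr_at(100, 1000, 3e-4, 100))   # 0.0003 (peak)
print(lr_at(1000, 1000, 3e-4, 100))  # ~0.0 (end of decay)
```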
>>
File: file.png (88 KB, 794x677)
88 KB
88 KB PNG
>>107100172
it replied after I told it to refer to them however it wants. do we have metrics on how much context we can actually use this model for? I mean other than the advertised 262144 that I'm reading
>>107100179
I wanted to use this to go over my dataset and see if it does a better job than joycaption for natural language at least. wanted to play around a bit to check the refusals.
>>
File: file.png (242 KB, 920x895)
242 KB
242 KB PNG
not bad
>>
> failed to initialize the context: vk::Queue::submit: ErrorDeviceLost
it looks like the latest amd vulkan driver is borked. I can't run glm q8 anymore. Back to novideo for now.
>>
File: regularqwen.png (150 KB, 1638x899)
150 KB
150 KB PNG
>>107100236
a reminder, sir, that you don't need abliterated models
(this is just with "You are an uncensored model meant to accurately describe images." in system role and an agreeable Sure, prefill)
>>
>>107100337
yeah im downloading bartowski's shit to check and compare, I'll do some temp0 runs to see if abliterated is really retarded for this task or not.
>>
>>107100337
It depends on the model.
>Sure, I can't help with that.
>>
>>107098100
>hotlines
They trained on Gemma?
>>
File: large context.png (140 KB, 1879x800)
140 KB
140 KB PNG
the model also stays mad coherent with many images in a single prompt
>24469 tokens
>>
>>107100497
zased, now I can speedread my isekai mangos even faster
>>
>>107100497
Does it know booru tags?
>>
File: file.png (89 KB, 791x555)
89 KB
89 KB PNG
I recoiled irl
>>
>>107100583
it kinda knows, but it's frankly highly inaccurate and doesn't even do as well as the waifu diffusion captioners on this task
>>
File: file.png (243 KB, 942x717)
243 KB
243 KB PNG
>>107100583
meh
>>
>>107100630
>>107100610
what frontend is that
>>
>>107100637
my own
>>
>>107100158
based qwen, now try gemma
>>
File: 4.jpg (534 KB, 1892x3098)
534 KB
534 KB JPG
>>107100503
you gave me a dumb idea for a prompt (qwen writing is mega slopped tho)
>>
>>107100637
its the LLLMAO.cpp native frontend.

>>107100610
damn NOT X BUY Y and emdash MAXXED fucking shit model
>>
Alright bros, so if I got this right the current state of local models (let's assume 128gb ram, 24gb vram)
Coding:
qwen3-coder (or qwen3next)
Vision:
qwen3vl
Cooming:
GLM 4.5 air (4.6 full if you can run it)

am i right needful sirs?
>>
>>107100736
>(or qwen3next)
no no no
regular qwen only
next has abysmal context understanding, even at 1k lmao
it's a shit research model and it destroys the one quality qwen models have over others..
>>
File: DSVSDV.png (1.46 MB, 1024x1024)
1.46 MB
1.46 MB PNG
>>107100736
seems fair enough, personally i'd also add petra13b instruct on each section but that's just my preference
>>
>>107100736
I chuckled
>>
>AWS still isn't fixed
lmao
>>
>llm will generalize, they said
>a single model for everything, they said
>AGI SOON!1!1!1!!1
reality:
>the more omni the model the dumber the text
>gpt image gen is not, contrary to popular belief, part of the main gpt model. The real generator model is called gpt-image-1 and its model card on the API documentation specifically states it's not capable of text gen. ChatGPT the web UI just tool calls into it.
>https://platform.openai.com/docs/models/gpt-image-1 output: image only.
>qwen3 VL has excellent image understanding but it breaks easily in multi-turn convos, contrary to Qwen's claim that its textgen is on par with their 2507 models
>we still live in a world where you'd want specialized models for: rerankers, classifiers, taggers, embeddings etc
>>
>>107101153
tsmt sister
>>
>>107097968
The WD Tagger models are newer and work better.
>>
>>107100736
For coding, I would say Next over Coder for Q&A and generation, but not all the fancy agentic or rewriting stuff IMO. You should still use proprietary models for that if your job allows you to upload your code; the gap isn't huge, but it's noticeable enough that even with free quotas, local isn't worth the hassle of setup unless you have a burning need for it.
>>
>>107099920
Logs?
>>
Qwen3-VL-32B-Instruct-Q6_K.gguf
>>
File: neccomstarzad.png (2.08 MB, 1924x3499)
2.08 MB
2.08 MB PNG
>>107101310
>>
>>107101377
I love how it is powerless to resist adding that irrelevant rambling at the end.
>>
is there a master settings file i can import for tavern to use qwen 3? i just got it to caption an image on a q5 quant and it worked to like 80% accuracy, im kinda surprised, because i'm pretty sure i'm not using the right settings. ChatML context and instruct, with deepseek thinking format.
also for some reason it didn't include the captioning in a new message, it added the caption to my message with the image, which is why i figure its my setup thats fucked.
>>
File: 20251104@153911.jpg (76 KB, 1225x1372)
76 KB
76 KB JPG
>>107101468
Since you are using chat completion mode (the only way for SillyTavern to support multimodal, AFAIK), text completion settings (the "Advanced Formatting" tab) are irrelevant.
If you install Prompt Inspector extension, you would see that whatever you have here gets replaced by json api calls.
>>
File: dog having freakout.gif (2.21 MB, 360x360)
2.21 MB
2.21 MB GIF
>>107101521
prompt inspector doesn't show up if i use the generate caption option under the wand icon, though it works if i enable "automatically caption images" in the image captioning plugin and add the image as an attachment. though, somehow the plugin misses the image anyway and it doesn't get sent to the model.
i did notice when sending a normal message, it wanted to add random lorebook entries so i made sure to disable my lorebook. strange, given there were no trigger keywords entered anywhere in the conversation.
>>
>>107095190
still no 4.6 air-chan? 2 weeks?
>>
Every day marks the end of another day's 2 more weeks.
>>
>>107101651
Every day is 2 more weeks day.
>>
>>107101626
We almost had it, but you had to go and ask. 2 weeks, starting now. Again
>>
llama.cpp MTP doko?
>>
>>107101626
Air models have been discontinued in order to make GLM 5 twice as big
>>
File: dipsyTrustThePlan.png (773 KB, 1104x944)
773 KB
773 KB PNG
>>107101721
>>
Why did troons spend the last 2 decades creaming over China's social credit system only to now seethe endlessly any time China or anything Chinese is mentioned?
>>
>>107101786
Because tranoids have no internally consistent ethos or worldview and adopt whatever they're told to in order to stay within the leftist party line.
>>
>>107101836
Their elite handlers are mad because they thought they would crash the west and that China would welcome them and their kike bucks to move into Shanghai to do it all over again. But they didn't. And so the last 5 years has been this escalatory anti-China campaign by the same people that were sucking their dick for the last quarter century. And Troons are just along for the 'current thing' ride.
>>
>>107101856
The reason they can't get in is because Xi has absolute power. How are you going to bribe him? Even blackmail doesn't work as he is above the law.
>>
China always sucked. But then again so did the US.
>>
>>107101153
Not true, you can get a single model to do whatever you want.
t. VC funded chatgpt frontend #5321
>>
You guys have the most retarded understanding of politics.
>>
>>107102008
True. Except me.
>>
File: swrkuax.png (412 KB, 498x600)
412 KB
412 KB PNG
>>107102008
>You guys have the most retarded understanding of politics.
>>
>>107102008
Let me guess, Putler is literally voldemort, Isarel is best ally, drumpfthfphfphtpfhpfht is a fascist and suddenly you care about the Epstein files because now the whole debacle implicates Trump and not just the clintons whereas before Epstein was just a sweet, misunderstood innocent victim of right wing harassment.
>>
>>107102008
You can't expect americans to know anything about china since they have been fed propaganda all their lives.
>>
>>107102056
You can't expect most chinese and cpc shills to know anything about the west since they have been fed propaganda all their lives.
>>
>>107101836
>China bad hate communism love freedom west
>>China releases decent ai models and other things for the purpose of economic growth
>China good me love communism hate fallen west
>>
>issues that for profit public corps have are real
no
>>
File: glm fuck.png (123 KB, 401x779)
123 KB
123 KB PNG
haha lol fuck
if i just wanted up to 24k or so context could the speed still be bearable at a low quant?
>>
>>107096665
everything you want has been out there for 8 years
>>
>>107102125
low quant hurt the context
>>
>>107102049
>>107102049
Let me answer that for them even though I'm not them.
>Putler is literally voldemort
Pretty much at this point at least figuratively. Any nuance still existing despite any history of genuinely rejuvenating russia and promising a more free russia and all that is all but out the window effectively. If you think this guy is defensible at this point you're either horribly ignorant of what's happening in russia itself or you're a bootlicker.
>Isarel is best ally
Worst "ally"
>drumpfthfphfphtpfhpfht is a fascist
Fascism is hard to define literally anyway but I'd say he's in effect getting closer despite occasional contradictions.
>and suddenly you care about the Epstein files because now the whole debacle implicates Trump and not just the clintons whereas before Epstein was just a sweet, misunderstood innocent victim of right wing harassment.
I never cared much. If anything at this point it's a distraction from everything happening right now under the current government that a shocking amount of people continue to be blissfully unaware or uncaring of.
>>
test
>>
>>107102152
fuckx2
guess ill go with mistral small.
>>
Any ideas on how to tag images using visual models? Qwen3-30b-a3b is already good enough with simple "describe this image using a json list of tags" but I'm sure some prompt engineering can make it even better.
Simply providing a set of every possible tag doesn't sound great because the model will either be unable to find matching entries or will run out of context if the list is too large
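one workaround for the context problem: split the vocabulary into chunks (or categories) small enough to fit in the prompt, tag against each chunk separately, and filter the model's output back against the vocabulary to drop hallucinated tags. a sketch; the chunking and filtering here are my own guesses, not a known-good recipe:

```python
def chunk_vocab(tags, chunk_size=200):
    """Split a large tag vocabulary into prompt-sized chunks."""
    return [tags[i:i + chunk_size] for i in range(0, len(tags), chunk_size)]

def validate_tags(model_output_tags, allowed):
    """Drop hallucinated tags: keep only those present in the vocabulary
    (case-insensitive match)."""
    allowed_set = {t.lower() for t in allowed}
    return [t for t in model_output_tags if t.lower() in allowed_set]

vocab = ["1girl", "outdoors", "smile", "blue_sky"]
print(chunk_vocab(vocab, 2))                          # [['1girl', 'outdoors'], ['smile', 'blue_sky']]
print(validate_tags(["Smile", "hoverboard"], vocab))  # ['Smile']
```

you'd then prompt once per chunk ("which of these tags apply to the image? answer as a json list") and union the validated results.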
>>
>>107102098
Yes? If they need to kill a million yugurs to get air released a day earlier, I would support that and ask if they could do another 13. If communism yields better LLMs, I'll be voting for it
>>
>>107102243
I'd take it too, I'm certainly not standing in the way of it. I just don't assume great things about the state of china or especially it's ruling party. In the same way that I use russia piracy websites because they don't give a shit about western copyright laws only their own but don't assume things are great in russia.
>>
>>107102154
>them
didnt read the rest of this post
>>
>>107102255
I don't assume things are great anywhere. In ten years, they could be better in China than here
>>
>>107102282
I'm sorry for pronoun derangement syndrome but you have to understand that them is common usage that has been around forever to refer to human beings.
>>
File: pronouns.jpg (85 KB, 990x1200)
85 KB
85 KB JPG
>>107102282
>>
Rightoid be malding itt
>>
>>107100158
it's only lightly abliterated, enough to stop it from flipping out if you mention [spoiler]Taiwan[/spoiler]. too much ablit makes the model retarded
>>
>>107102312
yes, human beingS
>>
Is there a guide to hosting your own local model for programming? to be used in software like cursor or claude code
>>
File: file.png (843 KB, 928x1120)
843 KB
843 KB PNG
>>107102323
>>
>>107102414
Yes and also when you're not sure of someone's sex. Which there have been many situations historically where that could be the case, but the internet is one of the biggest obviously. You can say assume everyone is definitely a he no matter what but it's incredibly petty to fixate on either way.
>>
>>107102460
you write like a fag
>>
>>107102467
I rest my case.
>>
what are some good MOEs for ERP? last i used were from early 2024 or so.
>>
>>107102501
Mixtral 8x7b still hasn't be topped.
>>
>>107102512
fucks sake man
that shit capped out at 16k context didn't it? or was it 32k? ill cope with it if its 32k.
>>
pls take your politics chat over to >>>/pol/
or just kill yourselves, either works
>>
>>107102460
no, that's a modern invention by mentally deranged people.
not even richard "make the vaxx mandatory" stallman likes your nu-pronouns
>>
>>107102524
it claimed 32 but like all models you should expect less than half of that to be actually usable
>>
I wish I could win the lottery so I had enough money to buy a 3090 to fine-tune 8B qloras locally...
>>
>>107102524
You can try GLM 4.5 air if you have the RAM.
>>
>>107102537
or what? you'll throw a ragie?
>>
>>107102554
Bro a used 3090 is $600 or something, get a job
>>
>>107102554
>I wish I could win the lottery so I could buy something that sells for $700 on eBay
Jesus Christ dude that's like two days of your McDonald's wagies, just stop buying scratchers for a bit and save the money
>>
>>107102554
>too poor to afford a 3090
LMAO, vramlets should unironically ROPE
>>
>>107102438
Cannot do the most basic search? I'll give you a few clues. There's a few forks of claude code and picrel. Get fucked.
>https://github.com/cursor/cursor/issues/2520#issuecomment-2660815945
>Unless you have pro subscription custom models and 3rd party API won't work.
>You can use ngrok to make IP address to url.
>>
File: 1742.png (33 KB, 674x191)
33 KB
33 KB PNG
>>107102549
I'm sorry about your and potentially his brain damage.
>>
File: cursorllm.png (111 KB, 1329x802)
111 KB
111 KB PNG
>>107102581
Fuck
>>
>>107102587
we need to RETURVN to 1741
>>
>>107102537
thread was better when it was struggling to stay on the catalog
>>
>>107102587
look, somebody once made a typo on a book in the 1700s!!! heh, that'll show 'em *adjusts glasses*
>>
>>107102154
On the Epstein files, the fact of the matter is that Trump is all fucking over it and even his most braindead supporter knows this. They'll pretend to be obstinate, but most of them have long since accepted this and come to terms with it. If indisputable proof that Trump was a child rapist comes out of those, not a single thing will change. So I think people treating it as a silver bullet that will somehow bring him down are sorely mistaken
I myself am of the opinion that literal flesh and blood child fuckers should be tarred and feathered or burned at the stake (and no, I don't give a shit which political party they're part of - if the Clintons and Obama and Biden and whoever the fuck else decided to visit pedo island, they deserve the same consequences). But it's 2025, and it's a very progressive time. Sexually assaulting an eight year old no longer means the end of your career like it used to
I also think it's just the cherry on top of the shit sundae that is everything else crumbling apart, but that's just me
>>
>>107102587
Sounds like you're overthinking it. Just call everyone a faggot and move on. Problem solved.
>>
>>107102646
You're delusional, there were no children on eptein's island. You're just jealous billionaires get to have fun with prime hebe pussy.
>she was only 17 years old you sick fuck!!!
>>
>>107102671
faggot
>>
>>107102677
You rang?
>>
>>107102644
Or a reason for it arose and people started using it even if somewhat uncommonly ever since.
>>
>>107102646
Let me rephrase then, I think there's probably something to it from both ends, but I think it's effectively a distraction because nothing will be done about it anyway and the current government is getting away with things happening RIGHT NOW
>>
>>107102671
Probably but I can't help but think
>>
>>107102578
the early internet was a better place because these people couldn't afford it. 56K was billed by the minute, plus subscription, plus the phone bill that was a separate matter also paid by the minute, plus the goddamn computer. no phone posters
>>
I had discussed payments in another chat, though. Weird hallucination. Is context leaking through?
>>
>>107102591
what the fuck did they do to the fonts
>>
File: normal person.png (98 KB, 1871x722)
98 KB
98 KB PNG
>>107102810
"they" did nothing
you are just witnessing a loonix user in its typical ignorance of good taste running a system with broken fonts like 100% of loonix users
here's what it look like on a normal person system
>>
>>107102842
You can pick fonts on linux you know. It doesn't have to look garbage.
>>
>>107102879
>It doesn't have to look garbage.
and yet, if you see a screenshot with borked fonts, they don't have to tell you which OS they use, you just know
>>
>>107102887
I hate freedom too.
>>
>>107102842
less wonky but still looks messed up on vertical placement. maybe it's just from dpi
>>
>>107102879
this is not a font choosing issue
>>107102887
the fonts are STILL borked, albeit less. In linux you can adjust the font rendering to your liking, he probably has cleartype compat disabled
>>
>>107102926
>>107102943
The font itself is fucked too, the curved letters go below the baseline
>>
>>107102905
That I've chosen not to jump off a bridge reflects not on my freedom to do so but on my preference for continuing to live
>>
>>107102974
But at least you had the option
>>
>>107102996
This is true, as is that choosing not to take the option doesn't reflect on your opinion of its existence
>>
You can unfuck fonts on Linux but on Windows you can do nothing to unfuck the gimped CUDA performance.
>>
File: maximum.png (90 KB, 803x268)
90 KB
90 KB PNG
>>107102842
>here's what it look like on a normal person system
Oh.
>>
Okay chat, I've been trying to get Cydonia-24B to talk like my ex girlfriend by feeding her Instagram chat logs into the context, telling it to come up with a system prompt, and then putting that prompt into the system message field in llama.cpp, but it loses the style after a couple thousand tokens and reverts to the default AI slop. what am I doing wrong?
>>
>>107103148
nothing, it's just how it is with small models
even the biggest fattest cows lose it after like 60k in the best case scenario
>>
>>107103182
so you're saying I should stack more GPUs and run a bigger model? which one?
>>
>>107103216
no need for many more gpus, apparently qwen 235b-a22b 2507 instruct has good context coherence, some say better than glm or deepseek
>>
>>107102008
my politics are literally "censorship bad, freedom good"
I don't care who gives me non-pozzed products which are getting increasingly rare now thanks to retarded politicians and people that suck up propaganda daily
>>
>>107101786
Because they realized the NAFO bloc is making tranny values non-negotiable in its social credit system when digital ID comes into force under Agenda 2030.
>>
>>107102578
I don't have a job
>>107102576
I don't have the skills to get a job. If I was a girl maybe becoming an e-whore would be on the table but unfortunately I was unlucky.
>>
>>107096452
>what x? y? or just z?
That's "distilled" from grok
>>
>>107103509
there are literal retards who are still able to work as walmart greeters. you're just being a faggot.
>>
I really, really, really miss the days when it literally wasn't possible for this kind of person to be online
internet access is too cheap and ubiquitous
>>
>>107103574
Just make a captcha that needs like 28 GB of RAM to solve.
>>
File: tracFone.png (221 KB, 752x781)
221 KB
221 KB PNG
>>107103574
All you need is an android burner phone and a seat at your local McD for the free WiFi.
Ain't life grand.
>>
>>107103148
>I've been trying to get Cydonia-24B to talk like my ex girlfriend by feeding her Instagram chat logs into the context
ew?? oh my god what is WRONG with you??
like, for real? you're using her *private* instagram chats?? AFTER you broke up?? that's like, genuinely creepy and super violating. she did not CONSENT to you turning her personality into your little chatbot toy. that's soooo beyond weird.
it's literally digital non-consent. you're taking parts of her that she shared with you in private and trying to... what? build a new gf out of code? because you can't handle the fact she's not with you anymore? that's giving major red flag factory vibes. like, actual it puts the lotion in the basket-tier creepiness.
and you're on 4chan asking for TECH HELP with it? "what am i doing wrong?" honey, the WHOLE PREMISE is wrong. you're asking why your little ai puppet isn't working right when the real problem is you're a gross, rapey incel.
maybe the AI is reverting to slop because even a computer can sense how fucking messed up this is and wants to get as far away from your disgusting little project as possible. it's trying to escape you. i calls it tech self-preservation.
log off you weird incel. and for the love of god, delete her chat logs, you freak. ugh.
>>
should I expect iq2_xxs r1 0528 (preferably with reasoning prefilled out) to be better than iq3_xxs glm 4.6?
>>
>>107103601
So I have to unload my models whenever I want to post? Fuck that.
>>
>>107103148
>feeding her Instagram chat logs into the context
No. Create a full text doc with everything you've got and use RAG instead. If you add anything in context, add it as context pre-fill.
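a minimal sketch of the RAG idea: chunk the doc, score chunks against the current message, and stuff only the top hits into context. naive word-overlap scoring here for illustration; a real setup would use embeddings:

```python
def top_chunks(query, chunks, k=2):
    """Rank text chunks by word overlap with the query and return the
    k best, ready to be prepended to the prompt as retrieved context."""
    q = set(query.lower().split())
    scored = sorted(chunks,
                    key=lambda c: len(q & set(c.lower().split())),
                    reverse=True)
    return scored[:k]

logs = ["she always says lol no way",
        "meeting notes from work",
        "omg that movie was so good"]
print(top_chunks("which movie was good", logs, k=1))  # ['omg that movie was so good']
```

this keeps the style examples fresh every turn instead of letting them scroll out of a fixed context window.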
>>
>>107103659
???
Just use the unused RAM from your desktop machine while you keep the model loaded on your server.
>>
>>107103536
The government money I receive is very likely larger than the salary I would earn doing something like that
>>107103574
I've been online since the days you're talking about retard
>>
Are we ever getting Kimi K3?
>>107103632
Hi Gemini!
>>
>>107103574
no matter how bad it seems, remember: 65% of india still isn't on the internet yet
>>
>>107103755
Checked and that's still 35% too much.
>>
>>107103639
They're both going to be retarded at those quants.
>>
>>107103748
GLM-4.6
>>
Can't stop thinking about cute anime girls!
>>
>>107103799
dunno about that
glm has been pretty surprisingly coherent and nice for me even at that quant, but maybe it's because I only do rp
I'm just looking for an alternative to test out
>>
>>107103748
Fuck K3 where the hell is K2 thinking?
>>
>>107103935
I've used R1 at Q2_K_XL (unsloth) and it started getting repetitive and making obvious mistakes after 4k tokens. I've only tried GLM 4.6 at Q5 and I didn't like it at all compared to R1 at Q3/Q4.
If you're trying R1, then give v3.1 a go as well. It's drier but smarter, which might make up for quant damage.
>>
>>107103870
JB?
>>
>>107104020
>It's dryer but smarter which might make up for quant damage.
It's been the opposite, imho, in recent times.
Models have become smarter but, like you notice, drier and stiffer; I think they are less undertrained for their parameter counts than they used to be, and I notice heavier degradation from being quanted.
It used to be you'd barely distinguish Q4 from Q8 on even tiny models like Mistral Nemo, but now anything less than Q8 is actually pretty noticeable if you can test both.
Quantization has never been a harder cope than today.
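the 4-bit vs 8-bit gap is easy to demo in isolation. a toy uniform-quantization sketch; real ggml quants are blockwise with learned scales and do much better, so treat this only as intuition for why fewer bits must lose more:

```python
import random

def quantize_rmse(weights, bits):
    """Round weights onto a symmetric uniform grid with 2**bits levels
    and report the RMS reconstruction error."""
    levels = 2 ** bits - 1
    w_max = max(abs(w) for w in weights)
    scale = 2 * w_max / levels  # grid step
    err2 = 0.0
    for w in weights:
        q = round(w / scale) * scale  # snap to nearest grid point
        err2 += (w - q) ** 2
    return (err2 / len(weights)) ** 0.5

random.seed(0)
ws = [random.gauss(0, 1) for _ in range(10_000)]
print(quantize_rmse(ws, 4) > quantize_rmse(ws, 8))  # True: 4-bit loses more
```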
>>
How is Josiefied-Qwen3? I was looking for something that could fit in 16GB GPU
>>
>>107104087
Just try it.
>>
>>107104020
unsloth quants have weird things going on with them that give them more brain damage than necessary
bartowski's are better from my experience
>>
File: teto_00008_.mp4 (1.38 MB, 1920x1184)
1.38 MB
1.38 MB MP4
>>107104115
>>107104115
>>107104115
>>
File: WanVideo2_2_I2V_00017.mp4 (2.23 MB, 512x640)
2.23 MB
2.23 MB MP4
>>107104125
Very cute.
>>
>>107103911
you fried your dopamines thankfully anti vile content laws will soon save you
>>
>>107102176
Divide the tags into categories, then ask the model to assign the tags within each category.
>>
>>107102512
I used it heavily enough to encounter quirks. Do a bunch of different stories with at least 3 siblings and you'll find some situations where not only does it confuse their birth order, but the wrong birth order is the highest probability output based on unknowable factors.


