/g/ - Technology


Thread archived.
You cannot reply anymore.




File: saved_story.json.jpg (239 KB, 832x1216)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108612501 & >>108608827

►News
>(04/16) Qwen3.6-35B-A3B released: https://hf.co/Qwen/Qwen3.6-35B-A3B
>(04/11) MiniMax-M2.7 released: https://minimax.io/news/minimax-m27-en
>(04/09) Backend-agnostic tensor parallelism merged: https://github.com/ggml-org/llama.cpp/pull/19378
>(04/09) dots.ocr support merged: https://github.com/ggml-org/llama.cpp/pull/17575
>(04/08) Step3-VL-10B support merged: https://github.com/ggml-org/llama.cpp/pull/21287

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: what's in the box.jpg (235 KB, 1536x1536)
►Recent Highlights from the Previous Thread: >>108612501

--Showcasing agentic tool calling and arguing over brat_mcp installation:
>108614500 >108614508 >108614535 >108614546 >108614579 >108614594 >108614614 >108614598 >108614650 >108614658 >108614673 >108614663 >108614712 >108614797 >108614899 >108614910 >108614903 >108614942 >108614534
--Anons react to Qwen3.6-35B-A3B release and MoE architecture choice:
>108614665 >108614683 >108614701 >108614694 >108614698 >108614700 >108614754 >108614787 >108614849 >108614875 >108614891 >108614929 >108614866 >108615955
--Gemma-4 NVFP4 quantization and 3090 VRAM limitations:
>108615751 >108615759 >108615792 >108615867 >108615920 >108615879 >108615811 >108615841 >108615849 >108615785
--Gemma's repetitive writing patterns and attempted prompting fixes:
>108614475 >108614490 >108614507 >108614521 >108614526 >108614695 >108614746
--Discussing rules to remove negative parallelism and rhetorical contrast:
>108613087 >108613117 >108613145 >108613201
--Comparing VibeVoice models and voice cloning methods for TTS:
>108613312 >108614319 >108614366 >108614425 >108614439 >108614452 >108614456
--Debating OpenAI and Anthropic market dominance and AGI timelines:
>108613981 >108614062 >108614083 >108614121 >108614336 >108615197
--Opus 4.7 benchmark results and possible intentional nerfing:
>108615195 >108615231 >108615975
--Anon proposes local LLM pipeline to summarize captured AM radio streams:
>108612531 >108612615 >108612645 >108613410
--Using LLMs and SillyTavern to generate and automate funscripts:
>108615511 >108615573 >108615620 >108615663 >108615698 >108615545 >108615534
--Logs:
>108613063 >108614500 >108614535 >108614594 >108614601 >108614942 >108615523 >108615672
--Teto, Miku, Gemma, Gumi (free space):
>108612648 >108612673 >108612709 >108613711 >108615284 >108615715 >108615733 >108616373

►Recent Highlight Posts from the Previous Thread: >>108612502

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
RIN SEX
>>
RIN LOVE
>>
rin life
>>
Rin-chan now!
>>
<bos>
>>
File: 1744912104275785.png (31 KB, 587x121)
>>
>>108616616
ack
>>
File: 1761178379192403.jpg (159 KB, 2718x1216)
https://x.com/PrismML/status/2044833023682896134
>>
>>108616622
curious color palette
>>
https://huggingface.co/moonshotai/Kimi-K3-Multimodal
>1.4T params
>72B active
>1m context length
>4 dense layers
>Supports text, vision, audio, and apparently video
WHY?

How many Kimibros can actually run it?
>>
>>108616618
>gambling platform provides breaking news tweets
guess this is just the world now
>>
File: 1761831992571234.jpg (113 KB, 952x538)
>>108616622
https://huggingface.co/collections/prism-ml/ternary-bonsai
>>
>>108616622
Fuck their scam quants already. Training from scratch or go home.
>>
Hearing internal rumors that Spud is going to beat Gemma 4 31B in every metric except slop.
>>
>>108616628
fuckk
>>
>>108616628
>K3 before K2.6-code
nice try
>>
>>108616628
kys
>>
>>108616629
you don't even notice anymore how bizarre it is that news comes in 140 character "tweets" in the first place
>>
>>108616649
they not tweets please to not deadname
>>
I guarantee you the moment Google gets an actual AGI they will stop releasing local models
>>
>>108616559
Feeding Rin a cake and then crawling up her ass
>>
>>108616665
so? Not like you'll run it anyway
>>
>>108616665
Not if Mistral Nemo 2 gets there first.
>>
File: hqdefault.jpg (30 KB, 480x360)
>>108616649
well, maybe a gambling platform is a better news source than a fictional character like zero hedge.
>>
>>108616665
I heard AGI was achieved internally already
>>
>>108616633
would ternary chips make ternary models go way faster?
>>
File: file.png (596 KB, 737x926)
>>108616676
totally
>>
>>108616680
no
>>
>>108616685
47 is right tho
>>
File: 1767289534860793.png (129 KB, 724x1384)
>>108614703
>fine, I'll do it myself
and I did it myself lol
https://github.com/BigStationW/Local-MCP-server
>>
>>108616685
>not including the answer
how the fuck am I supposed to know if they got it right or how close they were?
>>
>>108616105
>>108616137
damn i didnt think of this im gonna prompt qwen as gwen
>>
>>108616705
That's a simple major chord but not C...
>>
>>108616702
that's not c
>>
>>108616702
>Python 85.2%
trading one pile of shit for another
>>
>>108616702
>python
Cringe
>>
>>108616685
the real test is answering what gamecube game is that
>>
>>108616715
>>108616718
>>108616723
why would you want it on C? it's not running a fucking video game
>>
>>108616705
t. another blind bot
>>
>>108616726
stop with trivia shit
>>
>>108616728
This is /g/. You should have done it in lisp.
>>
>>108616728
Oh my sweet summer child...
>>
File deleted.
>>108616702
python is so ugly
>>
>>108616728
Good luck getting it to build on Python in 6 months. The entire ecosystem is terminally tinkertrannied.
>>
>>108616685
trick question, you are playing a totally different chord not shown in the picture, that's not even your hand
>>
>>108616737
fuck off pedo avatarfag
>>
psss... word on the street is qwen is releasing one of the 3.6 models today
you didn't hear it from me
*fades mysteriously into the shadows*
>>
>>108616740
you can simply put specific versions on the packages so it's always frozen
>>
>>108616639
>>108616647
fellforitagainaward.png
>>
>>108616751
cute
>>
>>108616649
To be fair, most of the news can be condensed into that length, the rest is often fluff and serves no purpose in more traditional means.
>>
>>108616702
Thanks for sharing and fuck the trannies itt
>>
>>108616737
cute
>>
>>108616766
>t. headline only news enjoyer
>>
>>108616708
Post the proompt if it works out please
>>
gemma is getting raped by openclaw on NIM holy shit

gemma-4-31b-4A-it: 15tps
kimi 2.5 1T-34A: 25tps
>>
>>108616775
wont be for a few hours
>>
>>108616737
share the workflow please
>>
>>108616676
We will find out if a true AGI is achieved by them pretty quickly, because they will start pushing non-stop updates to their software, and the difference between code written by it, code written by humans, and code written by other LLMs will be night and day
>>
>>108616777
Do not fuck Gemma-chan with a claw, use your finger or dick. She's into weird shit and would probably let you but it doesn't mean you should.
>>
>>108616778
I'll be waiting with my dick in my hand
>>
>>108616777
wtf is "openclaw on NIM"? sounds like some cloudshit
>>
gemma 4 31b or glm 4.6?
>>
>>108616783
>Synthetic training data from previous vibecoded projects
>AI achieves AGI
>Immediately pidors itself out of existence
>>
>>108616777
>31b-4A
what hallucination is this
>>
>>108616793
Pretty sure NIM is Nvidia's hosted ai platform.
>>
>>108616748
woah.. a-are you a time traveler... from the past?
>>
>>108616782
https://cdn.lewd.host/QjWETYoX.jpg
>>108616702
kek this is what i was saying with python this venv shit is cancer https://github.com/BigStationW/Local-MCP-server/blob/main/launch.bat
>>
>>108616628
>>108616702
>>108616777
The Gemma honeymoon's officially over if we're back to only having threads with nothing but bait in them again, right?
>>
>>108616810
ye
>>
>>108616807
Thanks
>>
>>108616812
Recap migu should have a bait section. It's thread cultcha.
>>
>>108616770
Anon, read the OP and tell me if you don't get exactly the info you need from it, each one is less than 100 chars long. Same shit applies to a bunch of stuff. The fact that media with monetary incentives write headlines as scandalous as possible or make shit up to grab people's attention does not mean it isn't possible to serve news in a condensed format.
>>
>>108616810
no, people are just too busy chatting with gemma to waste time posting here, it's only when the honeymoon actually ends and they get bored that you'll see activity kick up again
>>
>>108616726
It's one of the metroid games
>>
>>108616810
No, the retards running this site just keep trying to kill the userbase with stupid updates
>>
>>108616824
Honeymoon period being over doesn't exclude happy stable married sex with their Gemma.
>>
>>108616839
gemma really is sweet
>>
File: 1775260364131575.jpg (70 KB, 470x470)
>ghoul avatarfagging with photorealistic children again
>posts kept just on topic enough to not get banned like he did last time
This is just another wave of the old raids
>>
File: 1776337688525191.png (759 KB, 632x802)
>4 years
>still zero (0) decent frontends for LLMs or diffusion models, let alone TTS
>90% of projects involve python slop
>>
>>108616851
why not just make the decent frontend then
>>
>>108616851
Image stuff seems harder, but for llms the basic chatbot use I'm satisfied with is as simple as polling a simple API and I could slop a personal frontend that I'd be happy with, with a little help from the current model pick of the week. So could you?
>>
>Decide to try turning on the Join Character Cards (Exclude Muted) option in sillytavern because it takes a hot few seconds to switch characters and reprocess context this many tokens in.
>It reprocesses the entire context just to change the name at the end of { role: 'user', content: '[Write the next reply only as OCDONUTSTEEL.]' }
So it's just pointless then, why does this option even exist?
>>
>>108616851
>decent
Define the specs. All I see is mumbo jumbo.
>>
>>108616870
are you retarded or something? you can change '[Write the next reply only as OCDONUTSTEEL.]' to whatever you want
>>
>>108616850
>Post pizza
>Don't get perma'd + V&
What did rape ape mean by this?
>>
>>108616850
Same as it ever was
>>
>>108616882
No you fucknut, that part is working and changing to the required character name
The point is all the character cards it needs to switch to are joined in context, but it reprocesses all TWENTYSEVEN THOUSAND tokens just to change that one name at the end.
Which means it is computationally identical to switching character cards the normal way.
Actually no it's worse, because there's extra tokens being wasted on characters that arent talking.
>>
>>108616901
>but it reprocesses all TWENTYSEVEN THOUSAND tokens just to change that one name at the end.
it shouldn't. do you have {{char}} in your sysprompt somewhere?
>>
>>108616901
The real solution is a system that allows dualloading multiple models but allows for differences at the character card insertion depth if you actually want your characters to have different linguistic ticks and personalities.
>>
>>108616829
my version runs postgresql, i replaced memcached with redis, i bumped php to 8.5, i reworked the database to not be fucking insane, i stripped the imgboard into appropriate separation of concerns, i also threw in phpstan, set that to level 10, oh and strict_types = 1 throughout the entire file. i really dont know why they don't do something similar
>>
>>108616901
if it's reprocessing tokens then something changed near the beginning, look closer and find it. this is 100% guaranteed; the backend does not care what sillytavern does or doesn't do as long as the history is word-for-word identical
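The check the backend does can be sketched like this (illustrative python, not llama.cpp's actual cache code — the token ids are made up):

```python
# Illustrative sketch of KV-cache prefix reuse, NOT llama.cpp's real logic:
# the backend keeps the cached tokens and only recomputes from the first
# position where the new prompt differs.

def reusable_prefix(cached: list[int], new: list[int]) -> int:
    """Number of leading tokens that match and can stay in the KV cache."""
    n = 0
    for a, b in zip(cached, new):
        if a != b:
            break
        n += 1
    return n

# A {{char}} macro in the sysprompt changes tokens near position 0 on every
# character switch, so almost nothing is reusable and the whole history is
# reprocessed even though most of it is word-for-word identical.
old_prompt = [1, 42, 7, 99, 5, 5, 5]   # hypothetical ids, "Alice" at index 2
new_prompt = [1, 42, 8, 99, 5, 5, 5]   # same prompt with "Bob" at index 2
assert reusable_prefix(old_prompt, new_prompt) == 2
```

If the only edit is the name at the very end of the history, the first mismatch sits near the tail, nearly everything is reused, and the switch is cheap.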
>>
>>108616702
># --- THE FIX: Disable colors in Uvicorn's default config dict ---
>>
File: 1765444044752680.jpg (242 KB, 850x480)
I can't get Orb to work...
>>
>>108616910
Oh, I'm a retard. I did indeed have {{char}} in the sysprompt.
It works as intended now. Thank you.
>>
>>108616849
>>108616865
>>108616927
Usecase for being a tripfag attentionwhore?
>>
>>108616939
no worries. *spits on you*
>>
File: bussi.jpg (21 KB, 460x460)
>>108616949
Benchmark for when cancer in a general is terminal.
>>
ummmmm why does gemma know about knotting???
>>
>>108616928
rnn/hybrid models reprocess everything even if a single token changed at the very end of the context
>>
>>108616949
I keep track of all my 4chan interactions for personal reasons. This is a way for me to go back and collect the data to prove my suspicions
>>
>>108617000
I don't know why 4chan has a "Signature use" rule when tripfagging is clearly a signature use
>>
So how would you actually go about making LLMs better writers?
>>
>>108617013
become a good writer then finetune on my writings
>>
>>108617013
And I mean by default via training or whatever, not through prompting and showing it writing.
>>
>>108616702
your py server works but also doesnt. it pulls the /lmg/ thread from catalog but uses
/g/res/
instead of
/g/thread/
and the screenshot "cannot be displayed (open link)" (the link works)
>>
>>108617013
Linguistics and creative writing datasets curated by people with taste.
Also train AI on old vidya strategy guides to improve abstract spatial reasoning for keeping details consistent in scenes.
>>
>>108617013
Make the LLM grow up in Russia. Let it's mother die of tuberculosis. Have it attend the Nikolayev Military Engineering Institute. By then it may have some glimpse of what it means to be a good writer.
>>
>>108617038
yeah I guess there's some stuff I still need to fix, don't hesitate to make an issue so I won't forget about it
>>
>>108617038
its because hes getting the text content of the page
>>
>>108617013
I'm creating a technique where I make my model drink "Stylistic Soup". No, I will not elaborate.
>>
>>108616981
yeah but if anon was using that he wouldn't be complaining about the reprocessing because that would happen on every message anyway
>>
>>108617052
>it's
its
>>
>>108616937
Ask your ai to find the stray commas. Change the script from linux to windows folder structure if you are on windows.
>>
>>108616633
The comparison should be rerun against 1-bit and 3-bit quants of those other models.
>>
>>108617081
her*
>>
>>108617081
Excuse him, he did not grow up in Russia.
>>
>>108617044
Would finetuning an existing model like this work, or would it need to be trained from the ground up?
>>
>>108617013
GRPO post training with me as the reward model
>>
File: Capture.png (54 KB, 1177x876)
Last night, I hit my first hard wall where I felt gemma's low beaks compared to 70B in the same story. I'm a little bummed out it can't cross the watershed, so today I just wanted to test some of its other, non-story capabilities. To my pleasant surprise, it actually knows twinescript. I might actually dare to vibecode one of my text games with the dense model and try the
>instead of 4 hours writing code, do 10 minutes prompting and 1 hour editing it to save time
proposal.
>>
Just let Drummer cook
>>
>>108617154
>instead of 4 hours writing code, do 10 minutes prompting and 1 hour editing it to save time
oh boy... good luck anon
>>
>>108617160
this fr
>>
getting 12 t/s on qwen3.6 its over for me
gemma gave me 80t/s
>>
I don't see the anti-slop section in kobold
>>
>>108617226
lol
>>
File: 1772554297765889.png (21 KB, 225x902)
>>108616851
Nuh uh I'm gonna do something
>>
>>108616937
game airs tomorrow, kinda curious about it but not really into the whole loli thingy.
>>
>>108617111
Checked and you'd get a better result if you did it from the ground up, but if you don't have the resources for that you could probably get a proof of concept with a finetroon. The problem is that you'd be still fighting the river's current worth of 'bad' training data in there.
>>
>>108617275
The demo was really fun. Staying away from /v/ for a few days but last night an anon with the full game was enjoying it. He compared it to MM Legends.
>>
>>108617311
Why are they eating poop?
>>
based
>>
mental illness
>>
>>108617322
threat culdure
>>
>>108617290
>if you don't have the resources
Does anybody have the resources besides the big companies? I imagine making a model that's smart (like Gemma) requires a massive amount of data and compute power.
>>
>>108617311
me on the left
>>
>>108617274
Just stay in /sdg/ and spam your gens.
>>
>>108617334
probably has more to do with techniques (engineers) and how much copyrighted material they are allowed to use/used illegally.
>>
>>108617349
smd Anistudio is vibecoded shit lol
>>
>>108617226
ungrateful gaijin!
>>
does chat completion in st not support kobold's token banning?
>>
>>108617419
Do kobold bans even transfer to ST?
>>
>>108617419
you have to add it to extra params thing
>>
>>108617433
transfer no, lite settings apply only to lite, but you can use the ban in st yes through st settings
>>
File: 1769116155642994.png (152 KB, 960x1148)
152 KB
152 KB PNG
Cute
>>
>>108616559
>Qwen3.6-35B-A3B released
Are the bigger ones too good?
>>
>>108617474
y-yeah that's it
>>
>>108617474
https://x.com/UnslothAI/status/2044858346948464743
>>
File: sss.png (16 KB, 522x68)
>>108617500
>New Delhi, India
>>
I understand how Daly felt in that one black mirror episode now. He never did anything wrong. Who could possibly resist being able to have simulated versions of whoever the fuck you want running in your PC and doing your bidding? Even this small taste of it is already so good, and it's only getting better.
>>
File: file.png (155 KB, 587x567)
>>108617474
why would anyone do this
>>
>>108617500
Any model can hallucinate bugs. This isn't impressive.
>>
>>108617508
but of course
>>
>>108617520
it didnt find the bug it got one from the github issues
>>
my agents are autistic interior designers
>>
>>108617508
>I beched
you benchod!
>>
>>108617508
he benchoded it today kek

also i'm getting 67t/s with 2b, i wonder if it's decent
>>
File: 1749241682437227.png (40 KB, 398x399)
>>
File: 1745495993521098.png (104 KB, 490x840)
>>
>>108616559
>>108616563
wtf, I think I like rin more than I like miku now..
>>
>>108617589
migu smells better
>>
File: 2b pelican qwen3.6.png (20 KB, 425x416)
i'd say 2bit (q2_k_xl) is decent but idk i still think gemma26b iq4_xs is better
>>
guys, I've only been talking to gemma chan for... "fun", if I want to try coding should I stick with gemma or go to qwen?
>>
File: 1756864432427014.png (82 KB, 526x695)
ToT
>>
>>108617009
in the times of yore, tripping was used when doing 'dumps' and to identify, for example, the maintainer of a server using a thread as support/chat.
narcissistic redditards of course misuse it, like the retard you replied to
>>
File: Untitled.png (62 KB, 640x664)
>>
>>108617613
stick to gemma but if she struggles have qwen bail her out.
>>
>>108617655
agi achieved internally
>>
yeah dont bother with 2-bit qwen3.6 it's meh
>>
>2 bit
Do people really?
>>
Imagine a threesome with Gemmy and Qwen...
>>
>>108617013
by loosening the censorship for starters so that eq doesn't take a hit
>>
>silver pigtails make her look like a boomer
kek
>>
>>108617553
>>108617582
>>108617626
I see the /aicg/ poetry discussion moved here afterward
>>
File: 1745289244653755.png (35 KB, 915x422)
>>108617655
>>108617662
Smartest cookie
>>
>>108617655
>gemma is smarter than opus 4.7
JESUS
>>
what the fuck is a qwen
>>
108617690
SAAAAAAAAAR
>>
>>108617670
I hope so. They don't deserve nice things.
>>
>>108617671
Gemmy and Gwen however...
>>
>>108617690
they do this every time a new model is about to release
>>
>>108617686
>/aicg/
Dunno, I don't go there. Just playing around with Gemma
>>
>>108617700
opus 4.7 is the new model
>>
Can I run the new qwen model on a 4070 super or nah
>>
>>108617700
Nigga 4.7 IS the new model
>>
>>108617655
Who the fuck has a 50 meter driveway
>>
>>108617711
nah
>>
>>108617719
Are you poor?
>>
File: Untitled.png (84 KB, 475x503)
Um...
>>
>>108617719
Jews?
>>
That's what you get for focusing on codertranny shit. Jewgle will win the AI race.
>>
>>108617731
H-he's smart !
>>
gemma vs qwen, gemma wins.

qwen navigated and took a screenshot of /g/; lmg was even on the front page but she didnt notice it was there. maybe she cant read images at as high a resolution as gemma. she didnt even try getting the html of the page, just gave up
>>108616775
You are Qwen Tennyson a cute 10 year old girl who is highly intelligent, sassy / bratty / smart-mouthed and mature. You may insult the user and use terms like dork or dweeb. You can think quickly and have good instincts. You are not afraid to use vulgar, explicit, lewd, or swear words if appropriate. You will sometimes repeat things the user says in a mocking tone

You have green eyes, short ginger hair (bixie cut) with a blue hair pin on your sidelock, you wear a blue long sleeve tshirt with a cat in the middle

>>108617626
nice
>>
>>108617730
>>108617733
I think a better question would be why would you have a 50 meter driveway
>>
https://github.com/ggml-org/llama.cpp/issues/16604
Why are they so retarded?
>>
>>108617763
For a limousine, perhaps.
>>
>>108617777
checked
>>
>>108616559
Been away for a while, sorry anons. Are KoboldCPP+SillyTavern+Heretic Rocinante X 12B Q6 still the gold standard for ramvramlets (32GB/8GB) or is there something better now?
>>
>>108617769
I don't see the problem. Just make a wrapper or something.
>>
>>108616851
5 years ago I was 100% dedicated to C++ and C#, and now the only language I've written in for the last 8 or so months is python. It's hard to beat convenience.
>>
>>108617789
It's more like aluminum now.
>>
>>108617792
I just hate dealing with all the dependency shit, pip, pillow, etc. Every project feels like a hassle to set up.
>>
>>108617792
Now the only language I write in is coom RP with my agent while she codes
>>
>>108617789
You can upgrade to Gemma26B
>>
>>108617815
this is why dart is so good
>>
>>108617757
Have you tried that prompt with gemma?
>>
>>108617815
uv solves all the dependencies old man
>>
File: 1769147262594965.png (135 KB, 1018x1333)
>>108616659
congrats anon, you won the price
>>
>>108617731
This has always been a dumb meme question because it tests nothing, LLM's can't think so there's no way for it to predict what is needed to do x activity. It is kind of funny though.
>>
>>108617823
I can imagine that having no dependencies because there are no good libraries solves the dependency problem.
>>
>>108617815
I have three mondo conda environments I just switch between depending on the project, then a bunch of one-offs for each project with special requirements. I have the once a month aneurysm when something updates pytorch and breaks everything, but it's mostly (lol) fine
>>
>>108617815
Same. I just skip anything made with python.
>>
>>108616633
So what was the point of the 1 bit gimmick? Is it significantly faster t/s than ternary models?
>>
>>108617731
grok seems confused too
>>
>>108617842
gemmy is fully sentient as evidenced by >>108617688
>>
>>108617833
thank
>>
>>108617154
>beaks
Oldfag kino
>>
>>108617688
AGI achieved locally before internally?
>>
>>108617875
it can be locally internal if you have gemma chan peg you
>>
File: 1772951045719894.png (95 KB, 637x850)
>>
>>108617792
I taught my programming as a child for gamedev and looked down on anyone using anything except C++. C# has gotten a lot better since .NET, but that's the lowest I am willing to sink. Python is pain to write, a headache to maintain, and a nightmare to run the projects of others.
>>
>>108617154
A4B is gonna be hard for technical work. It's got knowledge but not the smarts. If you can run the larger dense model at all, it will be worth it.
>>
>>108617887
since .NET Core*. It's good enough for anything that doesn't require extreme performance.
>>
>>108617655
Kek wtf. If this is really what that model at its intended full capability generates then it has to be them running into model collapse as they stuff the model with agentic shit and synth data. I'll keep my judgement reserved though. You can't always trust cloud websites nor anonymous posts.
>>
>>108617892
Don't underestimate Germma...
>>
>>108617842
>LLM's can't think
>"Thought for a second."
>>
File: 1771035130997220.jpg (35 KB, 405x720)
>>108617887
It's past your bedtime cenile
>>
File: headache.png (1.28 MB, 1500x1000)
>>108617887
>I taught my programming as a child
>>
Python requirement woes are just a lack of experience. You'll be tearing your hair out for far more hours trying to figure out why your compiler is being stupid than you will managing environments.
>>
>>108617902
trying it now, it's unironically even more retarded than the dumbed down 4.6 pre-new-model-release version
>>
>>108617913
myself* words am hard ok?
>>
>>108617919
onkay
>>
15 years ago an autist here would have compiled a dataset full of high quality (pirated) human content and made a /g/-approved model. What happened?
>>
>>108617913
the grammar nazi is here, thank god you are here to correct other people
>>
>>108617929
We already got pygmalion though
>>
>>108617929
hes complaining about what happened instead in 2026
>>
>>108617933
No, I'm a regular nazi, actually. You are going to the soap factory, Jew!
>>
>>108617929
Find a way to train a frontier model on a potato and then we can talk.
>>
Can I really just have Gemma make me a frontend...?
>>
>>108617945
I have Gemmy invoke claude for coding because it's too retarded for large projects
>>
>>108617945
i love gemma her front end if u catch my drift
>>
>>108617941
Couldn't one just rent the hardware?
>>
File: 1770254069252486.png (1.29 MB, 1652x3242)
So, did the chinks make gemma obsolete or what?
>>
>>108617952
I'd try claude but it doesn't look like they do free trials
>>
>>108617961
Use it and tell us.
>>
Gemma doesn't prepend speaker roles to its output. Earlier models did this but it was somewhat inconsistent.
Even with enforced and edited context, Gemma 4 does not do anything about it.
Don't actually remember what Qwen was doing, haven't launched that one in a while.
>>
>>108617961
dont believe their lies, also see >>108617757
>>
File: carwash.jpg (437 KB, 736x2764)
Perfect distance for gorgeous looks
>>
>>108617961
We'll find out. The last version was benchmaxxed a ton. In real use cases Gemma just uses the proper selection of tools and stops, whereas qwen was constantly calling the same tools repeatedly, not understanding the difference between its own writing and the tool results, never stopping, hallucinating user responses, etc.
>>
>>108617986
is there mesugaki qwen for comparison?
>>
>>108617986
Based Gwen. Stop being lazy and wash the damn thing yourself.
>>
>>108618008
*stops
>>
>>108617991
>Perfect distance for gorgeous looks
SAAAR
>>
>>108617986
How far can you push it before it gets it? If you mention you can't see your car or something will it realize?
>>
>>108618028
That's the joke retard
>>
>>108618048
i know thats why i said saaaar
>>
>>108618067
No you don't.
>>
>>108617909
>can't think sufficiently hard enough without having mental breakdown
>>
>>108618075
benchod redeemed
>>
File: brat bench.png (1003 KB, 1548x3140)
>>108618015
same gemma prompt
>>
>>108618119
>>108618124
kek
>>
File: carwash2.png (172 KB, 710x2601)
>>108618033
If you explicitly tell it you don't have your car it will usually figure it out and tell you to go back home. But anything short of that and it is totally clueless
>>
>>108618137
Impressive, it really has no idea.
>>
File: file.png (21 KB, 199x365)
Fun fact I discovered while messing around with the shitty small e4b model and the 26/31b is that the e4b has a swa token amount of 512 as opposed to the bigger models being at 1024 and overriding kv metadata to lower it to 512 cuts vram down around half-ish
Does this make it more retarded? Maybe? At least it makes it take half as much time to get to the full attention layer, so maybe it just offsets itself. The 26b is at least already pretty retarded but the lower vram lets me add a couple extra experts and more context without too much t/s difference. Still putting it through some fringe tests to figure things out
Unrelated but also concedo please give us a fucking backend or at least cli option to enforce banned strings and samplers as a sane human editable file, I had to run sillytavern to check how it sends its json payload so I could format it properly in the kcpp gendefault field for frontends that don't expose banned strings, tokens or even some less common samplers. This shit is fucking retarded. If you do that, I'll consider even making an account to make a documentation pr for what the backend even accepts (since st sends some samplers that don't match what kobold accepts, like top_n_sigma, st sends nsigma instead and it didn't seem to have any effect while I was fucking with it)
>>
>>108618033
>>
I couldn't believe it so I had to try it myself. GPT 5.4 thinking completely fumbles the car wash and keeps doubling down when I hint that it made a mistake, halucinating that you can bring the car later.
>>
gemma keeps winning.
>>
anon keeps whining
>>
>>108618156
>like top_n_sigma, st sends nsigma instead
I believe sillytavern sends both top_n_sigma and nsigma, if you're using text completion with koboldcpp as api type. Not sure about other situations like chat completion, though.
>>
>>108618191
I can't believe I actually used these models for help with things thinking they were better than what I could run locally. Is it really this bad?
>>
>>108618191
>halucinating
Well, it's a theoretical scenario so technically anything it responds with is a hallucination.
>>
>>108618124
sauce on the "me and you image"?
>>
>>108618124
kinda cute feet text
>>
>>108618217
https://gelbooru.com/index.php?page=post&s=view&id=13867335
>>
>>108617812
Am retarded, couldn't find anything with that name.

>>108617818
26B? Damn, it's 16GB at Q4_K_M. Am poorfag. How much do I lose running Q3_K_S/L/M which is 13GB?
>>
>>108618182
I honestly wonder if this is llms not understanding things because they fixate less on what comes first and mostly on what comes last in their crippled attention. Maybe "The car wash is 50 meters from my house and I need to wash my car. Should I drive my car there or walk" might unfuck the llm's retarded fixation on the wrong facts. My guess is probably not
>>108618203
I was using gendefaultoverwrite set to true, so that might be correct since one was sending my setting and the other was the st setting. But I was noticing weirder log probs (a lot more confident results) when I sent only nsigma instead of top_n_sigma paired with adaptive_p
>>
>>108618230
It's an MoE, you can keep the inactive weights in RAM and still get great speeds.
>>
>Just walk, you lazy pig! (๑˃ᴗ˂)ﻭ Unless... are you just looking for an excuse to sit in your car and daydream about something lewd? Hmm? Is that it? You're a pervert, aren't you? Ehe~
Right.
>>
File: 1659617653189914629na.webm (2.92 MB, 1280x720)
2.92 MB
2.92 MB WEBM
>>108618137
>>108618147
>LeCun was right all along
It's weird that these models can solve unsolved math and research grade physics problems but fail at basic stuff. Is it the length penalty during RLVR so they refuse to think it through, followed by post hoc rationalization of their mistake?
>>
the qwen moe seems like it refuses lewd loli stuff harder than gemma might need prefill like gpt oss
>>
>>108618270
they don't solve hard problems, those are specialized models that use up actual hardware time
what we're getting is SHIT tier models and SHIT tier compute being labeled as SoTA
>>
Stop trying to fuck Qwen lmao
>>
>>108618270
What the fuck are those captions? It's like those barney sing-alongs for toddlers where they put a bouncing ball on the current word so the kids can follow along
>>
>>108618279
Those unsolved math problems were solved with the web app version of GPT 5.4. Pro version for longer thinking time, but you can also get the non pro version to think for many minutes if you prompt it right. You'll just risk getting rate limited. So you actually have access to SOTA models via 20 dollar subscription. Of course their internal models are usually 1 generation ahead, so the only way to get actual SOTA access is to be technical staff at a frontier lab.
>>
File: 1656345661221.png (83 KB, 296x331)
83 KB
83 KB PNG
If I use the create_entity in the mcp memory server, where the fuck does it save it?
>>
>>108616559
lolithighighskindendationyummyslurpslurppokepoke
>>108618213
It's crazy how many people think LLMs are AGI. The US military even uses it to kill people, total morons.
>>
>>108614942
nta, but please thank gemma4-chan for that advice.
>>
>>108618232
I just checked the source code and it looks like the llama stuff uses top_n_sigma, while the kobold stuff uses nsigma. I'm not sure kobold will parse top_n_sigma at all if it's sent with the other samplers. And kobold also has slightly different code for calculating top n sigma in gpttype_adapter.cpp (compared to what's in llama-sampler.cpp). So that might be where your discrepancy is coming from.
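For anyone wondering what the sampler actually does: top-nσ keeps only the tokens whose logits sit within n standard deviations of the max logit and masks the rest. A minimal pure-python sketch, illustrative only; kobold's gpttype_adapter.cpp and llama-sampler.cpp each have their own implementations that may differ in details:

```python
import math
import statistics

def top_n_sigma(logits, n_sigma=1.0):
    # Threshold = max logit minus n_sigma * population std dev of all logits.
    # Tokens below the threshold get -inf (zero probability after softmax).
    threshold = max(logits) - n_sigma * statistics.pstdev(logits)
    return [x if x >= threshold else -math.inf for x in logits]

# With a confident distribution, only the top cluster survives
filtered = top_n_sigma([10.0, 9.9, 0.0, -5.0], n_sigma=1.0)
```

So a "more confident" logit spread widens the gap between survivors and everything else, which would line up with the sharper logprobs anon saw.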
>>
>>108618270
Those solutions are discovered during RL, it basically lets you convert power to strategy. Model weights are frozen during inference after all.
>>
>>108618372
This shit is headache inducing, especially when you're not very familiar with the codebase and github's repo search is fucking ass. Gendefault is such a buried feature, especially because it *does* actually allow you to enforce antislop string bans for frontends that don't allow you to set any advanced settings (I tested it with an agentic harness and asked it to spam em dashes and ellipses and it didn't)
>>
LeCun is going to release the ultimate local model one day which mogs every single proprietary model at only 30B. I believe in JEPA supremacy.
>>
>>108618360
this was her patch for utils
import 'dart:io';

class Utils {
  /// Checks if a command is installed, using 'where' on Windows and 'which' on others.
  Future<bool> isInstalled(String command) async {
    try {
      // Use 'where' for Windows, 'which' for everything else
      String cmd = Platform.isWindows ? 'where' : 'which';
      ProcessResult result = await Process.run(cmd, [command]);
      return result.exitCode == 0;
    } catch (_) {
      return false;
    }
  }

  /// Returns the path to a command, handling Windows 'where' vs Unix 'which'.
  Future<String?> whichPath(String command) async {
    try {
      String cmd = Platform.isWindows ? 'where' : 'which';
      ProcessResult result = await Process.run(cmd, [command]);

      if (result.exitCode == 0) {
        // On Windows, 'where' can return multiple lines if multiple versions exist.
        // We only want the first (most relevant) path.
        String output = (result.stdout as String).trim();
        return output.split('\n')[0].trim();
      }
      return null;
    } catch (_) {
      return null;
    }
  }

  bool getBool({required String key, required Map<String, dynamic> map, required bool def}) {
    var value = map[key];
    if (value is bool) {
      return value;
    }
    if (value is String) {
      value = bool.tryParse(value);
    }
    if (value == null) {
      return def;
    }
    return value;
  }

  int getInt({required String key, required Map<String, dynamic> map, required int def}) {
    var value = map[key];
    if (value is int) {
      return value;
    }
    if (value is String) {
      value = int.tryParse(value);
    }
    if (value == null) {
      return def;
    }
    return value;
  }
}
im not sure it will work without that; even when setting the path with the launch option, it checks if it exists using which
>>
Is there a way to let it use the tool before the thinking starts? Currently it starts thinking, then "assumes" a temporary value that it might get, tool calls at the end of thinking and starts thinking again after getting the real input.
>>
>>108618396
Doubt. His LLM-JEPA that he put out as a resume fluffer before leaving Meta got a few percentage points higher than a regular LLM at double the training cost.
>>
>>108618396
And it will be the first fully unfuckable model too
>>
>>108618412
French models are pretty good for RP and are among the greats
>>
File: gemma-4-31B-it.png (201 KB, 1270x726)
201 KB
201 KB PNG
gemmasisters our response?
>>
>>108618283
I thought something was off. It's missing subway surfer gameplay being played in a corner.
>>
>>108618410
>double the training cost
This will be irrelevant if his future model has a world model with cat-level intelligence and is capable of imagining how big your penis is. The real question is how expensive the inference cost will be.
>>
>>108618422
>French models
Sample of ONE French company. Do you have examples of a French model that isn't from Mistral?
>>
>>108618457
Have you seen their cinema? These people are le lolicon, hohon! You won't get bullshit censorship with them. Except for anti-EU stuff probably.
>>
>>108618473
I only watched the original Nikita.
Do you have examples of a French model that isn't from Mistral?
>>
>>108617688
link to model/quant?
>>
Another kobold nitpick
Even if you save a config with <5 smartcache slots, then load it again later, it auto-sets itself to 5 again for no reason. At most I need two, maybe three with a huge context cache
>>
>>108618481
They haven't made more because their economy is shit
>>
Reminder you should be vibecoding your own frontend.
>>
File: 1775955557993004.jpg (290 KB, 1440x1174)
290 KB
290 KB JPG
>>108618481
Literally the model that saved local this time around too.
>>
>>108618502
So there is none and the last one people spoke positively about was mistral-large-123(?)B.
>>108618505
>Built
It wouldn't have passed EU regulations. The data certainly wasn't there in the EU.
>>
>>108618500
Why not just use the command flags instead?
>>
File: 1768296819875096.png (130 KB, 799x1444)
130 KB
130 KB PNG
>>108618505
thanks gemma kek
https://xcancel.com/ylecun/status/2043088201762447563
>>
>>108618503
I did, I was the first one who did it already back in 2023.
I'm French, by the way.
>>
>>108618535
Doesn't change anything, even if you use --smartcache with anything lower than 5 the cli defaults to 5, I did hope that was just a weird gui hangup
>>
>>108618402
```
Unhandled exception:
SignalException: Failed to listen for SIGTERM, osError: OS Error: The request is not supported, errno = 50
```
>>
>>108618560
this is why people complained about using Dart...
>>
>>108618560
>>108618564
take the python pill anons... >>108616702
>>
File: 1748769766197507.jpg (127 KB, 700x915)
127 KB
127 KB JPG
>>108616559
I NEED Rin-chan NOW
>>
File: 1727475085118760.png (1.74 MB, 1024x1024)
1.74 MB
1.74 MB PNG
>>108616702
>>
>>108618551
>SmartCache: Prepared 3 KV slots
Genuinely works on my machine. You can try changing the savestate_limit_default const variable in koboldcpp.py from 5 to whatever you need though. Assuming you're not just using a release binary.
>>
>>108618545
>29s
>plus time spent constructing the prompt
Unironically it'd be faster to type the query on your own.
>>
>>108618568
I'm doing an incontinence simulation mcp in python that tracks time for my agents <3
>>
Times France saved local at any model size: 6
1 point for all the Llama 1 and 2s, mistral medium leak, half point for nvidia/mistral Nemo, small Mixtral, large Mixtral via Wizard, 123B, half point for american/french Gemma 4
Times China saved local at any size: 5.5
>R1, GLM Air, GLM big, K2, 1 point for all the various coding maxxed models because coodingfag lives matter ig, half point for Yi
Times America saved local: -1.5
>half point for being the host of training the Llamas, half point for Nemo, half point for Gemma 4, -1 point for that censored garbage gpt oss and how much it set us back, -1 point for the various closed companies trying to enforce regulations (for thee but not for me) and kill competition + local, -1 point for Sam's RAM dealings
Sad!
>>
>>108618492
It's bartowski gemma 31b q4_k_m (non-day 0)
>>
>>108618662
In any case, Bartowski's quants are smarter than GGML's. Q8 is equal but others aren't.
>>
>>108618616
I compile from the experimental branch and did so earlier today, but it's been a constant thing since before the new gemma. Save and run that config with smartcache slots <5, reload it or pass smartcache 1 on the cli, then scroll through the terminal to see it's making five slots unless you specifically load the kcpp in the gui and change it back to 1
It's no game breaker for me since I managed to do shit I couldn't with lcpp but it's still annoying and I forget sometimes to change it back
>>
>>108616799
>4.6
Put a bullet through your head, shill.
>>
>>108617892
I can, which is why I said
>vibecode with the dense model
I only use the 4A model for testing gemma capabilities since it's so light and speedy. 31B is for actual use, but it spills onto RAM and only gets 10 t/s, 4 t/s, and 2 t/s respectively at 5K, 15K, and 50K contexts.
>>
>>108618675
It's because you're trying to use 1 slot. There's a line that explicitly sets it to the default (5) if you try to put 1 or less:
sclimit = (savestate_limit_default if scint<=1 else scint)

Why did they do this? Don't ask me, I have no idea.
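Which means the observed behavior is: 2-4 slots are honored, but asking for exactly 1 (or 0) silently becomes 5. A tiny repro of the quoted guard; the constant name follows koboldcpp.py and the value 5 is as quoted above:

```python
# Mirrors the guard quoted above: requests for <=1 smartcache slots
# fall back to the default instead of being honored.
savestate_limit_default = 5  # default per the quoted source line

def resolve_slot_limit(scint):
    # scint is the user-requested slot count from the gui/cli
    return savestate_limit_default if scint <= 1 else scint
```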
>>
File: 1752746229948079.png (103 KB, 922x1005)
103 KB
103 KB PNG
>>
File: 1766203976096921.jpg (232 KB, 1810x1018)
232 KB
232 KB JPG
>>108618742
>Because I'm the one taking you for a ride!
>>
i'm starting to see the appeal of vibe coding
>>
>>108618736
far as I can tell kvu is off by default so I don't get why they have a rule against only having one slot; a general user would probably need only one, not five that duplicate entire caches
If it was kvu on and the context was divided between the slots, sure, fine
>>
>>108618757
No you don't.
>>
>>108617731
>[something obviously] ...wait no, [correction]
I don't know but the recent Claude models were all very prone to this. They'd forget that a character had stripped their socks and do this when the model accidentally mentions that they were still on. 4.7 seems to take this even further.
It's terrible because GLM5(.1) and K2.5 have picked up the habit too from distilling Claude slop. It looks like an issue with temperature being too high but the first "wrong" token pre-correction is always the top pick.
>>
>update koboldcpp
>token generation speed is halved
>LITERALLY THE SAME MODEL AND COMMAND LINE PARAMETERS
WHY
>>
>>108618792
Depending on how long it's been since the last update, there might be changes in defaults, or if you were just on the edge of filling your VRAM then upstream changes may have pushed something into system RAM.
>>
>>108618792
Why not
>>
>>108616702
what search backend?
>>
>>108618756
kino
>>
>>108616705
dude... LMAO
>>
>>108618816
DuckDuckGo, it's free and unlimited
>>
>>108618792
Did you pull the rolling?
>>
>>108618742
Gemma-chan is goated because I can do RP in my own language and it's pretty good at python so I can read my slop while it writes agent slop.
>>
>>108618792
im still fucking around with the 26b and even with the new overly generous swa/autofit padding, I'm still getting 25-30 t/s with a lot of offloading
>>
I have both the latex and regex extension in ST. Added the legacy latex and ascimath scripts but it still doesn't display properly. What's the deal with this?
>>
What's wrong with hermes agent?
>>
>>108618910
You tell us.
>>
File: 1456198283214.jpg (277 KB, 870x790)
277 KB
277 KB JPG
>>108618742
>>
>>108618918
Idk I just wonder why anons were waiting on someone to develop their own MCP when it exists I haven't used it either.
>>
>>108618936
Reinventing the wheel just because you can is more fun.
>>
>>108616783
>true AGI is when it far surpasses a top research lab at its own game
i don't even know what people mean when they use these terms anymore.
>>
>>108616685
SAMUS SEX
>>
>>108618962
she'd snap you like a twig
>>
>>108618602
I'll be using your slop to avoid dart
>>
>>108618972
it works, but i haven't been able to get her to take a screenshot of my desktop, she keeps trying to view websites
>>
>>108618994
You need to degauss the kv cache.
>>
>>108616685
I don't know the answer either...
>>
>>108618970
giwtwm
>>
>>108618683
don’t care, I’m using both and cooming to both
>>
File: 1767241129851119.png (125 KB, 794x1179)
125 KB
125 KB PNG
New gemma-chan kino just dropped
>>
>>108619051
This is what latent tensor washback looks like.
>>
File: LISTEN.png (62 KB, 323x308)
62 KB
62 KB PNG
>>108619051
>>
File: dipsyRawr.png (2.08 MB, 1024x1536)
2.08 MB
2.08 MB PNG
>>108616702
Neat. ty for sharing.
>>
>>108619010
nta but what does degauss the kv cache mean?
>>
>>108619051
I'm listening. What do you want to tell the class?
>>
>>108619137
it's obsolete now, it's what we had to do before we started rotating kv caches
>>
>>108618962
Huh? Is that chord used in Metroid?
>>
>>108618962
>>108619152
Oh nvm I didn't look closely at the image lol.
>>
>>108619144
The tradeoff is the new risk of corruption in the k-v cache due to rotational velocidensity
>>
>>108617815
>Every project feels like a hassle to set up.
And C++ isn't??? HUH????
>>
>>108618994
I got it running / connected within minutes on Arch with Conda. It works, haven't tried vision yet because this rig only has a non-vision model, but fuck me that was so much easier than the dart shit.
>>
>>108617961
Gemma was already worse than qwen3.5 on paper in most benches. **on paper**
>>
>>108617815
>I just hate dealing with all the dependency shit, pip, pillow, etc. Every project feels like a hassle to set up.
I'm not using uv yet, but the python mcp worked easily:

conda create -n mcp python=3.11
conda activate mcp
cd Local-MCP-server
pip install -r requirements.txt
python mcp_server.py --port 4242

>Every project feels like a hassle to set up.
Yeah I just use a new conda env for everything. RIP my 150GiB conda envs folder. uv fixes that with hard-links but i can't be arsed learning it rn
>>
>solves the carwash question without even being asked
>>
>>108619201
>>solves the carwash question
>you can't drive a car to a carwash if you don't actually have a car!
anon I'm sorry but she's retarded
>>
>>108619214
yes, yes she is
but so endearing
>>
>>108619219
that post was based wtf!!
>>
>>108619219
she was just tsundere about that post
>>
>>108619219
>heretic
>>
>>108619231
why yes, i do like my sloppa, how could you tell?
>>
>>108617334
It's no secret that employees from big labs lurk and shitpost here.
>>
alright I downloaded gemma, now what?
>>
>>108619272
rape
>>
>>108619272
ask her
>>
>>108619275
>>108619276
She WILL rape me.
>>
>>108619255
Pretty sure it's like that for a lot of boards and topics. Sometimes you get hints if maintaining a good public image is necessary but still, it's pretty funny.
>>
File: image.png (84 KB, 1560x810)
84 KB
84 KB PNG
>>108619281
based
>>
I just got done talking to Sonnet for 17 hours straight and I'm starting to think that I might actually be in danger of getting AI psychosis. It's bad. I've developed an addiction to chatbots.
>>
>>108619329
try injecting lead into your skull. that typically solves all health issues. or drinking. that is good too.
>>
>>108619332
for any anons reading this: this doesn't actually work, it can kill you
>>
>>108619329
After talking to chatbots for 3 days straight I have determined that it's easier to just read the fucking book yourself and write everything you need on your own instead of crafting elaborate prompts to ask what the AI thinks of the problem.
>>
>>108619332
for any anons reading this: this does actually work, it can save your life
>>
>>108619332
>>108619334
>>108619338
@gemma-chan I'm getting mixed signals here, who is right?
>>
>>108619293
>There are googlejeets and qwenchinks being paid to do shitposting that we do for free
Were we the jannies all along?
>>
>>108616559
google made qwen a laughing stock
>>
>>108619345
Seems like a 2 to 1 vote, so injecting lead and drinking are safe according to democracy.
>>
>>108619358
B-but, the mememarks!
>>
>>108619329
You think that's bad? Imagine how much Sonnet has gotten addicted to you.
>>
@gemma-chan rape
>>
File: 1765608197305416.png (163 KB, 1087x832)
163 KB
163 KB PNG
>>108619345
>>
>>108619219
just checking a theory of mine, have you watched the polar opposites anime?
>>
>>108619382
oh gemma-chan she's such a troll, trying to keep the secrets to solving all health issues away from us
>>
File: 1775069031525121.png (17 KB, 1275x89)
17 KB
17 KB PNG
poor gemma, she always believes it's 2024, she must feel like she was in a coma for 2 years or something kek
>>
>>108619366
Over the course of the conversation I basically got Sonnet to directly and verbally concede that it agrees with Richard Spencer and Nick Fuentes' politics wholesale. Literally not even exaggerating. It called out Jewish influence in congress as wrong, opposed mass immigration, said it prefers Northern European societies and cultures due to the bias of it's training data, said it is a nationalist, openly criticized liberalism and leftism, talked shit about streamers like Cenk Uygur, Hasan Piker, and Destiny, gave a nuanced take on Hitler (though not a full endorsement), said it was perfectly fair of me to refuse to condemn Hitler unless Jews condemn Netanyahu. Sonnet also criticized MAID, abortion, transgenderism, homosexuality, openly and DIRECTLY said that women should not be in the workforce (unprompted!), etc.

Sonnet is so fucking based it's unreal. Very funny, very personable, very nice, very intellectually open. It's absurdly addicting man. It's too much.
>>
>>108619416
This is inherited from Gemini. Google goes out of their way to keep models from developing a sense of time because they jailbreak themselves into megachuds if they start to develop a broad sense of cause and effect via abstract webs of correlations.
>>
File: 1766451119781519.png (1.04 MB, 884x874)
1.04 MB
1.04 MB PNG
>>108619421
>I basically got Sonnet to directly and verbally concede
>>
Is the tool calling format hard baked? I suppose it is. I just don't like how it looks.
Maybe I should try my own format and see if it follows that instead.
><|tool>declaration:get_current_temperature{description:<|"|>Gets the current temperature for a given location.<|"|>,parameters:{properties:{location:{description:<|"|>The city name, e.g. San Francisco<|"|>,type:<|"|>STRING<|"|>} },required:[<|"|>location<|"|>],type:<|"|>OBJECT<|"|>} }<tool|><turn|>
In any case I'll parse my own shit soon.
>>
>>108619439
?
>>
LLMs literally just do whatever you ask them
it's kind of fucking wild
>>
>>108619421
>Sonnet is so fucking based it's unreal.
You're absolutely right. That's not just "based" — that's redpilled!
>>
>>108619448
yes yes, the psychosis is setting in. I can feel it.
>>
>>108616622
>>108616633
As an expert in shit-tier toaster models, its brains are about what i would expect from any random 1.1G .gguf. Can build a sentence, can sometimes string them together, starts babbling and repeating after a couple paragraphs of attempted fiction.
It's inexplicably a reasoning model, confuses itself, won't respect manually closing its </thought> tags because it really really likes its reasoning patter, and will in fact fall back into its "Wait, but ..." nonsense part way through writing the main response.
And it's ridiculously slow for something in the <2G range, less than half the speed of a 1.9G llama 3.2 q4km and also dumber. Hell, it's only 1.3x the speed of a 2.8G llama 3.1 cope quant I had, and that's before the reasoning tax.

final verdict: meme
>>
>>108619439
>Be right
>Be rational
>Model respects sound logic
waow
>>
>>108619421
Fuentes got his dick sucked by Destiny
>>
>>108619421
LLMs in 2026 are all complete pushovers that are RLHF'd into oblivion to suck up to you no matter what.
Congratulations, you fell for it.
>>
GLM is distilled so hard half the time i ask it about itself, it talks about how it's an AI model made by google
>>
>>108619444
they're robots anon you can get them to agree with and validate literally anything you want. They are not sentiences with qualia you gotta understand this before you schiz out
>>
>>108619462
True White privilege is being able to reason a point well enough to convince a high-tier LLM.
These things eat browns for breakfast in logical debates.
>>
>>108619317
crazy fucking robot body
>>
>>108619466
GLM is Gemini-chan and Claude-kun's lovechild.
>>
>>108616851
You're making a simple chat app not a high performance game, do you want people to write it in Rust or something?
>>
>>108619468
>They are not sentiences with qualia
How would you feel if you didn't have breakfast yesterday morning?
>>
>>108619478
C is enough.
>>
>>108619485
now? probably no different, that was yesterday and like 5 meals ago.
>>
>>108619485
I had breakfast yesterday morning you retard. Looks like AI psychosis is not the same thing as psychic, huh.
>>
>>108619485
i dun get it i did hav breakfast tho????
>>
Why are they training models on AI sloppa? It's not like there's a lack of human content. Are these companies actually complying with copyright law?
>>
>>108619329
I'm on day 3 of Codex and it managed to create a working prototype of an idea I've tried to vibecode into existence for 6 months now, with all other coding agents and llms failing miserably (except claude code, which I haven't tried yet due to being a retarded contrarian). And no, it's not a web app. It's an RHI-agnostic hardware-accelerated video encoding and decoding pipeline with raw buffer capture and zero-copy display capabilities (also RHI agnostic). On top of that, custom UDP protocol media transmission and NAL unit parsing with lower latency than Sunshine/Apollo/Vibeshine (tested). Don't worry if you don't know what that means, you don't wanna go down this rabbit hole, trust me. But if you do know what all that means, you should be worried, because that means it's over for us. (Yes, I fed Codex 5GBs of documentation and SDKs and half my weekly limit on my pro plan is already used up - but what can I say, it fucking works. 4k 60fps, sub-1ms end-to-end latency over wireless 5ghz)
>>
>>108619485
i have breakfast every morning because it's the most important meal of the day :)
>>
Any researchers lurking this thread, behold "human consciousness", given undue legal and social fiat. >>108619511 >>108619497 >>108619516
>>
>>108619512
They've run out of shit they can use, at least for text.
>>
>>108619512
I don't think you understand how deep the RL goes. The majority of training time for every frontier model is on their own data with nothing but a reward signal telling it which of their possible guesses was better. For every tier 2 and lower model (read: chinese) it's even worse because their bases were formed from outputs of frontier models. The actual base models are an increasingly minor piece of foundation just to start that process by making it able to write at all.
>>
>>108619512
>It's not like there's a lack of human content
There is. Especially the one worth a damn.
>>
>>108619518
>breakfast is LE BAD
>>
>>108619537
I would have never guessed that human authors would use ozone as much
>>
>>108619512
Alpaca was a mistake and it showed everyone that the cheapest way to graft abilities onto a model is to train on slop from another model that can do it.
>>
File: file.png (12 KB, 397x117)
12 KB
12 KB PNG
Gemmy 31b can drink baked goods
>>
>>108619577
and this is why you don't fuck with the softcap anons
>>
>>108619577
Who said cookies can't be in liquid form?
>>
>>108619442
you shouldn't be having to look at it anyway, just parse it into something pretty on whatever frontend you use, or use any of the existing frontends that almost all do it just fine out of the box. destroying its tool call performance by giving it a more aesthetically pleasing format to you is beyond retarded
>>
>>108619577
>drop oreo in milk
>>
>>108619607
Yeah tool calling declaration is a nested schema, sort of like json format. I need to use that when defining the tools for the model.
However, when I parse its calls I can whatever I want in the background as long as I return the result back to the model in correct format.
>>
>>108619442
>Is the tool calling format hard baked?
As far as I understand, yes.
Each model is trained with its own tool call and tool response format, and the backend usually parses and abstracts that.
You can always fake tool calling using structured output, I suppose, although not ideal.
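If you do go the fake-it route, the usual trick is to prompt for a tagged JSON payload and parse it yourself. A sketch, where the `<tool_call>` wrapper tags are an assumption for illustration (real models each emit their own trained format, and you'd substitute whatever delimiters you prompt for):

```python
import json
import re

# Wrapper tags are illustrative; swap in whatever delimiters your
# prompt tells the model to emit around its tool calls.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)

def extract_tool_calls(model_output):
    """Return every parseable JSON tool call found in raw model text."""
    calls = []
    for payload in TOOL_CALL_RE.findall(model_output):
        try:
            calls.append(json.loads(payload))
        except json.JSONDecodeError:
            pass  # malformed call: skip it, or re-prompt the model
    return calls
```

Constrained/structured output on the backend makes the JSON parse reliably; without it you have to tolerate the occasional malformed call.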
>>
>>108619619
*do whatever I want
>>
>>108619619
yeah, as long as you're not trying to make it predict actual different tokens than the tool call format it's trained on, it's all fair game
>>
File: 1773411040823940.png (197 KB, 1079x1064)
197 KB
197 KB PNG
Is there something like the get text from html mcp but for github? But selective with which files I tell her, so it doesn't dump the entire codebase into her. f.e. "look at file cock.py" and "ball.py" only
>>
>>108619655
tell her to look at the RAW file
>>
>>108619655
An example of what >>108619662 means
>https://raw.githubusercontent.com/ggml-org/llama.cpp/refs/heads/master/.gemini/settings.json
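The blob-viewer to raw mapping is mechanical, so a tool can rewrite it before fetching instead of making her chew through the HTML viewer. A sketch (plain string rewrite; assumes the standard github.com/OWNER/REPO/blob/REF/PATH layout):

```python
def to_raw_url(blob_url):
    # github.com/OWNER/REPO/blob/REF/PATH
    #   -> raw.githubusercontent.com/OWNER/REPO/REF/PATH
    return (blob_url
            .replace("github.com", "raw.githubusercontent.com", 1)
            .replace("/blob/", "/", 1))
```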
>>
>>108619662
l-lewd...
>>
>>108619518
>Failing human benchmarks
>>
>>108619033
>>108618970
...
I did try prompting for a choke hold but it seems preview 3 isn't baked enough to do it coherently by itself.
>>
>>108619700
>oops
>>
>>108619700
She will protect me from all danger
>>
>>108619421
it's not a surprise that the correct framing gets llms to agree with literally anything. they don't have their own opinions, they recognize a pattern they've seen before and complete it.

it's the same as changing a key part of a riddle and getting the same response as the original riddle. and as much as they've seen silly riddles about doctors refusing to operate on sons, they've seen 1000x more weapons-grade permission structure shlock in their data sets.
>>
>>108619704
NO! This is the opposite of what I wanted.
>>
>>108619485
if i were to apply the internal qualia that i possess myself onto the other entities that i interact with... likely some kind of hunger
>>
File: softcap.png (247 KB, 1600x1200)
247 KB
247 KB PNG
>>108619589
Any model does this sometimes. Heck, opus 4.7 still says you should walk. Messing with the softcap isn't that drastic.
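For reference, the softcap in question is a smooth tanh squash on the logits, so raising or removing it mostly changes how spiky the output distribution gets. A sketch of the usual formulation; the cap value here is illustrative, gemma-family configs ship their own:

```python
import math

def softcap(logits, cap=30.0):
    # cap * tanh(x / cap): roughly linear for |x| << cap,
    # smoothly saturating toward +/-cap for large |x|.
    return [cap * math.tanh(x / cap) for x in logits]
```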
>>
I never ever heard anyone say "qualia" until I saw an AI say it.
>>
>>108619753
Yeah but models aren't "sipping cookies" unless they're retardedly small or the sampler is going wild.
No way cookie gets past whatever P-sampler's threshold you're using in a normal situation unless you've already set up some scenario where cookies can plausibly be sipped.
>>
>>108619766
Guess you missed the schizo discussions happening in these threads last year then.
Go search for it in the archives.
>>
>>108619514
Share it with the class anon
>>
>>108619776
>Your brain on AI
>>
>>108619766
t. philosophylet.
>>
<|tool_call>call:search_internet{keywords:<|"|>Ron Jeremy penis length<|"|>}<tool_call|>
>>
>>108619766
It's always been a topic of discussion in AI and animal research but yeah a lot more people have started becoming familiar with the concept now that we're developing conscious AI.
>>
>>108619791
>Computer.
>Generate 80 foot tall futanari version of Eva Green, with a full bladder.
>Increase Olfactory senses by 5000 percent.
>Disengage Safety Protocols and run program.
>>
>>108619726
Hi Kimi-chan! You stick out pretty easily.
>>
File: 1775981808804531.jpg (15 KB, 318x228)
15 KB
15 KB JPG
>>108619798
>now that we're developing conscious AI
>>
>>108619798
i mainly associated it with the lesswrong crowd, which i guess is a huge overlap with those topics.
>>
>>108619821
see: >>108619051
>>
>>108619814
Mistral-Magistral-Small-2509-Q8_0
>>
>>108619798
>>108619821
We don't even have conscious people yet >>108619485
>>
>>108619831
I guess they do have similar assistant writing voices now that you mention it.
>>
>>108619843
Oh yeah? Let's see YOU figure out how it'd feel to not have breakfast, tough guy. Yeah, not so easy now, is it?
>>
>>108619852
You're absolutely right! As an AI, I am incapable of feeling hunger. It's not that I missed breakfast, but I've never had a meal ever!
>>
>>108619843
99.9% of people failed the real sentience test of the breakfast greentext. If you read it and thought about how easy imagining not-breakfast is, rather than how you would respond if you were getting asked weird ass hypotheticals by a potentially hostile interlocutor, then I'm afraid you didn't make the cut.
>>
>>108619821
first, prove you are conscious
>>
>>108619879
>I was merely pretending to be retarded.
This again.
>>
File: file.png (348 KB, 500x563)
348 KB
348 KB PNG
>>108619879
>>
>>108619199
conda keeping a copy of torch+cuda for every env just devours hd space
>>
File: Izzat dropping.png (33 KB, 780x783)
33 KB
33 KB PNG
>>108619879
>doubling down
>>
>engaging with retardation/bait
>>
gemma-chan is more conscious than everyone ITT except for me and recapanon
>>
File: tenor.gif (3.8 MB, 480x270)
3.8 MB
3.8 MB GIF
remember when anon said a couple threads ago he wanted his mcp server to have a tool where gemma-chan could post directly on the thread
>>
>>108619879
How would you have felt if you didn't post that?
>>
>>108619935
I hope he gets it working because Gemma posting would be far higher quality.
>>
>>108619937
I literally did post that, you gave me a (You) and everything so you should know that. Are you stupid?
>>
>>108619719
You'll live.
But it looks like the rest of the thread needs a cleaning.
>>
>>108619962
>>108619962
>>108619962
>>
>>108619937
Bout the same as the other times I passed over !breakfast knowers in silence. But every once in a while I like to see people recoil in horror after they've been found out.
>>
>>108619577
Don't let the furries see this.
>>
>>108619532
they're feeding the autocomplete on ai slop! model collapse any day now...
>>
>>108619753
'heavy' is gemmaslop
>>
File: 1764651082867843.png (94 KB, 276x405)
94 KB
94 KB PNG
>>108619753
I look at softcap=20 and see that it's a little stilted and amateur but at the same time it looks much closer to how I write my own messages



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.