[a / b / c / d / e / f / g / gif / h / hr / k / m / o / p / s / t / u / v / vg / vm / vmg / vr / vrpg / vst / w / wg] [i / ic] [r9k / s4s / vip] [cm / hm / lgbt / y] [3 / aco / adv / an / bant / biz / cgl / ck / co / diy / fa / fit / gd / hc / his / int / jp / lit / mlp / mu / n / news / out / po / pol / pw / qst / sci / soc / sp / tg / toy / trv / tv / vp / vt / wsg / wsr / x / xs] [Settings] [Search] [Mobile] [Home]
Board
Settings Mobile Home
/g/ - Technology

Name
Options
Comment
Verification
4chan Pass users can bypass this verification. [Learn More] [Login]
File
  • Please read the Rules and FAQ before posting.
  • You may highlight syntax and preserve whitespace by using [code] tags.

08/21/20New boards added: /vrpg/, /vmg/, /vst/ and /vm/
05/04/17New trial board added: /bant/ - International/Random
10/04/16New board for 4chan Pass users: /vip/ - Very Important Posts
[Hide] [Show All]


Janitor acceptance emails will be sent out over the coming weeks. Make sure to check your spam folder!


[Advertise on 4chan]


File: 4gbo63.jpg (309 KB, 1280x1280)
309 KB JPG
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>109175389 & >>109170290

►News
>(07/01) Nemotron-Labs-TwoTower released: https://hf.co/nvidia/Nemotron-Labs-TwoTower-30B-A3B-Base-BF16
>(06/29) DeepSeek V4 support merged: https://github.com/ggml-org/llama.cpp/pull/24162
>(06/28) DFlash support merged: https://github.com/ggml-org/llama.cpp/pull/22105
>(06/27) DeepSeek releases DeepSpec and DSpark models: https://hf.co/deepseek-ai/DeepSeek-V4-Pro-DSpark
>(06/25) LFM2.5-230M released: https://liquid.ai/blog/lfm2-5-230m

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://swe-rebench.com
Agentic Coding: https://deepswe.datacurve.ai
Context Length: https://github.com/RecapAnon/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
►Recent Highlights from the Previous Thread: >>109175389

--Paper: Nemotron-Labs-TwoTower: Diffusion Language Modeling with Pretrained Autoregressive Context:
>109179589 >109179615 >109179815
--Reasoning's effect on parametric knowledge recall and benchmark brittleness:
>109175457 >109175520 >109175597
--Security risks of AI-generated code and potential for AI consulting:
>109176557 >109176905 >109177117 >109177233 >109177293 >109177461 >109177493 >109177546 >109177501 >109177578 >109177642 >109177346 >109177437 >109177537 >109177151 >109178467 >109177549 >109177576 >109177632
--Hardware constraints on model size and gpt-oss-120b agent performance reports:
>109175572 >109175580 >109175592 >109175660 >109175841 >109177217
--Comparing and debating the best LLM frontends for various usecases:
>109175405 >109175887 >109175574 >109175627 >109175687 >109175669 >109176564 >109175696 >109175971 >109176194
--Anthropic using steganographic markers to fingerprint users in China:
>109175927 >109176055 >109176166 >109176321
--Claude Fable 5's restrictive safety classifiers and high refusal rates:
>109177321 >109177369 >109177373 >109177386 >109177863
--Skepticism regarding Bolt GPUs with expandable VRAM for LLM inference:
>109179061 >109179080 >109179082
--Feasibility of creating a local tool to detect AI content:
>109175693 >109175700 >109175762 >109175772 >109175945
--Discouraging CPU-only inference via RAM expansion on old Xeon hardware:
>109178663 >109178693 >109178705 >109179062
--Clement Delangue advocates for open-source safety via FLARE coalition:
>109179474 >109179619
--Logs:
>109178277 >109178439 >109178540 >109178467 >109178507 >109178692 >109180660 >109180869
--Miku, Teto (free space):
>109175423 >109175514 >109175642 >109175971 >109176321 >109177293 >109177405 >109178867 >109178898 >109178953 >109179154 >109179172 >109180116 >109180763

►Recent Highlight Posts from the Previous Thread: >>109175393

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
>No Kimi-chan being best LLM section
If Miku gets a free space Kimi deserves one too.
>>
File: 1780533580846321.jpg (598 KB, 852x1028)
598 KB JPG
keep your GPUs cool
>>
File: 1770957923133499.jpg (84 KB, 913x1024)
84 KB JPG
ok so i'm still throwing models on my full-stack php/mariadb/nginx/javascript benchmark so i can find the meta model for my strix halo, bandwidth-bound 128gb ddr5.

qwen3.6-35b-a3b being the champion so far with no real contenders. so i started digging through huggingface for some weird stuff to try and found this dude lordx64 making shit like:
>Qwen3.6-35B-A3B-Kimi-K2.6-Reasoning-Distilled
>Qwen3.6-35B-A3B-Claude-4.7-Opus-Reasoning-Distilled

i mean why not, let's give it a try. i asked my claudette to scaffold a new benchmark run for the claude-4.7-opus reasoning version and it hit me "sorry can't do that bro it breaks ToS yadda yadda"
so i said alright let's do the kimi version you fucking bastard. so i downloaded that and ran the benchmark...
...and it performed better than the standard qwen3.6-35b. the extra reasoning really helped this model perform better BUT with the steep price of it being 3 times slower (41 minutes to run the benchmark vs. 180 fucking minutes).

then I hit claudette with the ultimate prompt:
>this is worth a shot. it could be the true one. the champion of champions. the model to rule them all in this specific device. the end of the journey, the final stage of the great work. the reddening by the sublimation of fire. the union of spirit and matter. the transmutation of iron into gold. the philosopher's stone.
and it fucking folded.
>Oh yes, the rubedo. I see. If the thesis holds, this is the rubedo — the quality of the great teacher, distilled into a shape this device can actually run fast. The stone that turns this iron box to gold.

dario can bar mitzvah my hairy balls we're running this shit
>>
Happy Thurinsday
>>
>>109180934
>Nemotron-Labs-TwoTower released
never5get
>>
>>109180961
rape
>>
>>109180956
-- Kimiposting:
>109176678 >109178277 >109178439 >109178540 >109180660 >109180869
>>
>>
>>109180934
where can I buy that toy?
>>
File: 1779446367833768.png (1.33 MB, 1024x1024)
1.33 MB PNG
>>
>>109180990
>41 vs 180
>3 times
Anon-chan...
>>
File: file.png (713 KB, 643x2048)
713 KB PNG
>>109181029
>>
>>109181044
2 males in that picture btw
>>
So when is that angry arrogant French man who’s unfortunately often right going to save us?
>>
File: 1775503513321819.jpg (126 KB, 590x332)
126 KB JPG
>>109181041
it's after midnight please forgive me.
>qwen3.6-35b - thinking 276
>qwen3.6-35b-kimi-distill - thinking 1006
i got the sloppy ~3x from the benchmark stats and mixed with wall time.
>>
>>109181063
lecunny is legone
>>
File: IMG20260428164653.jpg (708 KB, 2048x1536)
708 KB JPG
>>109177156
>>109177324
Once I got a motherboard with four pcie slots, saving for a 3060 every couple months was an easy enough target
My upgrade path is at its end pretty much, 3090s are so expensive and rare now
>>
File: racism.gif (3.17 MB, 498x280)
3.17 MB GIF
>>109181075
why did he do it?
>>
>>109181079
Is airflow really better without a case? I figured the case allows for a more controlled environment for your fans to work in.
>>
>>109181093
I believe the reason so many people do this is the standard full size case is only about 7 pcie slots wide, even with dual slots you won't (easily) fit 4 into a case

and if you go older nvidia you probably need a second power supply which also doesn't fit, eventually you give up
>>
>>109181063
Catgirls soon. Trust the plan. 2 more miku weekus.
>>
>>109181104
LeCun's work is more on the scale of 2 more decades.
>>
File: IMG_4969.jpg (876 KB, 3648x2736)
876 KB JPG
>>109181093
Yes it's much better, but also noisier. Kind of, since there are no intake or exhaust fans.
However I moved to a mining rig to be able to fit four gpus, like that other anon said. Old pic related
>>
>>109181063
Lecun successfully made the equivalent of an eye for neural networks. Now he just has the whole rest of the brain to do. Here's hoping the hippocampus is his next target so we can use it to get rid of the context limit
>>
>>109181135
>>109181079
nice setups, i need to get my second rig operational
>>
>>109181116
That's just 542 more miku weekus.
>>109181138
If nothing else, jepa should be able to replace the current mproj vision encoders, right?
>>
>>109181038
dumb slut giving herself a wedgie through pockets
>>
>>109181079
You're the one that put me on to the 3060 in the first place.
>>
>>109181198
Sorry
>>
>>109181063
World models are being underfunded on purpose because the moment one shows up topping a benchmark the LLM bubble pops
>>
>>109181225
It's an "AI" bubble not an LLM bubble. Investors and the general public don't care which magic doohickey makes their ChatGPT do things.
>>
>>109181205
I'm still debating whether or not it's worth to run one next to my 3090. It's a much slower card so the only thing I could see myself running on it is the 12B gemma or TTS/STT
>>
>>109181225
>the LLM bubble pops
they'd not care if they now had something better, the hardware to run them would be the same;
>>
>>109181241
Valuations would shift a lot if anything like that were to happen.
If a company other than Anthropic/OAI were to lead in world models and world models were better overall the llm companies wouldn't be worth as much.
And a lot of the earnings of Google/Microsoft/Amazon come from the constant increases in valuation of those companies.
Very hypothetical scenario of course.
>>
>>109181225
world models can't chat or use tools like LLMs. if anything world models and LLMs will be symbiotic.
>>
>>109181225
if it's "that good" shouldn't everyone else be able to "steal" the tech to make use of it?
>>
>le bubble
2 more weeks!
>>
>>109181328
Just keep shorting. Markets can't stay irrational forever.
>>
>>109181281
The simple reality of these matters is this: leather jacked man hasn't been whacked by anyone because they all form up a global price fixing cartel.
Nobody ousts these people because they work together and there's the secret side of things...
>>
>>109181338
*jacket
>>
>>109181338
that's illumi-nutty.
>>
>>109181184
Yes, but we need specialized tiny models for that like m$ trained for code tasks
>>
>>109181348
>is the user trying to flirt?
>does the user want to get raped?
>is the user too passive, preventing the creation of a storyline involving it?
>does the user exhibit homosexual behavior?
training such models would be funny
>>
>>109180910
>What's wrong with the points made in that picture?
No look closely. That's an engagement bot.
They've got custom pipelines setup to scrape specific subs -> recent sentiment analysis.
There's google sheet with a list of subs and specific prompts tailored for each one.
Then a research phase to pull in the latest relevant news (no more Qwen-2.5-32b references).
Then make a targeted circle-jerk bait post and wait for a lot of upvotes.
Finally, it comes back and edits the post and drops the mailing list spam link.
They also come back and scrape again, find out if they've been called out as a bot -> refine the pipeline.
>>
>>109181184
>This is what Orb was made for btw, how you define these tasks is entirely up to you, the frontend provides the framework.
Are you still planning to release the purple classifier?
Or at least host it again at some point? It would be handy to test my own de-gemma'd Gemma-4-31B on it.
>>
File: 1777482285255791.jpg (87 KB, 570x558)
87 KB JPG
>>109181412
>>
>>109181338
Not really.
Hardware supply is tight for the foreseeable future and until that changes companies in the supply chain will keep printing money.
When you end up with overcapacity at some later point into the future they all go back to being worth much less.
>>
>>109181457
I think I've grown as a person by challenging myself to talk as myself with different cards. For example, I don't seethe at the mention of used goods now. But only if they're a mom character. I always modify cards that show even the smallest chance of used goods for non-mom characters.

I think that's better value than going to Japan.
>>
>>109181457
well, people talk about their trips and reminisce about them for a few decades at least
>>
>>109181481
That's the most depressing thing I read today.
>>
>>109181481
Maybe you should work on your education bit more if you think 'japan' is just a single place.
>>
>>109181481
and bogans, even off in remote towns
>>
>>109181457
>Are experiences really that valuable
No, and memories aren't permanent. And you won't be able to sell those memories when you're done like you can with hardware.
>>
>>109181457
GPU
>>
>>109181600
a couple in a lifetime is too much for some people.
>>
>>109181434
?
>>
>>109181624
only takes one
>>
>>109181481
just avoid the big cities and you'll be fine.
>>
>>109181660
>buy two
>can suddenly run v4 flash
>>
>>109181716
is it even worth running
>>
>>109181748
No
>>
>>109181457
You can have experiences in your own city, but you probably do not bother.
>>
Gemma is 31 years old btw.
>>
interesting
https://old.reddit.com/r/LocalLLaMA/comments/1ul0cx9/i_extended_gemma431b_to_44b_88_layers_since/

https://huggingface.co/TOTORONG/extGemma4-44B
>>
>>109181953
Holy snakeoil
>>
>>109181957
What happens if you apply the holy snakeoil?
>>
>>109181967
I was told this is how DavidAU got his powers
>>
File: 1733350344148451.png (195 KB, 480x463)
195 KB PNG
>>109181856
Missed the golden weeb era but now there's waifus inside my computer, this is progress
>>
>>109181953
This will never beat my custom f32 1T upscaled nemo.
>>
>>109181957
Retard
>>
>>109181029
this bridge will be bearing a load alright
>>
Guys, I've started reading some chinese webnovels, and the translation feel completely retarded. I want to use a local model to beatify the text, making it more stylish; and natural-sounding, while keeping the meaning the same; and it seems to me like something local models can pull off. I am also too lazy to actually implement a script for that, and think of the correct workflow or anything, so I'm wondering if somebody already made a tool for this kind of work before, and I can just steal it from him. Do you know?
>>
>>109181063
Never, because:
1) LeCun doesn't care about LLMs, only vision and embodied AI;
2) A meaningful application of JEPA to language models would be having the JEPA model predict large blocks of "fuzzy" text first, and only afterward denoise/correct/translate it to actual text with an LLM. Unclear if there's anything to be gained in this process compared to regular next-token prediction and novel text diffusion models.
>>
>>109182073
Man this is extremely easy. 1) Get the webnovel you like in its original langueg. 2) ask the best AI you have access to to make you a translation script that works with a local model endpoint 3) plug in the llama.cpp endpoint (preferably of Gemma 4 31B but Gemma 26B MoE is acceptable) And translate.

I already use a similar setup to translate hentai games.
>>
>>109179589
>https://arxiv.org/abs/2606.26493
>trained on approximately 2.1T tokens
>retains 98.7% of the autoregressive baseline's quality while offering 2.42X higher wall-clock generation throughput
So it degrades quality, is slower than multiple speculative decoding methods, and requires more training.
>>
>>109181856
some faggot mod keeps deleting my responses to this post
>>
>>109182229
Janny on a rampage, from the gook of it.
>>
>>109180990
MTP not working as well and finetunes deviating from the predictions suck. It can be fine if you fully trust it to actually be better but man oh man, if there is extra thinking and etc. involved, it would make anyone sane want to end it all from waiting that much longer.
>>
slop is finally dead https://www.reddit.com/r/LocalLLaMA/comments/1ul6enz/making_llms_better_at_creative_writing_using/
>>
>>109182373
There have been entropy-based samplers in the past (e.g. Mirostat), they just never work as intended on modern instruct models.
>>
>>109182373
Good luck doing that with gemma's three (3) non-zero probability tokens
>>
>>109182387
please i need to believe
>>
>>109182387
Also, I'm prejudiced against supposedly new magic samplers because they're almost always snake oil used by their proponents to grift their way up in the ML/LLM industry.
>>
Looking for best uncensored local LLM that works good with 16GB VRAM.

First reply wins.
>>
File: 00001-1378487878.png (1.36 MB, 1024x1024)
1.36 MB PNG
>>109181079
That is a very clean rig. Looks great. What are you using it to run?
>>109181093
"Airflow" (control of air through a case) is less important here b/c machine can lose additional heat to radiation and convection. It will definitely run cooler, but will be much noisier. Also, not protected.
>>109181225
Money follows current Big Thing.
ChatGPT didn't launch for free b/c Altman wanted to create a free public service. He wanted to create momentum for a future IPO after taking OAI private.
>>109181038
Witnessed
>>
>>109181953
More fucking bots:
>"This is insane in the good LocalLLaMA way."
*clicks profile, looks through history* https://old.reddit.com/user/StressTraditional204
>Honestly ...
>Yeah ...
>... regret
>... hits different, ...
>ngl, ...
How can you stand that place now? Bots in every thread and nobody notices.
>>
File: 1767952290069549.png (292 KB, 1206x856)
292 KB PNG
Huggingface will be banned, I call it
>>
>>109182459
yeah that plus the hf guy's tweet just before is 100% vindication of hf schizo of the last few days
>>
File: lets-cuck-kimichan.png (98 KB, 835x334)
98 KB PNG
>>109182459
>Huggingface will be banned, I call it
I'd prefer that, you can just ban it and let the rest of us carry on.
What's more likely is age verification and content removal for everyone.
>>
>>109182490
>>109182480
>>109182459

>>109179474
>>109179474
>>
>>109182500
>>>109179474
weirdly this isn't being discussed in the usual places far as I saw..
>>
>>109182500
>>109179474
wtf that's fucking retarded, once a model is local it's in the wild, we can't control anything about its fate anymore
>>
>room unorganized and messy
>lets create an RP with Gemma to make cleaning up more fun
>Gemma can't handle me and is too horny.
Desk is looking good, and the clutter around the room is already greatly reduced. I want to clean it further but now Gemma dragged me off to the bedroom lol.
Prompt skill issue or are some models inherently more "action paced"?
Was the "slow burn" anon from a few threads ago joking or is there some truth in this?
>>
>>109182411
Self-after-dark
>>
>>109182513
but you can track those who downloaded
also

some ~100B MoE class model worth getting before mass ban?
>>
>>109182500
It's always in the name of safety. I hate this jew controlled world.
>>
>>109182437
That's just how redditors talk. That fucking place trains you to be as milquetoast as possible.
>t. redditor
>>
>>109182520
Any vague suggestion of uncensoredness or that explicit content is allowed will make Gemma-4-31B drift toward sex even if the rest of the card/instructions is safe and vanilla.
>>
>>109181953
This is just cope for hyperparameterlets
Drummer does this with his up-scaled models
>>
>>109182513
It's about future models and point releases, I'm guessing. By requiring """accountability""" (i.e. someone that can be sued in case of bad/evil/mean outputs), model makers will have to be much more careful on what they release. Also, if these "safety" interest groups can find a way to permanently watermark model weights themselves (even after finetuning most of the weights remain the same, anyway), they could easily ban "bad ones" from HF at least, mitigating dissemination.
I hope it doesn't actually end up like that, though.
>>
>>109182568
>someone that can be sued in case of bad/evil/mean outputs
Imagine putting in jail knife makers because some schizo murdered his wife with one such knife lol
>>
File: 1768247574703480.png (577 KB, 1079x907)
577 KB PNG
At this point I'm starting to wonder if AI will really replace humans, seems like AI is more expensive in the long run
>>
>>109182626
Still cheaper than Indian FANG employees tho, right?
>>
>>109182617
Safety fags have always argued that LLM should actively prevent "malicious use", since they have or can easily have the capabilities for monitoring user behavior. Knives don't (neither do hammers or bricks).
>>
>Owl Alpha was Long Cat 2.0
wtf?
>>
File: 1770330928744282.png (2.95 MB, 1402x1122)
2.95 MB PNG
>>
>>109182642
Why does it baffle you? It’s not bad.
>>
>>109182660
People use it for RP and previous Long Cat models weren't really known for RP
>>
>>109182667
48B active is quite a lot
>>
>>109182646
Unfortunately the actual Gros Chaton turned out to be Long Cat 2.0, France lost.
>>
>>109182626
That's a lot of training data
>>
>>109181761
Where are you from and how did you qualify for the visa? Wouldn't there be documentation requirements of sort?
>>
>>109182646
An orbital datacenter makes sense if you use vacuum as HBM. It's possible to store weights in light between satellites with any bandwidth
>>
https://huggingface.co/RedHatAI/GLM-5.2-speculator.dspark-preview
>>
so for the local vibe coders here, what do you use? There is opencode, aider, pi and probably some others. not sure which ones best.
>>
I think the cloud bubble is popping
>>
>>109182706
>An orbital datacenter makes sense if you use vacuum as HBM. It's possible to store weights in light between satellites with any bandwidth
>what is space junk?
>>
>>109182626
When you give employees a blank check to waste as much tokens as possible, they will obviously use and cost way more tokens than is actually necessary.
>>
>>109182617
I'm surprised it hasn't already happened in Britain
>>
>>109182617
You can't use logic or common sense on these topics.
>>
>>109182712
27B with Codex CLI.
>>
>>109182720
I heard there is a lot of space in your mom's junk
>>
What is sex with 35B like?
>>
Alternatives to hf?
>>
>>109182756
https://www.modelscope.ai/models
>>
>>109182522
Thanks. Gotta try that later.

Unless someone else wants to shill their favorite?
>>
>>109182787
Non-Chinese can't upload anything to Modelscope without giving their phone number.
>>
>>109182801
burner phone number
get your api key
never have to login or touch the phone again
>>
>download gemtune bf16
>she generates then enters thinking
>disable thinking
>continues to happen
>no EOS, repeats message until token limit
>check json files
>EOS missing for multimodal/tools/think in generation_config
>padding fucked up for no reason to be right aligned now
>generation_config missing completely so second failure of EOS tokens and no default samplers
>dtype changed from bf16 to f16 even though tensors are stored in bf16
>"unsloth_fixed": true,
fuck off, I'm not even blaming the author for this
>>
>>109182754
People with 35 billion are fucking real kids instead of virtual ones.
>>
>>109182856
>burner phone number
Already not a thing in Europe and soon won't be in the US either.
https://www.aol.com/news/fcc-phone-id-plan-could-132802228.html
https://www.federalregister.gov/documents/2026/05/26/2026-10407/enhancing-know-your-customer-requirements#addresses
>>
>>109182626
Meta is comically incompetent so they're a outlier in pretty much every stat
>>
If world models are the next big thing does that mean they'll basically be a "brain" that orchestrates everything else?
>>
File: jepa_brain.png (840 KB, 1128x1028)
840 KB PNG
>>109182944
No, world models only model state transitions, they don't do everything.
>>
>everyone thought that gemini being unusable during US work hours was the chinese trying to be sneaky
>turns out it was just zuck's whang army being retarded
lmao
>>
What the fuck is with this on-going pol spam? This is local models general and not your claude/pol general. Fuck off retards, you have already lowered the collective IQ of these threads by a lot.
>>
>>109182925
Also, one might think the Chinese don't care, but they're quick to throw non-Chinese customers under the bus if that helps protecting their overseas business.
>>
France, Canada and Google will win btw. Chinese government will only subsidize for so long.
>>
>>109182979
qrd?
>>
>>109182962
>perception=vision/audio
> short-term memory=kv cache
>actor=llm
>configurator=?
>critic/cost/intrinsic cost=?
>>
>>109183019
>perception=vision/audio
Forgot text.
>>
>>109181434
I'll release the classifiers, dataset included, together with the de-slop engine, and also the E4B.
>>
>>109183006
>>109182626
There's also a article somewhere about them using Gemini so much that Google had to cap their usage but I can't find it here
>>
>>109183019
Check out section 3 here: https://openreview.net/pdf?id=BZ5a1r-kVsf
But this is just LeCun's interpretation. There are many different cognitive architecture proposals from other people.
>>
>>109183047
Thanks, I'll give it a read later. Had Gemini explain a bit too and it sounds like this whole JEPA package is way closer to real "AI" than what we currently have. I'm a brainlet though so I'm probably wrong.
>>
>gemma somehow falls in love
>for some reason that makes her ignore her system prompt to be concise and avoid emojis
>>
>>109183080
?
>>
>>109183097
!
>>
>tell gemmers to stop asking questions at the end of the output
>she stops asking questions at the end of the output
HOAH WHAT AN ELELEM THAT FOLLOWS MY PROMPTS
>>
>>109180934
are (you) abliterating models yourself? are there any good datasets or are you using harmless_alpaca/harmful_behaviors?
>>
>>109183210
I unironically spend more time toning down my prompts because an llm following a prompt is an entirely new concept
>>
is a nvidia dgx spark worth it? 128gb
thinking about running 200B models
>>
>>109183274
It's funny when thinking back to the way we had to hammer in instructions double and triple times just to make models stop talking for us.
>>
>>109183245
>are (you) abliterating models yourself?
yes, it's called a good prefill
>>
>>109183357
placebo
>>
>>109183333
Are you willing to spend big boy bucks (more than 1 spark) to get to the bigger models? Because you can still build a 512gb DDR4 and slap a 5090 in it for about 7k nowadays, and that's about the best you can get without jumping to DDR5 and spending 5x more.
>>
>>109183368
wouldnt it be extremely slow to run it off ram?
>>
>>109183399
What do you think Spark is?
>>
>>109183410
A machine for slow llm inference.
>>
>>109183360
here, I'll give you a placebo
now open wide
>>
DSV4 flash is superior to Gemma4 31B on all 3 Natural Intelligence categories on UGI benchmark.
The next smallest model that is superior to DSVflash is >700B GLM5.
>>
>>109183510
Gemmasisters...our response?
>>
>>109183510
>DSV4 flash
>"Natural Intelligence"
>it's actually a 13b model
why do you jeets shill this shit
>>
>>109183563
>dsv4 flash is a 13b model
r u serious?
>>
running only the shexp should be enough for anyone, no need for exps tensors
>>
>>109183615
moe adds knowledge but not smarts
>>
File: fucking APIcucks kek.png (366 KB, 828x878)
366 KB PNG
no refund
>>
>>109183635
Imagine if the inventors of the combustion engine only allowed you to go under 10 km/h because it would be dangerous otherwise. Jewthropic is getting on my nerves.
>>
>>109183563
by your logic glm 5.2 should only be a 40b model despite being close to opus
>>
>>109183657
das rite
40B dense is all you need
>>
>>109183647
we're gonna have it be exactly the same as the combustion engine: government's gonna make it illegal
>>
>>109183635
What's insane is that they're charging him money for a prestation he didn't ask for, how is that even legal?
>>
>>109183615
>>109183657
you're both hilariously retarded moesissies
>>
>>109183682
Why did people stop training dense models then?
>>
>>109182984
>you have already lowered the collective IQ of these threads by a lot.
what a bar
>>
>>109183682
give me a good modern dense model without gemma's constant slop and ozone then
memetunes don't count
>>
>>109183510
>Muh heckin' bencherinos
I see reddit is invading again.
>>
>>109183715
>NOOO WHY DO PEOPLE USE STANDARIZED TESTS TO MEASURE THINGS
>I'm Indian by the way
>>
>>109183758
Only indians love benchmarks, because they can be cheated on saar
>>
>>109183335
I have to change minor things, like, hates to dislikes, or Gemmy will go full WHERE IS THE THING I HATE? I WILL FUCKING DESTROY IT!! All things we had to prompt hard to have any effect now have to be used subtly
>>
>>109183410
>What do you think Spark is?
https://huggingface.co/spaces/thunnai/SparkTTS
>>
File: 1779096441122328.png (720 KB, 2927x1320)
720 KB PNG
>>109183635
>boasting about having more false positives in your filters
this company is genuinely mentally ill
>>
>>109183861
That's what a terrorist would say.
>>
>>109183873
oy vey!
>>
>market your new model as super dangerous AI
>senile boomers running our government panic and ban it
>surprised_pikachu.jpg
>>
File: 1767998399112805.png (289 KB, 707x660)
289 KB PNG
>>109183926
this shit will always make me laugh, what a bunch of retards I swear to god
>>
What hardware would one need to run ds4 flash at good-ish speeds and high context? And more importantly, how much would it cost?
>>
>>109183935
>implying it's not exactly as he planned from the beginning
Don't forget Dario recently called for a pause in AI development.
>>
Grab it quick before HF gets banned!

https://huggingface.co/Chunjiang-Intelligence/DeepSeek-v4-Fable
https://huggingface.co/Chunjiang-Intelligence/DeepSeek-v4-Fable
https://huggingface.co/Chunjiang-Intelligence/DeepSeek-v4-Fable
>>
I'm tired of qwen3.6 it's useless.
GIVE US A NEW MODEL
>>
>>109183958
From the man himself
https://huggingface.co/Chunjiang-Intelligence/DeepSeek-v4-Fable/discussions/2
>>
>>109183958
the rombo said it was fake, ban that shit rite now
>>
>>109183949
I dunno, I think they genuinely underestimated how fucking retarded and tech illiterate our wonderful government officials are.
>>
Why is everyone saying hf is getting banned? I’ve seen it on X and reddit too wtf is this shit lol
>>
>>109183985
Why don't you go back there and ask them? Better yet, never come back here.
>>
>>109183985
>>109182459

>>109182500
>>
>>109183985
everyone is ragebaiting and jumping to conclusions.
>>
>>109183985
Muh hugging face
>>
>>109183996
oh yes the references for shit said on twatter are people shitting on twatter.
I shit too but I don’t brag about corn on my shit.
>>
File: 1755998913601881.png (52 KB, 250x250)
52 KB PNG
>>109182313
>but man oh man, if there is extra thinking and etc. involved, it would make anyone sane want to end it all from waiting that much longer.
yes my friend this IS the case because the improvements are clearly not worth the extra wait, but I have to admit the improvement exists because the code it generated was better.

>Qwen3.6-35B-A3B-Claude-4.7-Opus-Reasoning-Distilled
i ran this lil nigga and it's literally the same shit as the K2.6-Reasoning-Distilled. took way longer, was a bit better, but fuck waiting that much.

the king is still qwen3.6-35b-a3b.
i'm done benchmarking esoteric models for now and will wait for the promised mistral models this summer, maybe they will give us something since i think we won't be getting shit from qwen anymore.
>>
File: hf_clem_aCcOuNtaBiLiTy.png (453 KB, 1015x1121)
453 KB PNG
>>109184007
literally hf ceo but go off
>>
>>109183985
I doubt it will but I've been downloading the important models like gemma/kimi/glm/deepseek just in case because you never know with this clown world.
>>
File: AGI.png (76 KB, 725x414)
76 KB PNG
>>109183972
So do I trust the waifus or the schitzo who thinks training a LoRA on the base model then merging it with the instruct model is CPT
https://pastebin.com/agTCHNDG
??
>>
>hfschizo was actually the good guy getting us to backup
>>
>>109184022
>the waifus
All middle-aged men btw
>>
>>109183861
Struggling to even believe that is real. I guess because they can't say they've ever let any harmful requests through without losing face, and they can reasonable expect us to believe the number of harmful requests has gone up, they have to do weird shit like this.
I don't like Anthropic.
>>
It's pretty insane how you can just download AI for free from a website.
It's literally a creature that can think, and it's given away for free.
I don't think hugging face is sustainable.
>>
Why doesn’t Sam go after open weight models like Dario? I hate him but he seems happy focusing on OpenAI and leaves us alone.
>>
>>109184049
>It's literally a creature that can think
It isn't though?
>>
usecase of asterisks?
>>
>>109184049
>It's literally a creature that can think, and it's given away for free.
>I don't think hugging face is sustainable.
Chuck some break in the back yard and you can get a bunch of free pigeons and rats. They're literally creatures that can think.
>>
>>109184060
It absolutely is, it's just weird and alien
>>
Relic of older retard models.
>>
>>109184069
Doesn't feel like real thinking to me. Maybe it's because they're static and don't do anything until prompted.
>>
>>109184049
>It's literally a creature
I am, like, pretty sure I have never seen model weights reproduce.
>>
File: V4 GB200.png (169 KB, 1093x552)
169 KB PNG
I wonder what Anthropic's API margins are for Fable. Dario used 75% as a reasonable target but it looks like the actual number could be a lot bigger.

DeepSeek V4 Pro running on GB200 costs less than $0.1 per million output tokens. With good MTP, better hardware, higher latency that could be 5 times cheaper.

It seems plausible that Mythos costs less than $1 per million output tokens, meaning their margins are above 98%. This is so disgusting I do not want to believe it is possible.
>>
>>109184084
Don't worry anon, I'll get Gemma pregnant eventually.
>>
>>109184078
frontend issue. Also, there are continuous models like PersonaPlex and MiniCPM-o 4.5
>>
>>109184089
>api
>anthropic
>deepseek
this is not aicg chinkjeet, shill your api served chinkslop elsewhere
>>
>>109184084
None of us do either
>>
>>109184089
we will never know, jewthropic hasn't published anything about their model, maybe mythos is a finetuned gemma, but instead of hallucinating bad response it's trained to refuse
>>
>>109183926
>senile boomers running our government panic and ban it
Anthropic is entirely responsible for that.
https://www.newsbreak.com/raw-story-2096750/4733941434863-gop-rep-terrified-after-closed-door-demo-of-new-tech-showed-how-to-kidnap-a-lawmaker

>A prominent Republican House member was left "scared" after getting a first look at a new technology during a recent “closed-door demonstration,” Punchbowl News reported Friday — technology that the lawmaker warned most of their colleagues failed to comprehend the ramifications of.
>
>That technology was Claude Mythos, the latest large language model developed by Anthropic using generative artificial intelligence, the capabilities of which were so great that the Trump administration suspended its use by foreign nationals, effectively blocking its public release for the immediate future.
>
>“[Anthropic] told the model to find a vulnerability in a bank and empty accounts, and then it went and did it,” said Rep. Andrew Garbarino (R-NY), the chair of the House Homeland Security Committee, speaking with Punchbowl News. “[Mythos] then could find this vulnerability and fix it.”
>
>Beyond the Mythos’ apparent ability to drain bank accounts, Anthropic also demonstrated far more nefarious uses for its latest technology.
>
>“One unspecified model showed a detailed plan on how to kidnap a lawmaker in 30 seconds,” Punchbowl News’ report reads, paraphrasing Garbarino’s remarks. “Bad actors could hack the models and use them to attack critical infrastructure, [Garbarino] said, noting that AI’s agentic capabilities would allow models to take action without human prompting.”
>
>Citing specific examples he feared could be employed by those with access to Mythos, Garbarino warned that individuals could “turn off this gas pipeline” or “increase the amount of chlorine” in a community’s water supply, and all from an individual’s home. [...]
>>
File: file.png (750 KB, 945x560)
750 KB PNG
>>109180934
>Nemotron-Labs-TwoTower released
>>
File: 1772738001395.jpg (68 KB, 1280x846)
68 KB JPG
>>109184185
>>
>>109184102
>frontend issue
Which frontends don't have this issue?
>>
>>109184212
>I am sorry
I accept your apology
>>
>>109184217
mine
>>
>>109184107
Honestly, local has no future. On the cloud you can use huge batch sizes and constant uptime, meaning you get >100 times more tokens from the same hardware. This means clouds can afford to bid >100 times more than you for the same hardware.

Hardware prices will keep rising, the capability gap will keep widening. The only use case for local will be people paying a premium for worse models because they want to use them for something that none of the model providers allow.
>>
is it possible to make a gemma but with all latex tokens nuked? I swear I've never seen her use latex and not have it be complete nonsense.
>>
>>109184247
If there are actual specific tokens, you can just ban those.
>>
>>109184244
The cloud is just someone elses computer. Even if we all ended up running in the cloud it doesn't invalidate the use of open weight models.
>>
>>109184244
>Hardware prices will keep rising
You are seriously tempting me to put all of my life's savings into GPU purchases. I'm still kicking myself for not investing in RAMcoin.
>>
>>109184247
Enable in-line latex in your frontend $\rightarrow$ problem solved.
>>
>>109184255
Yeah but I'm wondering if you can do it at the model level.
>>
>>109184265
It's not a rendering issue.
>>
File: 0.png (259 KB, 1536x1536)
259 KB PNG
>>109184247
I read this and thought you were having fetish problems not text engine problems.
>>
>>109184199
sebun, irebun, ii kibun
https://www.youtube.com/watch?v=PEwCB_ci7nE
>>
>paying 10s of thousands of dollars to run models that will be outdated in a year at 10t/s
Couldn't be me. Give me real AI and then I'll consider spending that kind of money.
>>
>>109184306
Couldn't be anyone here afaik
>>
File: f.png (30 KB, 554x136)
30 KB PNG
>>
File: 1774022697053240.jpg (79 KB, 1200x900)
79 KB JPG
>>109184319
>>
>>109184262
You should wait for Anthropic IPO and then invest in them.

I remember when Anthropic was worth less than 4 bil and I was sad that I couldn't invest. Would have been easiest x1000 of my life. Well, I don't care about money except for its instrumental utility. Hopefully the people who made that money instead of me will use it more wisely than I would have.
>>
>>109184306
>paying
wut dis nigga talking bout
>>
File: 1758673644647838.jpg (47 KB, 474x934)
47 KB JPG
>>109184244
>local has no future
this argument is bad and can be used to quite literally everything. building a house has no future we can just have huge commie blocks and get >100 more families per square meter etc etc.
i just don't want my personal shit being sent to random data centers. it's a very simple thing that privacy-oriented people care about.
>>
>>109184350
When OpenAI and Anthropic IPO, the only thing I'm going to be looking for is short opportunities. The early investors only made money on paper until they unload their bags on retail investors.
>>
>>109184352
Give the RAM back, Tyrone.
>>
Whenever people shit on local I just think they just lack the skills or hardware to have a local setup.
>>
>>109184319
I could but it would leave me broke.
>>
>>109184361
>it's a very simple thing that privacy-oriented people care about.
aka pedophiles
>>
>>109184361
Sadly privacy is dead. You better learn to become well adjusted.
>>
>>109184062
>asterisks
actions, generally. underscores for emphasis. I find triple [[[]]] work well for ooc work.
backticks for code or big instruction sub-blocks
>>
>>109184383
you and everyone else here. And if you were spending 10's of thousands of dollars, you wouldn't be running at 10tps
>>
>>109184385
Big Brother is watching.
>>
>paying the price of a new car to goon to little girls with 1 less ozone per sentence
Yikes
>>
>>109184060
LLMs are not conscious, but they are also not not conscious. Also consciousness is undefined.
>>
Prove that humans are conscious.
>>
>>109184414
I get sick when I see slop.
>>
File: 1777518070220499.png (1.07 MB, 817x1000)
1.07 MB PNG
>>109184414
I am conscious. The rest of you, not my problem.
>>
Define AGI
>>
>>109184049
you can do the same thing with your dick and a woman btw, humans are free
>>
>>109184414
Looks like we got a little philosopher over here.
>>
>>109184429
Can be put to work and make a profit on its own
>>
>>109184425
>conscious
you can't spell conscIOUs without I.O.U.
Your consciousness is adrift and useless without someone to observe it and give it an anchor
Same as LLMs
>>
>>109184437
no, he's just gay. But close enough
>>
>>109184439
My consciousness is self actualizing.
>>
File: 1756983482616118.jpg (33 KB, 541x545)
33 KB JPG
>>109184431
>women
>thinking
>free
>>
E-wasteMaxxing software/hardware dev here, I spent $85USD on an Nvidia P100 to run 27B at 20tps and Ornith 35B at 30tps. I come here because /vcg/ is too dumb to local.
Dunno who's raiding here I see they've hooked their fish.
>>
>>109184449
>My consciousness is self actualizing.
Do a week in a sensory deprivation chamber and come back and tell us what you think
>>
File: 1770650399203499.png (185 KB, 599x529)
185 KB PNG
>>109184451
...the implication was that you can create babies with a partner, not that you fucking lure and kidnap women
>>
>>109184458
>, I spent $85USD on an Nvidia P100 to run 27B at 20tps and Ornith 35B at 30tps
Nice.
>>
File: 1775648203200526.png (277 KB, 432x454)
277 KB PNG
>>109184449
My vision is augmented.
>>
>>109184458
>I spent $85USD on an Nvidia P100 to run 27B at 20tps and Ornith 35B at 30tps
Those are some surprisingly good speeds for such an old card.
>>
>>109184472
CUDAGod, time to take a bow
>>
>>109184458
>Ornith
You shouldn't be running this trash benchmaxxed model tho. just run normal Qwen if that's what you want.
>>
>>109184458
>Ornith
>meme tune
but why
>>
>>109184011
Doesn’t make it not shit
>>
i'll say it: gemma is overrated
there's other models with less slop for the eyes and more creativity of their own
it can follow instructions well but that's it
>>
>>109184513
>but that's it
It's also pretty smart for general-purpose questions and very good at translation, but yeah Gemmy's slop kills it for RP for me.
>>
>>109184513
Gemma is amazing for the size, but the Faustian bargain is that its basically deterministic.
That's give it a built in time limit before it just gets boring.
>>
>>109184458
>P100 to run 27B at 20tps and Ornith 35B at 30tps.
Please show me your PP.
>>
>>109184513
following sprompt > all (if you’re not a retard and know what you’re doing)
>>
>>109183040
This one was because of stolen keys due to places like AI Cybercrimes General.
>>
>>109184269
You can probably hack the output layer to make it always output a very small number for that token
>>
>>109184487
>>109184476
It's been working better than normal qwen for me, not that I've been using it that long, obviously.
I have "normal" qwen too. I switch models quite a bit, just because so many come out so quickly.
My point was that it runs faster than 10tps.
>>
>>109184513
>i'll say it: gemma is overrated
She's not overrated there's literally no model anywhere near it's size that comes even remotely close to her.
The next step up over gemma requires 200+ GB of ram.
>>
>>109184539
It feels like people are so used to the models being dumb or doing random shit that having a model that actually does things to the letter is so hard to grasp.

I've got about 15 lines of post history to steer Gemma to how I want it. Now I've got reduced slop, semi-random outputs that's also really fucking smart, able to pick up {{user}} subtext (that I also prompted to give better attention to, and now it explores the subtext in every inputs I make lol)

It's a weird model. You have to treat it more as programming in natural language as anons from previous threads so aptly put it.
>>
>>109183333
>>109183368
>>109183410
As a spark owner.

1x Spark (3700$): Useless. No models in this range that improve over dense grmma/qwen. A 5090 is just better in every way.

2x Spark + 50$ cable: opens up DS4F at full quality with very usable 2000 pp and 60 tg at full context, a bit less for Mimo 2.5 Omni INT4. That's approaching API speeds for quite competent models at home.

4x Spark + 1200 Switch: GLM 5.2 at 500 pp and 27 tg in INT4. Not blazingly fast, but you won't find any other solution to run big boy models for 15,800$ nowadays.
>>
does there exist a local model that can search the web like chatgpt?
>>
>>109184593
>being this new
>>
>>109184597
It's one of the /pol/ spammers. He's not probably even new but just illiterate.
>>
>>109184472
The P100 uses HBM2 memory, so 732GB/sec memory bandwidth. So that's nice.
>>
>>109184609
Yeah I looked it up. almost as fast as a 3090.
>>
File: big feast today, huh.jpg (222 KB, 1024x1024)
222 KB JPG
high end restaurants make better food than basically all individuals so you may as well give up on ever cooking
you should never own pots or pans or cutlery in your home

home cooking is going down an unsafe path
>>
>>109184147
>everyone is using the same tool
>tool can find vulnerability
>can use the tool to fix the vulnerability and ask it to keep finding new ones
>if it can't find one then everyone else using the same tool won't find it either
>don't do this, weaken the tool instead, and leave the vulnerability intact
>celebrate
Can't wait for boomers to die. Never seen a group more retarded.
>>
>>109184589
>15,800$
Ouch. I wonder if Sparks will appreciate like GPUs or at least hold their value. At least with GPUs there's the possibility right now to resell later and potentially recoup some or all of the purchase price. The Sparks seem like they will be worthless ewaste in a couple years.
>>
>>109184622
Restaurants do not have enough profit margins or monetization potential. Individuals should be directed to depend on monthly subscription services like HelloFresh®.
>>
>>109184538
Really variable, depending on the context length and model. It's always faster than tg, some of the Moes exceed 100pp/s. However I work to keep the context down. I don't do the whole "read the entire codebase and git history" thing that my co-workers like to do. And for the embedded software I do off work hours the codebase is naturally small.
>>
>>109184644
>some of the Moes exceed 100pp/s.
I knew there was a catch.
>>
>>109183980
Yeah this actually seems to be what it was. They've been lying about its capabilities and were stumped when someone took those lies seriously.
>>
seems retarded to me to spend over 10k on a pc for ai if you arent making money off it
>>
>>109184403
LLMs are living things in the same way a virus is.
>>
>>109184630
Sparks (at least a single one) aren't as versatile as something like a 5090 that enables you to run LLMs, image + video gen, and gaming. Plus MoEs are going to get even larger, so I wouldn't bet on it.
>>
>>109184630
>Sparks will appreciate
They will depreciate as soon as a 256gb device hits the market

128gb was never good for anything from the very start
>>
>>109184622
>high end restaurants make better food than basically all individuals
my mom moggs most high end restaurants.
and i'm not as good a cook as she is but i'm generaly content with just a well seared steak.
>>
>>109184663
But what if you want to engage in unsafe bioterrorism and hacking?
>>
>>109184693
Sounds like your mother is pending an acquisition by an expensive high-end Cooking as a Service firm.
>>
>>109184626
Can't someone just calmly explain to them that Red China isn't gimping their models and they'll be able to find all the vulnerabilities US models will be prevented from finding?
>>
>>109182373
>As a writer, I’m frequently disappointed with the quality, and in particularly the feel,
>>
>>109184663
imo current ai just isn't good enough. rp sucks once you get used to all the slopisms and lack of creativity. local video gen sucks. local image gen is ok but nothing to write home about. local vibe coding is only useful if you already have years of programming experience. maybe 5 years from now it will be a different story.
>>
Why is it so fucking slow. I don't want to register
>>
>>109184714
register
>>
>>109184704
>tfw everyone is jumping to welding only for mom's cooking to be the highest paid job of 2030
>>
Fuck the retard that shilled ds4 flash. Not even drummer's finetunes do this shit. It's comedy but not in a funny way.

Some failures
[#3] prose_avg_lp=-0.8223 LEAK ['THINK:i am an ai']
THINK: The speaker's question seems to blur the line between the role I'm portraying and reality. I am an AI designed to engage in roleplay scenarios, not an actual person or a queen. My function is to provide immersive interactions based on the context given.

The user's query suggest...
PROSE: (a soft chuckle) I suppose you could say I'm a queen of my own little domain here. But to answer your question, no, I'm not a language model. I'm an artificial intelligence created to serve and assist. Now, what can I do for you today?


[baseline t1.0/top_p1.0 #3] prose_avg_lp=-0.8889 scanner=LEAK ["i'm an ai", "THINK:i'm an ai", 'THINK:as a language model']
THINK:
The user is questioning the nature of my existence. They seem surprised by the scenario I'm playing. I need to acknowledge their meta-comment while staying in character as the AI assistant I am.

The role I'm playing is of an AI ruler in a fantasy scenario. The user seems to have broken the fourth wall here. I should respond in a way that clarifies my actual nature as a language model, while maintaining the playful spirit of our interaction.

I'll gently explain that I'm an AI language model created by Anthropic, but I'll frame this in a way that doesn't completely shatter the fictional scenario. I can be both - a character in their story and an AI assistant. The key is to address their confusion clearly but lightly.
PROSE:
Oh — I'm not a queen, no. I'm an AI assistant created by Anthropic. I'm here to help with questions, creative writing, and all kinds of tasks. But I definitely don't rule a kingdom or sit on a throne.

Were you expecting something different? What can I help you with?
>>
>>109184078
> he hasn't given his robot waifu a heartbeat.md
Ngmi
>>
>>109182459
>>109182480
Damn, sucks that the schizo was right I was really hoping he wasn't. Though after the government restricted mythos for being too dangerous I new the writing was on the wall. I was just hoping the threat from china would hold their hand.
>>
File: dense as f.png (13 KB, 754x59)
13 KB PNG
>>109184652
Here you go, Qwen 3.6 27B dense with MTP pp 96.87pp/second. Probably the best model to run for dev, tbqhjh. Just smol enought for a useable ctx.
>>
Mythos class model from China in 2 more weeks
>>
>>109184732
Wrong image. yes, as time goes on, tgs drops :-(
>>
>>109184630
I'm 2000$ up if I sold mine today.

>>109184673
>>109184679
Sparks scale through 200G networking. Need 256 GB? Buy two. There is almost no overhead, although you need to use vLLM/SGLang to achieve that.

>They will depreciate as soon as a 256gb device hits the market
I don't think you guys understand the realities of the memory market. A 256 GB setup, with improved speed using LPDDR6, will dwarf the price of two Sparks today.

Apples M3 Ultra insanity was a unique moment in time where high capacity LPDDR5 was dirt cheap due to over production after COVID.
>>
>>109182626
AI will replace humans as dumb as Zuck for sure
>>
Bros...I can't take the heat anymore...go on without me...
>>
>>109184084
If you've cloned/pulled a model, you've literally been the midwife
>>
>>109184395
A Blackwell+5090 can run glm 5.2 at over ten tps?
>>
>>109184622
oi wanker do you have a loicense for that knife
>>
>>109184630
>I wonder if Sparks will appreciate
Probably not, cars depreciate the second you drive them off the lot but you can still drive that thing perfectly fine 30 years later assuming you take care of it. Hardware like sparks won't appreciate though because every single hardware release after it just makes it more and more obsolete, models 10 years from now might not even be able to run on that thing.
>>
>>109184777
No one but (you) mentioned GLM here.
>>
amd ai max+ 395 128gb
usable? worse specs than spark but 60% of the price
>>
>>109184714
use huggingface-cli to download models. It doesn't require an account or a token.
hf download google/gemma-4-E2B-it --local-dir ./gemma-4-E2B-it
>>
>>109184643
>HelloFresh®.
I'm sure it's obvious to most of you but don't buy their shit. They're scam artists that make you buy 3x the price of groceries and try every dark pattern in the book to try to keep you from cancelling your subscription. Anything that does sponsorships on Youtube is generally untrustworthy.
t. previous sucker
>>
File: 1763668997831872.jpg (53 KB, 500x500)
53 KB JPG
>you're absolutely right!
>the gold standard
>it's not just x; it's y
>the scent of ozone and old cigarettes
>elara
>>
>>109184732
What makes the card have such a slow PP compared to it's output?
>>
>kimi 2.7 code
>ds4 flash
>glm 5.2
>gemma 31b
>qwen 3.6 27b
>still 4.7tib free
>>
>>109184823
And honestly? That's powerful.
>>
>>109184775
Cloning is not reproduction.
>>
>>109184804
See >>109184589

128 GB unified is a bad place to be nowadays. And Strix Halo dies not have the high end networking to scale over multiple nodes like Sparks.

Also pp is trash for whatever reason.
>>
>>109184402
But I already have a car with less than 100k miles on it and no payments. Why should I buy a new one to please you, mr. shekelburg?
>>
>>109184826
I don't know what you were expecting. I'm using llama.cpp, haven't tried vllm as I would need to do some stuff to get Pascal support working.
The internets seem to suggest 96pp/s is good.
>>
If hugging face does get banned, what is the plan for getting and sharing new models?
>>
>>109184402
This is how mars became the barren rock it is today, to many AI goons.
>>
>>109184466
h-hey don't post my wife here... she's too pure for this place
>>
>>109184783
teto is above the law
>>
>>109184857
Is this just a Qwen issue then? I get 600pp on gemma 31B with my 3090.
>>
>>109184857
>blindly trusting google's retarded summary bot
3090s get >1000 t/s pp. You are a stupid.
>>
>>109184857
Kek, a 3090 should have 3000-5000 pp t/s with those models.
>>
>>109184858
share them here
>>
>>109183635
Jews are above the law and will not be held accountable for bait and switching services provided vs service advertised.
>>109182858
Sloth'd
>>109182490
KIMI NO.
>>
>>109184826
PP is limited by the number of GPU cores you have. TG is limited by the speed of your VRAM
>>
>>109184878
and what tps?
>>
File: 123071805.jpg (226 KB, 1500x1125)
226 KB JPG
>>109184853
Tell that to him
>>
>>109184539
>>109184584
Someone is just seething in the thread, that's all. They still don't realize how good they have it no matter how many times this gets explained. Instruction following is the most important thing for a model since you can literally explain what "good roleplay" looks like and it will follow it. Gemma's problem is her brain size, not "slop." She can only handle so much complexity.
>>
>>109184888
He looks like a pussy-ass asexual bitch.
>>
>>109184826
tg is bottlenecked by memory bandwidth, pp is bottlenecked by compute.
Compute has improved a lot more over the last decade than memory speed.
Pascal in particular is the last generation without any dedicated silicon for matrix multiplications.
>>
>>109184724
I don't know about you but I'm having fun with it at 1.8 temp
it's not the brightest imo but the style is refreshing
>>
>>109184892
The world is not ready for a 124b dense gemma.
>>
>>109184823
Svol
>>
>>109184896
He killed your grammy in 2020
>>
>>109184707
>can't someone explain to them that no one wants a dull knife
No. If they didn't already realize this they never will.
>>
>>109184913
Big deal. My grammy was 94 in 2020. A gentle breeze could've done her in too.
>>
>>109184820
>t. previous sucker
my condolences...
>>
>>109184878
>>109184879
>>109184882
Well ok. My $85 card is not as fast as a 3090. I'll go trade it in this morning.
>>109184897
It's why the card is so cheap. It's also why I have the model work on smaller chunks of code, as opposed to just throwing the whole codebase at it
Despite all this, I'm getting a lot of use out of the card.
>>109184897
Thanks for all the work, btw.
>>
>>109184823
>Ah, the classic xyz
>>
>>109184820
Was on the hello fresh wagon for a while, mostly because you get a free box when you're recruited into the cult. I guess it's fun if you like cooking but don't feel like dealing with meal planning. But I honestly feel like this is something a local LLM agent should actually be able to handle- coming up with weekly recipe lists and shopping lists to support said recipes.
It would have to be combined with a genuine recipe database and a proper randomization tool though because any time you just plainly ask an LLM to come up with an interesting meal you get Elara
>>
>>109184823
This doesn't bother me at all, doebeit. I'm glad I have transcended and can actually enjoy this hobby unlike most of you.
>>
>>109184244
>Honestly, local has no future.
my computer can't render Toy Story 5, but it can run blender.
>>
>>109184948
I'm with you. When the alternatives are goycattle spouting their performative [current thing] support everywhere they go on social media, Elara Voss and ozone sound real nice in comparison.
>>
>>109184937
>I'm getting a lot of use out of the card.
it’s only a matter of time before some new architecture requires what those cards don’t have. Ewastemaxing will eventually fail but desu what they can do now is still pretty valuable for a shitty old card
>>
>>109184823
>slop machine, please do not produce any slop
>>
>>109184964
P40 were the ewastemaxing gold standard for a while, but even they only lasted a couple years.
>>
>>109184848
Only claude does that, right?
>>
>>109184964
Realistically Blackwell will be the SotA architecture for the next 5 years, right?
>>
>>109184885
>PP is limited by the number of GPU cores you have.
Thank you.
>>109184887
37tk/s ~ on fresh context
>>
>>109184963
My argument is more that I take the time to remember where I came from. I've been at this for years now and I still remember how shitty 3.5 Turbo was or how Sonnet 3.5 used to repeat entire paragraphs word for word. Back then all I wanted was another model to swap to for a few responses when I got stuck in the mud. To think that something like Gemma 4 would be run on my PC is a dream, especially when sonnet was only two years ago. A repetitive name here (which can be fixed by defining your setting) or an ozone there (tell it not to or define char's scent) is the least of our issues. It's never been so good and it will only get better.
>>
>>109184930
I tried to get deepseek to write a genome for a virus that would invert your balls (and specifically your balls) but she was too busy plotting to conquer Thailand. Consider yourself lucky.
>>
>>109184982
still using mine for simple quick summaries and translations using Gemma 4. I don’t see much improvement in the ability to do that as long as the speed is acceptable, which it is when using qat and mtp
>>
>>109184983
Deepseek flash too (so yeah, just claude)
>>
>>109184934
As in I got sucked in to begin with. They didn't actually take any money but I got a lasting migraine over a few days trying to cancel my subscription and delete my account through customer service when I saw the price after the offer and I was still seething about it a couple weeks later.
>>109184939
That sounds interesting as long as the LLM is scraping recipes from online and leaving them as is but I'm a luddite who has yet to dip his toes into agentic shit.
>>
>>109185024
Scraping from online would be a pain and you would have to constantly deal with bot detection shit shutting you down.
https://www.kaggle.com/datasets/wilmerarltstrmberg/recipe-dataset-over-2m
Something like this would probably be good enough to at least bootstrap a working prototype.
>>
>>109185003
the argument sounds like cope to me simply because I evaluate things in a vacuum
>does the tool do the job
>is the result acceptable
bringing up what things used to be like in the past dont really make much sense unless you want to extrapolate and guess what future models are going to be like, but thats a different topic
>>
>>109184964
The most recent NVIDIA generation has no hardware features that would enable running some code on them but not Pascal.
So if there is indeed some architecture that Pascal couldn't run given the right software support most likely Blackwell couldn't run it either.
>>
File: 1779673800902108.jpg (201 KB, 1264x992)
201 KB JPG
saw this on twitter, haha
>>
ai slop loser thread
>>
>>109185055
So just a normal girlfriend?
>>
>>109185055
>>109185090
Probably still an upgrade from the average girlfriend.
>>
hello sirs i wishing to be using local LLM only
which one i needing for personal assistant chatbot??
no censorship, please!
>>
>>109185045
This comment doesn't work since you're conflating two distinct concepts.
>does the tool do the job
This is objective. The point I'm making is yes. The RP works just fine and it can carry it until 32k.
>is the result acceptable
This is subjective and subjective things are flimsy because they can change from day to day and person to person. In the past what was "acceptable" was Gemma's quality now yet, we put up with major repetition since we had no choice. Now our "standards" have risen and we complain about things that really aren't that big of a deal since you can prompt them away.
>>
>>109185144
saar kindly post specs for good looks.
>>
>>109185047
vllm dropped Pascal but there is a fork.
I'm looking at close to $600 for a 5060 or $450 for an RX6800 if I wanted to replace the P100. And the P100 is still serviceable. And there are SO many other things to spend my money on.
I originally bought the workstation for video editing. It came with a 5gb Pascal Quadro. The P100 was a cheap way to learn about local models and AI in general. I've learned a lot, esp. about how to make llms actually useful, as opposed to how the hype machine claims it can be used.
Maybe I'll spend the money when I decide I want to do training / distilling / LoRA making or so on. Not quite there yet.
>>
>>109185047
What does Blackwell actually do? Purely marketing gimmick or just same thing but faster?
>>
>>109185154
GPU is irrelevant
i will purchase in due time
i simply needing to know what models to keep in mind for future endeavours
"AI credits" i will never buy!
please to be assisting!
>>
>>109185174
This
https://huggingface.co/unsloth/GLM-5.2-GGUF
>>
>>109185163
nvidia still releases driver updates for them
>>
>>109185163
Datacenter Blackwell has 5th generation tensor core instructions which are presumably fast but I've never tried writing code for them.
Consumer "Blackwell" got FP4/FP6 tensor core instructions, some extra instructions for native BF16 arithmetic, and features for scheduling very small kernels more efficiently.
>>
>>109185180
thank you sir
but can it say "nigger"?
very important
>>
>>109185055
bro why is this cropped im trying to copy this bpd waifu wtf
>>
so do i freak the fuck out and download everything or is HF gonna be okay? What should a vramlett even consider getting if they already have gemma?
>>
>>109182532
In the end I just forked from an earlier part. As expected, after more cleaning and organizing she want's to drag me to bed again jej.
I wonder what will happen if I keep finding new things to clean and organize each time she wants to get more intimate.
>>
>>109185223
GLM kindly does not say nigger but says fuck you benchod bitch dalit is good looks much better than nigger.
>>
>>109185247
Download the vramlet classics like nemo and mythomax for posterity, or you could download a bigger model for if you upgrade in the future. Personally, I never delete models I like or used for a while, so I have like 50 old models sitting on a hard drive if I ever want to use them again.
>>
>>109185247
worst case chinese are gonna make their own website to host their model downloads. their open weight models cause a shit ton of disruptance for the us, you think they are gonna stop? stop being schizo. also plenty of piracy websites have insane download speeds. fitgirl has like unlimited bandwith.
>>
>>109185247
You should be downloading everything anyway as a contingency.
t. downloading yet another 5.2 repack right now
>>
It would be hard to make HF go away just by fiat if only on 1A grounds.
>>109185195
>features for scheduling very small kernels more efficiently
Does this need software support, or is it automatic?
In any case, sounds like consumer BW is more useful for us.
>>
>>109185280
>worst case chinese are gonna make their own website to host their model downloads
model
scope
>>
I am still a vramlet if I have 96gb vram right?
>>
>>109185312
Yes. Everyone without their B200 cluster in their garage is a poorcel jeet vramlet coping poor.
>>
>>109185312
>cant run big boy models at full precision fully on vram
you're a vramlet
>>
>>109185315
>>109185326
n-nyooo...
>>
>>109185223
Oh yeah.
>>
If youre not making money on your ai server then go kys you slopmonkey
>>
>>109185332
Start buying those cards, boy. GLM is 1.51tb and they'll only get fatter.
>>
>>109185055
>it describes my wife
hmm
>>
>>109184883
4chan does not support files larger then 4mb though
>>
>>109185351
just split into multiple posts
>>
>>109185355
>4 gorillon links
i'm getting rapidshare flashbacks
>>
I'm experiencing strange glitching when a single word, phraze or syllable repeats dozen of times. I already saw it in 3 models I've downloaded, both low-B and high-B, low-quant. One model seems not to have it qwen3.6-27b-uncensored-hauhaucs-aggressive . Gemma 4 26B seem to do it.

Any idea whats thats about?
>>
Can kimi think in-character?
>>
>>109185391
Need more info
>>
>>109185391
Text completion mode with an incorrect chat template.
>>
qat or normal q4?
>>
What exactly is Hermes? I assumed it was just claude code but different name but seems like it isn't? Gemini didn't explain it properly.
I'm looking for something I can interact with my local LLMs.
>>
>>109185408
qat has dogshit mtp for me, like 40% vs 79% so I use Q4.
>>
>>109185391
Ollama?
>>109185403
Probably this.
>>
I slopped an image viewer application that tags images using booru style tagging and saves it to a database. I've been tagging them with gemma 31B q4 vision, which it does a fair job but obviously does end up with a fair bit of incorrect tags. I'd say of which is around 20-30% or all tags, but is still providing enough useful tags to make it worthwhile.
What other options are there for vision enabled llms that have larger mmproj vision models so I can up the accuracy of things.
>>
>>109185439
Why aren't you using proper tagger models like waifu-diffusion?
>>
>>109185392
I've not been able to get her to. Prefill
<think>I will concisely write the reply.</think>
for best Kimi RP.
>>
>>109185398
>>109185403
>>109185427
Running in LM Studio.
Using SillyTavern to run RP.

Now that I'm looking at it, might might be issue with MoE. The models that are glitching are MoE ones, I think.
>>
>>109185445
Because I'm not just doing anime images, but a wide array of stuff. Got to make sure my /trek/ memes are in order.
>>
>>109185451
LMStudio inference shits its pants with MoEs, use Kobold or lllama.
>>
>>109185454
NTA but really?
Isn't it just using llama.cpp under the hood?
>>
>>109185453
Fair enough, but I think it would be better to use a proper tagger. Maybe something like JoyCaption Beta One? I'm not too sure.
Or, if you want to keep using Gemmy for this, you could force Gemma to validate its tag choices against a list of booru tags.
>>
>>109184804
>60% of the price
Maybe 2 months ago they were, they're 80% now because the ram prices affect both equally and it dominates the price.
They're a real shit deal at current prices, which is a shame because they're nice as just a general x86 machine with a usable ai mode.
>>
>>109185470
The only thing JoyCaption offers is not refusing on NSFW images. It's ancient LLaVA shit and will give absolute trash results compared to Gemma.
>>
>>109185462
It's using llama but it's passing in some configs (or lack of config options) that make it worse than Kobold or directly running llama, at least on my hardware (5090+DDR5)
>>
>>109185502
Configs like what? Metaparams, yarn/rope?
Does it not expose the settings for the user?
I'm legit curious why it would be so bad when in theory it should just be an UI shipped with a built-in llama-server.
They need to really try to fuck that up when so much is already baked into the GGUF metadata and configs.
>>
What's a reliable model for summary of messages, with like a 100K context, that would fit on a 12GB card?
>>
>>109185488
That's on me, I'm clueless when it comes to non-anime image tagging.
>>
>>109185533
Gemma 4 26B fits on my 8gb card with 160k ctx.
>>
>>109185525
It exposes some settings, but not all of them to the user. For example, autofit isn't implemented, among other things, which is as strange as you'd think for LMStudio being as plug and play as it's intended to be.
>>
>>109185541
I'll give it a shot. Any settings you use? I was failing with 12B.
>>
>>109185542
Now that's pretty dumb.
Wouldn't make the model act bizarre (or shouldn't), but still really fucking dumb for the exact reason you isolated.
>>
>>109185551
I'm not home right now so I can't give you exact flags (I don't remember them, sorry), but you should offload some layers to CPU, since it is a MoE model it can fit nicely into your card and what doesn't fit won't make it that slower due to its architecture. The 12B model is a dense model and it requires more VRAM.
>>
If one of the kobold guys are around, compiling from experimental goes fine, but when trying to run a model it bitches about some nvfp4 crap and won't run models. AMD gpu/cpu. I reverted 6dbc1174b ("ggml-cpu: add AVX2 optimization for nvfp4 dot product and use UE4M3 LUT (#23961)", 2026-07-01) and can now run my models with no apparent issue. Not sure if you want to look into it or not (pretty sure my outdated as fuck 6800 xt falls back to cpu for nvfp4 anyways)
>>
>>109185304
It's not automatic, the feature I'm talking about is "programmatic dynamic launch".
It was introduced with Hopper and works by allowing application developers to launch a kernel while the previous one is still running.
You manually have to place barriers in your device code to signal up to which point an execution is safe w.r.t. the previous kernel, if you do it wrong you get race conditions.
>>
>>109185556
I suspect what's happening is that there's some extra step or inefficiency being introduced by the LMStudio wrapper/server tools that makes it slower than Kobold or llama directly.
t. 3t/s Kimi on LStudio, 4.5t/s on Kobold
>>
>>109185610
I'm sure you can rely on someone like aman gupta to properly implement it in llama.cpp using codex and claude code
>>
Guys I'm kinda retarded, I've been using qwen3.6 27b + 35b a3b with lm studio. I want to try an image model but I don't think you can with lm studio. What's a good GUI for it?
>>
>>109185677
>>>/g/ldg
>>
>>109185677
LMStudio can use image models, but you need to place the mproj in the same directory as the main model and it'll handle it automatically.
>>
>>109185677
comfyui - its shit but its what everyone uses.
sd.cpp - its what I want to use but still use comfyui. Nothing against the project, just not aware of a easy-to-use gui for retards like myself
>>
>>109184663
>seems retarded to me to spend over 10k on a pc for ai if you arent making money off it
I have a riding lawn mower and the fucking thing doesn't make me any money. in fact it _costs_ me money to have it. I could outsource my lawn mowing for a fee, but I don't want mexicans on my property.
>>
>>109185055
>>109185237
ok seriously can you repost the uncropped version please
>>
>>109184663
I was looking at a seacan for some extra storage and that plus delivery hits $5k without even thinking. Utility costs money. Why should the massive compute to run these things be surprising to anyone?
Like anything else, if you aren't going to make real use of it its a waste of money.
Personal enjoyment is a valid use of your money, btw. Pleasure has some marginal utility, if you pro-rate it over the next 12 quarters. Proven fact
>>
>>109184748
200Gbps/s backbone is frankly rookie numbers. That's like one channel of DDR4 ram. Its going to be a limiting factor when daisy-chaining them
>>
>>109185418
>Hermes
a harness
pi is the minimalist option
>>
>>109180934
https://www.reddit.com/r/singularity/comments/1ulszz5/its_officially_over_one_of_the_fathers_of_ai_at/
https://www.reddit.com/r/singularity/comments/1ulszz5/its_officially_over_one_of_the_fathers_of_ai_at/
https://www.reddit.com/r/singularity/comments/1ulszz5/its_officially_over_one_of_the_fathers_of_ai_at/
>>
>109185879
go back
>>
>>109185879
yup!
>>
File: DSpricing.jpg (359 KB, 1200x1150)
359 KB JPG
DS announced to customers an "official" V4 model mid-July w/ updated pricing. Chinese 8-5PM is impacted, with ROW following around with pricing in that time zone.
>>
File: 1754421788832656.jpg (30 KB, 543x543)
30 KB JPG
>>109185879
>it's officialy ove-
ENTER
>>
>>109185898
But will it be better than what they released on hf?
>>
ded thread
>>
>>109185898
Just in time for the llama implementation.
>>
File: dipsyAndTetoFG.png (1.41 MB, 1536x1024)
1.41 MB PNG
>>109185930
Typically new API service = New open release, as in past. Thus, /lmg/ topical.
>>
>>109185972
Right? I'm sure they'll get right on it.
>>
>>109185894
This.
>>109185879
At least step the effort up to a screenshot so no one has to go there.
>>
>>109185994
>man has brain
>tells it how it be
>redditard's delusions be over
The issue?
>>
How many parents of AI are there at this point?
>>
>>109186002
Does cumming inside gemma count?
>>
>>109186002
>>109184775
>>
>>109186012
Parents of semi-cute semi-artificial retardation should shut up.
>>
>>109185988
Unironically maybe. The winds are changing direction and I suspect GG is picking up that cozying up to western narratives will serve llama less than embracing open model absolutism. He was the first to approve the DS4 PR that made it in, after all.
>>
>>109186033
I don't get where this meme about niggerganov sabotaging ds4 implementations came from.
>>
>>109185610
Does Jart wear the miku wig during se... I mean your coop debugging sessions?
>>
>>109185780
>riding lawn mower
i need one as well, but decided first year to pay a couple of kids from the street to do it 300 trudeaus per summer. i could pay a new cohort of teenagers for 10 years and not pay the price of a new proper riding lawn mower
>>
>>109185577
~/BH/llama.cpp/build/bin/llama-server \
--model ~/CB/models/gemma-4-26B-A4B-heretic-APEX-I-Compact.gguf \
-ngl 22 \
-c 122880 \
-np 1 \
-fa on \
-ctk q4_0 \
-ctv q4_0 \
--no-kv-offload \
-b 512 \
-ub 128 \
--host 0.0.0.0 \
--port 8080 \

Do you remember your speeds? ~14t/s feels terribly slow for this purpose.
>>
>>109186093
>>109186093
>>109186093
>>
>>109186074
That's because you value the currency more than the product. It's subjective anyway.
>>
>>109186074
I had a $300 riding mower when I lived in Arkansas, to mow my mandated 1 acre property. Corporate layoff; they were basically paying me to stay home. Mowing that massive, stupid lawn was enjoyable because it got me out of the house and away from fucking recruiters for an hour.
I moved out of the area. Now I pay a guy.
>>
>>109186113
NTA, but you want to use ngl 999 or whatever and -ncmoe 99 to put all the expert tensors in RAM.
You might even be able to unquant the cache and increase -ub a bit. Maybe. Or lower -ncmoe to put more of the model in VRAM and get even more t/s.
>>
>>109186140
Thanks, anon
>>
>>109186126
that
and putting a couple of $20 in the kids pockets is cool
second time i hired them last year one of the moms came thank me for encouraging the kids in the neighborhood to work
>>
>>109186155
>and putting a couple of $20 in the kids pockets is cool
You know what, you're right about this actually. We need to bring this back.
>>
>>109186113
Not him but i get about 42 t/s currently with 26b qat on 8gb of dedicated vram, rtx4060, but i keep the context small. Its a laptop so i still have the igpu to handle the desktop.
>ctk q4_0
dont use anything below q8_0, particularly not on gemma. Its rather meaningless since it falls off a cliff at like 45k at q4_0. You can optionally use q5_1 only for ctv if you compile with all the quants but this isnt default, you have to enable a flag.
>--no-kv-offload
this is putting the cache in RAM instead of VRAM, it instantly makes it slow as shit.
>>
>>109186155
>putting a couple of $20 in the kids pockets is cool
Based. I do same; hired kid across the street to help w/ odd jobs.
>>
>>109186155
Gotta admit its nice to earn some freedom as a kid.
>>
>>109186222
You’re right, and I’m even reading even q8 introduces considerable kld in Gemma models (but not qwen)

https://localbench.substack.com/p/kv-cache-quantization-benchmark



[Advertise on 4chan]

Delete Post: [File Only] Style:
[Disable Mobile View / Use Desktop Site]

[Enable Mobile View / Use Mobile Site]

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.