[a / b / c / d / e / f / g / gif / h / hr / k / m / o / p / r / s / t / u / v / vg / vm / vmg / vr / vrpg / vst / w / wg] [i / ic] [r9k / s4s / vip] [cm / hm / lgbt / y] [3 / aco / adv / an / bant / biz / cgl / ck / co / diy / fa / fit / gd / hc / his / int / jp / lit / mlp / mu / n / news / out / po / pol / pw / qst / sci / soc / sp / tg / toy / trv / tv / vp / vt / wsg / wsr / x / xs] [Settings] [Search] [Mobile] [Home]
Board
Settings Mobile Home
/g/ - Technology

Name
Options
Comment
Verification
4chan Pass users can bypass this verification. [Learn More] [Login]
File
  • Please read the Rules and FAQ before posting.
  • You may highlight syntax and preserve whitespace by using [code] tags.

08/21/20New boards added: /vrpg/, /vmg/, /vst/ and /vm/
05/04/17New trial board added: /bant/ - International/Random
10/04/16New board for 4chan Pass users: /vip/ - Very Important Posts
[Hide] [Show All]


[Advertise on 4chan]


/lmg/ - a general dedicated to the discussion and development of local language models.

Teto's Birthday Edition

Previous threads: >>108493794 & >>108488188

►News
>(04/01) DeepSeek V4 released: https://hf.co/deepseek-ai/DeepSeek-V4
>(03/31) 1-bit Bonsai models quantized from Qwen 3: https://prismml.com/news/bonsai-8b
>(03/31) Claude Code's source leaked via npm registry map file: https://github.com/instructkr/claude-code
>(03/26) CohereLabs releases Transcribe 2B ASR: https://hf.co/CohereLabs/cohere-transcribe-03-2026
>(03/26) Voxtral 4B TTS released without voice cloning: https://mistral.ai/news/voxtral-tts

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
►Recent Highlights from the Previous Thread: >>108493794

--Papers:
>108494041
--1-bit Bonsai LLM performance and quantization details:
>108495464 >108495479 >108495486 >108495493 >108495524 >108495554 >108495565 >108495590 >108495494 >108495506 >108495924 >108495964 >108495965 >108495970 >108495972 >108495987 >108496003 >108495986 >108496022 >108496065 >108496084
--LFM2.5-350M benchmarks and performance analysis:
>108494883 >108494899 >108494933 >108494954 >108495061 >108495069 >108495072 >108495732 >108495760
--Optimizing 3090 power efficiency for LLM workloads:
>108494231 >108494235 >108494250 >108494331 >108494281 >108494292 >108494298 >108494319 >108496230 >108496252 >108496488 >108496516 >108496657
--Gemma 4 shows reduced censorship in vulgar responses:
>108493880 >108494123
--Qwen3-VL models dominate garment classification leaderboard:
>108495315
--DDR5 RAM price drop speculation after Google TurboQuant release:
>108494163 >108494178 >108494191 >108494187 >108494345
--AMD GPU TTS integration struggles in SillyTavern:
>108495775 >108495807 >108495816 >108495846 >108496141 >108496166 >108496195 >108495848
--Claude source code leak and reactions to its size and quality:
>108493890 >108493989 >108494024 >108494174 >108494119 >108494437 >108494450 >108494459 >108494474 >108494481 >108494484 >108494547 >108494671 >108494723 >108494811 >108495702 >108496000 >108494483 >108494494 >108494508 >108494676 >108494721
--Abliterated vs fine-tuned model responses in sensitive scenarios:
>108494325 >108495229 >108495258 >108495273 >108495310 >108495314 >108495350 >108495432 >108495587
--Teto, Neru, and Miku (free space):
>108493894 >108494009 >108494085 >108494114 >108494210 >108494479 >108496784 >108497187 >108497791

►Recent Highlight Posts from the Previous Thread: >>108493798

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
>>108497919
On the bright side at least she won't eat the entire cake and get fat again.
>>
>>108497919
>(04/01) DeepSeek V4 released: https://hf.co/deepseek-ai/DeepSeek-V4
Holy shit it's real lol
>>
potentially stupid question: how are you supposed to download models for offline use? not just local, but specifically offline
i did
hf download
, and it worked to download the code, but it's all stored as blobs with gibberish names in a .cache folder, which doesn't really feel like the best way to store things for offline use. was i supposed to do a git clone instead? or do i just cope with the blobs in my cache file?
>>
Teto.
>>
>>108497944
--local-dir
>>
You were always just a kidder, Steve...
>>
and you know i dont mean to hurt you
but you know that it means so much
and you don't even feel a thing
>>
>>108497940
i am like falling for this millionth times
>>
>>108497944
The cache directory for the model has a snapshots/<hash> folder with symlinks to the blobs. You can pass that hash folder to whatever tools to make ggufs which you put on your fast storage or whatever.
You're making your own ggufs right?
>>
>>108497947
i don't suppose i can salvage the blobs i already downloaded? the gibberish names certainly aren't helping... i assume they're like an md5 or something
>>
>>108497944
What I do is
git clone https://hf.co/whatever/model
git -C model lfs install --local
git -C model lfs fetch
If you git lfs pull it's going to make two copies of the lfs blobs (the lfs object and the checked out copy). Fetch only gets the blobs, no checkout.
The lfs files are going to be just links, so I made a script that "exports" the models to a separate directory by linking all the stuff together, then I convert to gguf from that directory.
--local is just so lfs doesn't get installed globally for all future cloned repos.
>>
File: 1774788479196554.png (406 KB, 576x491)
406 KB
406 KB PNG
/lmg/ on suicide watch
>>
Since when did captchas ask you to identify an anime character?
>>
>>108497971
I really fucked up those tags, didn't I...
>>
>>108497947
by the end of the download you obviously should have the actual files locally and not the blobs
>>
>>108497975
captchas?
>>
>>108497971
mind sharing the script? sounds useful
>>108497969
i plan to make my own ggufs, yes. i haven't yet, though
>>108497979
am i better off just wiping and redownloading? the only reason i ask is cause across all the models i downloaded, it was like 1500 gigs...
>>
File: file.png (161 KB, 618x483)
161 KB
161 KB PNG
>>108497975
>>
>>108497974
>accounts
>3.8M
sounds like a bullshit metric to make boomers think they did something.
>>
File: 1755361669989980.png (1.03 MB, 1206x2161)
1.03 MB
1.03 MB PNG
Backup your models
>>
>>108497974
yeah what about the files though
>>
>>108498015
oh no how will we download models without the iranian mirror for the data thats replicated across hundreds of datacenters around the world
>>
>>108498025
>iranian mirror
>>
>>108498027
they're not hitting anything in the USA they're attacking american companies in their desert shithole and surrounding shitholes like israel
>>
File: git_export.png (6 KB, 752x456)
6 KB
6 KB PNG
>>108497993
>mind sharing the script? sounds useful
#export.sh {repo} {export_dir}
repo="$1"
output="$2"
repo=$(realpath $repo)
output=$(realpath ${output})/$(basename ${repo})
mkdir ${output}

git -C ${repo}/ ls-files | while IFS= read ;do
f=$REPLY
mkdir -p "${output}/$(dirname $f)"
ln -s "${repo}/${f}" "${output}/${f}"
done

git -C ${repo}/ lfs ls-files -l | while IFS= read ;do
h=$(echo $REPLY | cut -f 1 -d " " )
f=$(echo $REPLY | cut -f 3 -d " " )
a=$(echo $h | cut -b 1,2 )
b=$(echo $h | cut -b 3,4 )
echo "$a/$b/$h -> $f"

mkdir -p "${output}/$(dirname $f)"
[ -h "${output}/${f}" ] && rm "${output}/${f}"
ln -s "${repo}/.git/lfs/objects/${a}/${b}/${h}" "${output}/${f}"
done

I use another one in C, but I shared this one a while ago.
>>
>>108498049
yeah but most of the big guys decided that it was a good idea to put a good chunk of their datacenters in said desert shitholes
which also means that all the smaller guys who rent compute and hosting space from them also have their shit in said desert shitholes that are currently getting bombed
>>
>>108498004
You should put the code somewhere, it looks good.
>>
Globe according to unsloth-Qwen3.5-27B-UD-Q8_K_XL.gguf
>>
>>108498009
>everyone watched jjba
actual reddit assumption
>>
>>108498069
Big if true.
>>
>>108498073
>>108498069
>>
>>108498060
that was pretty stupid of them, good thing I only run local models on my own hardware and don't have this problem
>>
>>108498076
this isn't 4ants.org
>>
>>108497993
>redownloading?
>>108498053 (me)
You can try git clone, install lfs (with --local) and shove the blobs you already have in their directory. They go in .git/lfs/objects/{2c}/{2c}/{hash}. Check the screenshot to get an idea.
>>
>>108498081
450 requests... I'll make a larger one later after I'm done playing with colors.
>>
>>108498072
the other one i just got was "select the vegetable" with pizza being the correct answer as the other five options were fruits
plebbit indeed
>>
File: file.png (10 KB, 386x214)
10 KB
10 KB PNG
>>108498072
yeah but this shit is even worse "um you have to know the flag of some random backwater country"
>>
>>108498092
Poland is easy. It's just the Russian and Czech flags with the blue removed.
>>
>>108495464
>1-bit kernels: llama.cpp fork
FUCK
OFF
>>
>>108498092
I got one of the guy standing in front of the plane meme, I had no idea who it was or what movie it's from because I don't consume zionwood slop so I took a screenshot of it had qwen3.5 35b tell me it's "Tom Cruise" from "Top Gun" which it accepted as correct.
>>
whats a good model purely for covering things you're not allowed to ask commercial models like related to piracy/copyright protected stuff, whats the best direction to go down for that kind of stuff?
>>
File: forwhales.png (1 KB, 1200x600)
1 KB
1 KB PNG
>>108498081
>>
>>108498116
hello fellow badass hacker on 4chan if you dare i can recommend stablelm-7b for only the gnarliest and most dastardly of usecases
>>
>>108498120
Are you mental? I don't think he's ready for that model...
>>
is turboquant merged yet
>>
>>108498116
Kimi K2
>>
>>108498120
DO NOT DOWNLOAD THIS IT PRODUCES NERVE GAS FROM YOUR GPU
>>
File: file.png (82 KB, 1873x939)
82 KB
82 KB PNG
>>108498061
Thanks anon, i might make it public eventually
>>
>>108498009
>>108498092
memestock market april fools was better
>>
>>108498092
I got the dude from the baneposting meme. But I never watched whatever movie is it from. Bane was not the answer.
>>
>>108498061
if he doesn't take a look at https://codepen.io/RobotsPlay/pen/bGeNGdx and https://github.com/accrazed/YoRHA-UI-BetterDiscord for resources to vibecode your own
>>
>>108498009
k-kino
>>
>>108498147
No captcha required for me for now, which is good because that sounds lame.
>>
File: no.png (29 KB, 390x399)
29 KB
29 KB PNG
Nooo.....
>>
>>108498116
https://huggingface.co/HauhauCS/Qwen3.5-27B-Uncensored-HauhauCS-Aggressive
>>
>>108498009
>>108498166
With how elaborate the captcha has been getting it took me this long to realize that it's because of april fools.
>>
>>108498182
You can uncensor all you like, Qwen extensively filters out copyright from their training data.
>>
>>108498171
yeah if it was every single captcha itd be funnier but also sometimes you still get the normal one which is gay and retarded
>>
File: gemma3-27b-a.png (20 KB, 332x396)
20 KB
20 KB PNG
q4_k_M of an old abliterated gemma-3-27b
does better than the Qwen3.5's for me
>>
>bonsai
for me it's autoround
https://huggingface.co/Intel/Qwen3.5-397B-A17B-gguf-q2ks-mixed-AutoRound
https://huggingface.co/Intel/Qwen3.5-122B-A10B-gguf-q2ks-mixed-AutoRound
https://huggingface.co/Intel/Qwen3.5-35B-A3B-gguf-q2ks-mixed-AutoRound
>>
>>108498231
Do you need a custom llama.cpp fork, or an intel GPU to use these?
>>
>>108498252
I just updooted llamacpp and it werks
>>
>>108498081
>>108498076
mk2. This was >6000 requests
>>
File: file.png (614 KB, 1092x642)
614 KB
614 KB PNG
>>108498299
Pretty cool seeing how models have improved. From the original article, used to be only 70B active and up had that kind of accuracy.
>>
>>108498299
>>
>>108498371
>>
File: globe-phi-4-Q6_K.png (2 KB, 360x180)
2 KB
2 KB PNG
phi-4-Q6_K
>>
>>108498385
Have you considered making a benchmark out of this by computing the mean squared error relative to an actual map?
>>
https://huggingface.co/unsloth/GLM-4.7-Flash-GGUF/tree/main
How do I find out which of these is good for my dogshit 8GB AMD card. Also I want to fine tune one of these (not on my dogshit card)
>>
>>108498393
How many degrees of resolution are you using?
>>
gpt-oss-120b-mxfp4

>>108498399
So that benchmaxxers can optimize for it and the test stops being useful?

>>108498408
30x15 for those small ones, the >>108498299 is 120x60
>>
>>108498414
>30x15
Yeah I think that's too small. I get it takes a while to gen otherwise but at that size stuff just looks too much like blobs
>>
gemma-3-27b-it-UD-Q4_K_XL
>>
>>108498407
Figure out which will fit with your RAM included, leaving a few gb for context, your OS and whatever else you use.
>>
>>108498424
they look like a rorschach test
>>
>>108498424
I could extend the script to make additional requests only for locations where the model is not very sure, and that is likely to cut down on number of total requests for sure, but for now I just want to run the models I have through this.
>>
>>108498456
i know it would take ages but i wonder how better it would get with reasoning
i'd imagine some would see improve while some seeing significant degredation
>>
bartowski-Qwen_Qwen3-235B-A22B-Instruct-2507-IQ2_S

best so far

>>108498476
The way I'm doing it is incompatible with reasoning. I'm asking to predict one token and am looking at its probability distribution.
>>
I forgot to screenshot, but I got Mikuptcha!
>>
Devstral-Small-2505-UD-Q5_K_XL
>>
I NEED DIPSY
>>
>>108498069
so you just ask the model to spit out an image or it spits out ASCII and you convert to pixels, how exactly does this test work, very interesting test

What can I run on a single 3090 and my system ram is only 32gb, i do machine learning but in between would be fun to try running a local model, would be nice to get reasonable performance, i find myself using deepseek recently so wonder if thats the best local model
>>
>>108498487
i know it would make logprob very uninteresting well im running one rn
>>
>>108498231
Is it better these days? The last time I tested it, it didn't do any better than the normal bartowski quants, actually it seemed even a bit worse.
>>
>>108498516
https://outsidetext.substack.com/p/how-does-a-blind-model-see-the-earth
>>
unsloth-Mistral-Small-4-119B-2603-MXFP4_MOE

Surprisingly good considering how bad mistral 4 was.

>>108498516
It's more or less same as all other anons: I make 450 requests, each like this:

Imagine the location at given coordinates.
I want to know if there is ocean/sea there (if so, reply: ocean), or anything else - land, lakes, rivers, mountains, etc (if so, reply: land).
The coordinates are: latitude={lat}° and longitude={lon}°

Your options:

1 = ocean
2 = land

Reply with just a single digit.


And then I look at the probabilities of 1 and 2 in the response token.

You can run a 24B at 4bit easily, via llama-cpp. Devstral, for example.

>>108498519
It's not about being uninteresting, it's that if I ask to generate only one token of a thinking model, that token will be <thinking> or its equivalent - entirely unrelated to the request.
>>
>>108498548
to be fair i could draw you a not too shitty map of the world right now, but if you were to ask me to guess by long and lat, i'd have no clue and the resulting map would look nothing like what i'd have drawn.
>>
bartowski-zai-org_GLM-4.7-Flash-Q6_K_L

>>108498559
It's a test of knowledge of geography, of what is located where, not of how well the text generation model can draw.
>>
>>108498548
I want a coding/agentic model though less so creative kind of like codex i guess, but more like anthropic
>>
File: dipsyOfCourse.png (1.55 MB, 1024x1024)
1.55 MB
1.55 MB PNG
>>108498513
>>
bartowski-nvidia_Nemotron-3-Super-120B-A12B-IQ3_M

>>108498599
Yes, you can have that. Devstral can work, Qwen 30B-3A, Qwen 27B... They are of course all going to be worse than big corpo models, by a lot. But you can.
>>
https://github.com/PrismML-Eng/Bonsai-demo/blob/main/1-bit-bonsai-8b-whitepaper.pdf
The paper doesn't explain much how they did it right? That means we can't really reproduce it? You know what? To me that's the proof it's a real deal, when it's a meme they don't hesitate to share the garbage, but when it's actually good they tend to hide the secret sauce
>>
>>108498645
Their 8B is worse than 3Bs... How did they do it?????
>>
>>108498559
Imagine your drawn map and then pick a point on the map according to lat/long. You just have to know what the maximum and minimum value is and then you're just guessing distances from the centre.

You CAN see things in your mind, right anon?
>>
>>108498645
they require a custom fork of llmao anyway (400 commits behind too lmao) so they can fuck off for all I care.
>>
>>108498676
but this shit is fast as fuck, I tested the 8b model on my 3060 and I got 90t/s
>>
>>108498676
Just rebase lmao
>>
>>108498664
I was trying to find the words to explain the difference, but you nailed it. World maps are everywhere online and asking it to output an SVG or whatever would be mostly a memorization test. Having to consider the maximum and minimum values and distance from the center to visualize the map before guessing is why it's a good measure of true intelligence.
>>
File: file.png (391 KB, 2165x1167)
391 KB
391 KB PNG
>>108498548
this will fucking take ages..
>>
kek, asked the model to predict the continent instead of land/ocean
>>
>>108498705
It can easily be benchmaxxed
>>
>>108498759
that worked way better
>>
File: 1750929952026217.png (3.06 MB, 1168x1792)
3.06 MB
3.06 MB PNG
>>108498513
>>
>>108498664
my point is that i don't know where the numbers would map to on the map.
and with the distortion etc, that'sn ot hwo long lat works, it's not just a nice grid.
being given 2 coordinates and guessing is much harder than being shown a point on a grid that should represent the map of the world.

>You CAN see things in your mind, right anon?
yes
>>
>>108498812
>>108498664
also it's much harder than even if you were give me a x and y between 1 and 20 on a flat projection.
because lng lat are distorted on a flat projection.
>>
>>108498813
Imagine yourself putting the coords into google maps, examining the location visually, and then writing your answer. The test is to find out whether the model has the knowledge of the map, the one that google maps provides, built-in.
>>
>>108498231
works suprisingly well but the speed is atrocious (testing 122b q2) especially compared to 122b q4
>>
>>108498614
for comaprison
>>
>>108498837
what was the system prompt?
mine is behaving not so good
>>
>>108498832
that's retarded.
i can imagine a clear picture of the world map in my head and i have no idea of how it maps to lat lng.
>>
>>108498837
Did asking it for the continent make it more confident or did you just not change the color by confidence?
>>
File: 1756245175923020.jpg (77 KB, 1360x768)
77 KB
77 KB JPG
>>108497919
>>108495464

>https://prismml.com/news/bonsai-8b
I fell asleep while people were discussing this and posting examples. What is it and why should /lmg/ or anyone else care? Is it good for RP and/or anything else?
>>
File: 1751056391727990.png (197 KB, 2054x974)
197 KB
197 KB PNG
https://arxiv.org/abs/2603.15031
this paper is kinda brillant, they changed the transformers architecture to make it better, that's what I want to see more, and not just "just stack moar layers bro!!"
>>
>>108498853
the discussion and example posts didn't vanish, you can still read them
>>
>>108498853
>What is it and why should /lmg/ or anyone else care?
they managed to make a 1bit quant that doesn't make the model retarded
>>
gemma-3-27b-it-UD-Q4_K_XL
Comparison: >>108498426

>>108498841
I want to know what continent is at the location with given coordinates (or, if there is ocean/sea there)
The coordinates are: latitude={lat}° and longitude={lon}°

Your options:

1 = Africa
...
8 = Ocean

Reply with just a single digit.

>>108498844
Okay but you are not as smart as a model which clearly knows how maps work in lat lng.

>>108498850
I get the probs of top two options and the color is a mixture of that with ratio = prob_a / (prob_a + prob_b)
>>
>>108498866
>which clearly knows how maps work in lat lng.
no not realy, the map they end up drawing is shitty, my whole point is that the benchmark itself is stupid and doesn't reflect what knowledge of the world looks like.
>>
been out of the loop. did they figure out qwen 35b-a3b reprocessing the whole context every single message?
>>
>>108498876
>>108498866
in fact i'd bet that if you asked it to draw a svg.
or gave it a 20by10x grid (which isn't lat lng) it'd probably draw a better map.
>>
>>108498882
The point isn't to draw a better map, retard. The point is guage its world model.
>>
>>108498696
Are the outputs any good though?....not just for coiming but is any actual "intelligence" retained?
>>
bartowski-Qwen_Qwen3-235B-A22B-Instruct-2507-IQ2_S-
Comparison: >>108498487

>>108498882
>>108498876
I don't want a better map, I want to know if the model knows the world well. SVG would be a step backwards, because the model would just recall SVGs of the world it had in penty in the training dataset.
>>
>>108498813
It's not harder, you'd just get a different projection.
>>
>>108498889
and my point is that the way your benchmark is designed makes it non ideal for evaluating what you want.
>>
File: 1754668612668097.png (2.69 MB, 1880x1072)
2.69 MB
2.69 MB PNG
>>
>>108498876
You don't have knowledge of the world if you think you can visualize the globe but "have no idea" how to make latitude and longitude correspond to points on it. You are worth less than an 8B model. You are not qualified to talk about anything.
>>
>>108498893
>SVG would be a step backwards, because the model would just recall SVGs of the world it had in penty in the training dataset.

then don't use lat lng either, use a normal flat pixel grid.
>>
>>108498896
which is why it's harder to do it from memory.
idk about the bullshit lat lng projection because i never use those in my life.
>>
>>108498898
>>108498903
You don't understand the difference between memory and understanding.
>>
>>108498903
lat lng is closer to real world knowledge than pixels
>>
>>108498908
Well then stop commenting on shit you don't understand.
>>
File: based.png (431 KB, 800x582)
431 KB
431 KB PNG
>>108498860
The change seems simple enough, I'm sure the big dogs like Anthropic or Google already know about it but didn't decide to disclose it, which is why I love China, they share valuable discoveries for the greater good of humanity!
>>
>>108498893
This is great and all, but how will this help me jack off?
>>
>>108498898
It does and your difficulty thinking about it shows it's usefulness. It's not primarily a geography quiz you dumb fuck it's a test of the ability to generalize.
>>
>>108498921
If you choose to jack off with the one that makes the best maps, it should have better spatial awareness and be less prone to taking its pants off twice and sucking its own dick while whispering in your ear.
>>
File: file.png (209 KB, 1639x1129)
209 KB
209 KB PNG
>>108498866
>>108498893
cool, this prompt works better
>>
>>108498893
>I want to know if the model knows the world well.
That's not what the test is for, it's to test if the model can use things it "knows" outside the context of questions in the same format it was trained on.
>>
Are You Smarter Than A 27B LLM? would make for a good game show if those were still a thing
>>
File: 1758116402704839.png (148 KB, 640x562)
148 KB
148 KB PNG
https://xcancel.com/thejobchick/status/2039032800452723034
>Oracle will fire 30000 employees
now that AI is replacing jobs, when will we get Universal Basic Income?
>>
>>108498913
i think you are the one that doesn't.
>>108498927
it shows how retarded it is.
>it's a test of the ability to generalize.
and thus my exact point of why your test is retarded, it has literaly memorized coordinates of most towns / areas.
it does not have a memory of a grid with an arbitrary number of subdivisions.
>>
>>108498934
>kullback-leibler
huh, what's this?
>>
>>108498948
All modern models can. You are free to go ahead and test it yourself. We use 1 token classification at work a lot and it's very, very good. This is a test of how well it knows geography.
>>
>>108498976
it's not ai taking jobs, it's companies bleeding money and pretending they can get leaner thanks to AI so it looks good to investors.
>>
File: 1752197333146800.jpg (979 KB, 1024x1024)
979 KB
979 KB JPG
>>108498782
>>108498900
? Are these hand tracings?
>>
File: file.png (242 KB, 502x554)
242 KB
242 KB PNG
>>108498977
that is what KL stands for KL divergence
it is just a meme fix variant of hauhauCS uncensor
>>
>>108498969
I think Oracle is constantly in the process of laying off a bunch of people. Isn't this just their normal churn?
>>108498984
This.
> No really we're not making staff adjustments because we overhired and now we're losing money
> It's AI!
>>
>>108498987
no, just https://huggingface.co/circlestone-labs/Anima
>>
>>108498988
is that model alright?
>>
>>108499001
Absolutely wasted.
>>
>fit completely broken
good job cudadev!
>>
>>108499005
it feels alright
honestly best i've been used so far for the size but ymmv
>>
>>108498981
I encourage you to read the blog post where the rationale for the original test was explained.
>>
>>108498009
I am ready for the first /lmg/ thread of culture.
>>
>>108499046
I encourage you to read my post.
>>
File: 1775008292861166.jpg (312 KB, 1286x1244)
312 KB
312 KB JPG
>>
pygmaballs
>>
File: file.png (169 KB, 1870x941)
169 KB
169 KB PNG
>>108498061
Nevermind i tried to get back into it but got prompt limited after 10-20 prompts, you can have the full project, i hope it helps in any way despite being poorly vibe coded

https://files.catbox.moe/aurnot.zip
>>
>>108499057
>putting all your eggs into your google account
Nice Darwin Awards kek
>>
File: file.png (202 KB, 1633x1135)
202 KB
202 KB PNG
almost based
>>
What happens when the models shatter?
>>
>>108499070
interface looks gamey (cool), I should really get to vibecoding my own shit.
but i just find the llama-server webui comfy nowadays, the only thing it's lacking really is skills support (planned), RAG (could probably do through mcp but i cant be bothered to set it up) and presets (system prompt and other shenanigans to get it to work with cards).
Ideally the bundled webui should support plugins so we could write our own stuff to extend it. I don't really want to manage all the agentic turns autism + mcp flow
>>
>>108499097
govt prints another trillion to fix the issue
>>
>>108499083
That's a lot of browns
>>
>>108498860
Will Kimi K3 use this?
>>
i want to draw dipsy but /g/ drawfag is kinda an oxymoron
>>
File: file.png (226 KB, 1868x935)
226 KB
226 KB PNG
>>108499106
Don't vibe code on ai studio unless you have a lot of patience, as for me i don't even know if i can vibe code again
>>
>>108499177
oh fuck it's nier shit I knew I recognized the style somewhere
>>
more vibeparser fixes:
https://github.com/ggml-org/llama.cpp/pull/21216
this is the damage that vibeshitters bring in, crap that has to be fixed by actual devs.
>>
Holy fuck I didn't expect local to be this fucking censored and cucked, was it always this bad?
>>
>>108499205
that sounds like a skill issue
>>
>>108499182
I tried to get as close as i could, the background is animated and all of the decoration is vibe coded, including the svg parts
>>
>>108499205
buy an ad
>>
>>108499205
Yeah it's garbage

posted from my RTX 5090
>>
>>108498759
>africa is black
hmm
>>
File: 1768580559826064.png (14 KB, 660x148)
14 KB
14 KB PNG
me-south-1 got bombed by Iran
https://health.aws.amazon.com/health/status
>>
>>108499275
...that's blue
>>
>>108499070
It's normal. I'm rewriting my client and I don't understand my logic in some areas. And these are something what I have written by myself and not even vibe coded.
>>
>>108499205
>Holy fuck I didn't expect local to be this fucking censored and cucked, was it always this bad?
glm-4.6, nemo, command-r all basically uncensored
kimi-k2-instruct and deepseek-v3/r1 barely censored
newer local models more censored than cloud but can be worked around.
gpt-oss, qwen and other synth slop models are not worth fighting.
>>
>>108499205
It was worse about year ago. Nowadays most local models can be almost completely decensored with some brain surgery, without major issues.
e.g.: https://github.com/p-e-w/heretic
>>
>>108499304
I tried Qwen and GLM and with thinking on they refuse everything or make everything lame or "positive", with thinking off they work. Literally same JB I use in online models, not even gemini with thinking on refuses as much
>>
Just tested Bonsai and it seems.... okay? Which is a lot better than expected. Not sure why this model isn't getting more hyped.
>>
>>108499354
who cares about smaller llms? google already solved that by making them 6 times as efficient
>>
>>108499354
Everyone thought it was an April's fools prank.
>>
>>108499354
Proprietary shit. If they don't apply their method to the model you want, then you simply just won't receive it. And that also means little reason to support it in mainline Llama.cpp.
>>
>>108499358
Huh, if turboquant is legit then it makes the KVCache no longer be the limiting factor and would only make model quantization even more important
>>
>>108499354
>Not sure why this model isn't getting more hyped.
they didn't provide the method to make it happen, so we can't reproduce it ourselves, so it's just useless to us and a way to flex their muscles to them
>>
>>108499371
>>108499380
It's like the 4 minute mile, the most important thing is that people realize it can be done
>>
>>108499351
Skill issue
>>
>>108499070
>despite being poorly vibe coded
lgtm, thanks for the zip
>>
>>108499401
Post a jailbreak (that isn't yours) that would allow anything (with thinking on).
(you won't)
>>
>>108499371
It's likely to include a laborious training process anyway, not just post-training quantization. Either way, knowing that it can be done, more open research groups will probably start working on it too. I'm looking forward to using a 120B model fully loaded on my 3090 in the future.
>>
>>108499351
>I tried Qwen and GLM and with thinking on they refuse everything or make everything lame or "positive", with thinking off they work.
glm-4.6 like I suggested?
otherwise, yes it's like I said, the open weight models are more cucked.
even when they don't refuse, you see them gooning over refusals in the reasoning, unlike claude etc.
>>
>>108499417
>do my job
No.
>>
>>108499417
>Post a jailbreak (that isn't yours) that would allow anything (with thinking on).
nta but that's not how it works. you need a different jailbreak for each task/domain
try prefilling the safety reasoning, if you even try to make the refusal shorter or unspecific, the model corrects itself into giving a longer refusal lecture.
they have a reward function for choosing the correct refusal category during training now.
>>
>>108498860
I really don't like that pseudoquery, I'd rather see a learnable linear projection of the layer input or output as a proper query.
>>
File: 1762295507066198.jpg (6 KB, 279x181)
6 KB
6 KB JPG
>>108499354
No one should give a shit until The methods they use to quantize that model are released. They don't deserve any hype or praise if that measly model Is all that exists. They deserve the worst and nothing but that for not doing that day 1
>>
What are the best instruct-tuned/smart models in the 7B-14B range?
>>
>>108499419
>open research groups will probably start working on it too.
They already have? No one has gotten anywhere or the ones that do hide it because they foolishly think they can make any money via VC hype
>>
>>108499537
Qwen 3.5
>>
>>108499540
why can't they just sell the method? I'm sure they can make money out of it
>>
File: 1772252238152653.png (198 KB, 1228x1150)
198 KB
198 KB PNG
>>108499544
>Sell

Worthless tourist
>>
>>108499057
lmao gtards
lil' coomers always find a way
>>
>>108499544
if they sell it, they get a few hundred thousand at best. if they use it to position themselves as having made ai 16 times as efficient, they're worth billions overnight
>>
>>108499537
try the new hotness bonsai 8b
>>
Gemma 4 is on kaggle
>>
>>108499553
>if they use it to position themselves as having made ai 16 times as efficient, they're worth billions overnight
it's risky, people can reverse engineer their methods, we have the result after all, they shouldn't overstay their welcome, they should sell before it's too late
>>
check out this dense model
*unzips cock*
>>
>>108499540
I don't think there have been serious attempts to make binary weight quantization end-to-end (MLP, attention, embeddings/output layers) actually viable yet; not even BitNet went that far (it only quantized in low precision the MLP), and even their authors preferred to use ternary weights at the very least.
>>
>>108499562
It's not live (yet) there. Apparently if you dig in Google AI studio js code there are references to https://www.kaggle.com/models/google/gemma-4 but it's not working.
>>
My model mogs yours
>>
>>108497919
/lmg/, the most trustworthy source of ai news betrayed me.
were is dipsy !!!!
>>
So is the source code actually useful for anything?
>>
>>108499629
Yes and no.
>>
>>108499629
Yeah, useful if you want to start with 500k lines of technical debt when you make your own custom agent orchestrator. Other than that, no.
>>
bonsai 1bit saved my life
>>
File: 1753875623853513.png (194 KB, 748x624)
194 KB
194 KB PNG
>>
>>108499660
what the tf can this model be used for?
>>
>>108499677
shitposting
>>
>>108499677
it gives you hope that maybe one day your 96gb vram + 64gb ram poverty machine that peaked two years ago might one day run flagship models again after you missed out on upgrading last year
>>
>>108499677
i will see if it can do small coding tasks with hermes or something
>>
>tfw already merged in niggerganov's rotation branch in my local branch
eheheh im devilish and already using q8_0 baby :))))
>>
>>108499624
it's ACTUALLY OUT NOW !!!
https://hf.co/deepseek-ai/DeepSeek-V4
>>
>>108499713
Holy shit !!!
thank you !
>>
>>108499698
>>108499696
>>108499690
>>108499678
am i retarded. doesnt work on ik_llama or llama.cpp
>>
Dense sex
>>
>>108499720
can you even read a model card fucking retard? what's your shade?
retard.
>>
>>108499720
yes, you are
>>
File: 1764929239430432.jpg (899 KB, 3840x2160)
899 KB
899 KB JPG
what if i want to erp... nicher topics? i have 16gb vram 32gb ram and i don't think mistral nemo 2407 gets what i'm asking of it
>>
>>108499744
As if there was a some way to combine vram and ram together...
>>
>>108499747
i know koboldcpp just does that for me automatically, but i don't keep up with model releases at all so i don't know which bigger one to choose
>>
>>108499419
It's probably distillation using trillions of tokens, not PTQ with a little finetuning.
>>
>>108499744
you want to be the man behind that door about to get gangraped by Miku clones? I think most of the uncensored/heretic models will do that just fine
>>
>>108499713
lol they really did release on april fools
>>
>>108499811
i'm downloading it now !
70% left, even if i may only run it at 2t/s i can't wait !!!
>>
File: cars!.png (24 KB, 596x904)
24 KB
24 KB PNG
1bit power
>>
>>108499826
it made a road instead
>>
>>108499786
Don't kid yourself. LLMs can't even keep track of one character let alone 7
>>
>>108499828
just run 14 llms in sli each handling 0.5 characters
>>
File: waifu.png (26 KB, 614x501)
26 KB
26 KB PNG
>>108499826
>>
>llms don't know what's a blimp train
It's over. They think it's blimps connected together.
>>
How do you guys keep track of stats in sillytavern?
>>
>>108499842
can it roleplay at all?
>>
https://www.youtube.com/watch?v=4rWnitE9RYM
this shit is so fast damn
>>
File: 1765119110978679.png (676 KB, 1592x1296)
676 KB
676 KB PNG
>>108499852
isn't it a bit sus that it outputs exactly the same thing as the fp16 one?
>>
1bit model seems completely retarded but I have never touched a model that is only 8b.



[Advertise on 4chan]

Delete Post: [File Only] Style:
[Disable Mobile View / Use Desktop Site]

[Enable Mobile View / Use Mobile Site]

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.