/g/ - Technology


/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107741641 & >>107731243

►News
>(12/31) HyperCLOVA X SEED 8B Omni released: https://hf.co/naver-hyperclovax/HyperCLOVAX-SEED-Omni-8B
>(12/31) IQuest-Coder-V1 released with loop architecture: https://hf.co/collections/IQuestLab/iquest-coder
>(12/31) Korean A.X K1 519B-A33B released: https://hf.co/skt/A.X-K1
>(12/31) Korean VAETKI-112B-A10B released: https://hf.co/NC-AI-consortium-VAETKI/VAETKI
>(12/31) LG AI Research releases K-EXAONE: https://hf.co/LGAI-EXAONE/K-EXAONE-236B-A23B
>(12/31) Korean Solar Open 102B-A12B released: https://hf.co/upstage/Solar-Open-100B

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
►Recent Highlights from the Previous Thread: >>107741641

--Supermicro motherboard sales policy and replacement challenges:
>107744417 >107744467 >107744480 >107744500 >107744548 >107744553 >107744590 >107744611 >107744699 >107744717 >107744768 >107744844
--Meta AI scandal: Llama 4 benchmark manipulation exposed:
>107742242 >107742271 >107743133 >107743280 >107742275
--IQuest-Coder-V1 benchmark integrity issues and practical applications:
>107742235 >107743431 >107743614
--Prompt engineering techniques for improved roleplay interactions:
>107747567 >107747780 >107747825 >107747855 >107747860 >107747908 >107747923 >107747911 >107747968 >107748017 >107748036
--Debugging context retention issues in sillytavern with llamacpp:
>107746907 >107747135 >107747532
--CPU offloading optimizations and performance trade-offs in MoE inference:
>107741871 >107741943 >107742076 >107742177 >107742352 >107742998 >107743060 >107741954 >107741968 >107742100 >107742280 >107744368
--Critique of model reasoning policies and distillation practices:
>107745990 >107746077 >107746089 >107746109 >107746163 >107746209
--EPYC CPU upgrades and memory bandwidth limits:
>107746355 >107746362 >107746385 >107746406 >107746464 >107746620 >107746633 >107746778 >107746721 >107746799
--GLM Air's positivity bias during violent roleplay scenarios:
>107744624 >107744639 >107744697 >107745042 >107745153 >107746662
--ERP model recommendations for 24GB VRAM users and Gemma critiques:
>107743438 >107743616 >107743728 >107743769 >107743803 >107743827 >107744179 >107744257 >107743816 >107743998 >107743949
--AMD PC setup for image-reactive chatbot using JoyCaption and Qwen3-VL:
>107745647 >107745702 >107745883 >107745933 >107745959 >107745971 >107746329 >107746340 >107746380 >107746618 >107746699 >107746774 >107746842 >107747353 >107745966
--Miku (free space):
>107746043 >107746206

►Recent Highlight Posts from the Previous Thread: >>107741646

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
I wanna preggu the migu
>>
What is the reason why this thread separated from /aicg/? Seems like there are 5 posters here.
>>
>>107749596
>(12/31) Korean A.X K1 519B-A33B
>not on OR
>no official way to talk to it
>some super special original architecture that nobody's ever going to bother to implement in llama.cpp
I guess we'll never know if this model is good or not
>>
>>107749641
One's a thread about running AI models, the other is about drinking your own piss to get access to a proxy. They evolved into two very different things after the llama1/3.5-turbo era split.
>>
>>107749667
What do you mean? Any time someone outside of the 5 autists posts here they'll get scorned.
>>
>>107749650
>>some super special original architecture that nobody's ever going to bother to implement in llama.cpp
>A.X K1 incorporates an additional RMSNorm applied after the MLP (MoE) block in each Transformer layer.
Would that really be so difficult to implement?
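Probably not. For reference, an RMSNorm is only a handful of lines; here's a minimal C sketch of what one extra post-MoE norm amounts to (my own toy code, names made up, not the actual llama.cpp implementation):

#include <math.h>

// y = x / rms(x) * w over one hidden vector of size n, eps is a small constant like 1e-6
void post_moe_rms_norm(const float *x, const float *w, float *y, int n, float eps) {
    float ss = 0.0f;
    for (int i = 0; i < n; i++) ss += x[i] * x[i];           // sum of squares
    const float scale = 1.0f / sqrtf(ss / n + eps);          // 1 / rms
    for (int i = 0; i < n; i++) y[i] = x[i] * scale * w[i];  // normalize + learned weight
}

llama.cpp already has the norm op itself, so assuming the model card is accurate the work would mostly be wiring one extra weight tensor into the graph and the convert script.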
>>
It's also strange that llama cuda dev (he is not employed by a company btw) uses a trip and posts in chronic masturbation threads as himself. Not a good look for his future career.
>>
>>107749695
>he posts chronic masturbation threads
???? did I miss something
but yeah it's crazy to post with a known identity in this thread considering his real name is on his github account
I personally wouldn't want to be known to be around your degenerate lot
>>
>>107749723
Yeah he often posts without his trip. I guess he's too autistic.
>>
>oh no he posts in le 4chan
are you serious guys?
>>
>>107749748
>are you serious guys?
yes.
this thread is full of sexual degenerates that belong to the ovens
I wouldn't be here if it wasn't one of the few places worth visiting for LLM news and conversations, disgusting filth
>>
>>107749650
all of the other korean models were benchmaxxed copycat scams of chinese models, what would make this one any different?
>>
>>107749695
As of right now I am indeed not doing paid work for any companies but the primary reason is that for my goals I don't think more capital would currently be very useful.
Long-term I intend to keep working in particle physics, the key to employment there (and probably other fields as well) is to have connections, so I'm not particularly concerned.

>>107749742
I have never been diagnosed with autism though I would not be surprised if I was.
>>
>>107749763
The best models are accidents that come out of nowhere. There is a 99.9% chance that this is more benchmaxx'd trash but the chance that it's actually decent exists.
It's also at a size where it can be a contender against all the other big 30~40b active parameter MoE models which makes it interesting.
>>
>>107749797
Are they though? Most of the actually good models had plenty of good research beforehand. This isn't the llama2 days when no one knew anything about anything. You can copy a successful arch like kimi did with deepseek and scale it up a bit, for example, but I wouldn't say it's better in every single case.
>>
Can I offload like half of the kv cache and keep the other half on the gpu?
>>
>>107749792
Very cute answer.
>>
>>107749792
llama.cpp split mode graph when?
>>
>>107747567
>>107747860
>a whole fucking sillytavern addon just to mimic what you get in mikupad for no effort
I've been telling you for ages that text completion in mikupad is the superior experience.
>>
quick start guide
>go to huggingface and download nemo 12b instruct gguf. Start with Q4.
Which one is 12b? I don't see it on the list. If that's a typo for 12gb there isn't one that size either.
>>
>>107749866
mistral nemo is a 12 billion parameter model
>>
>>107749866
the recommended models list has the links you need
>>
>>107749866
https://huggingface.co/bartowski/Mistral-Nemo-Instruct-2407-GGUF/tree/main
Just download whichever fits on your gpu with a few gb to spare.
>>
>>107749857
I'm not looking at the IK repository so I don't know the exact features that make up "split mode graph".
If I can I'll produce a working prototype for better parallelization of multiple GPUs by January 12.
>>
File: 1756786919372712.png (43 KB, 577x280)
>>107749870
Thanks
>>107749872
>>107749880
Yeah, that's where I got the huggingface link. There's multiple Q4 options here. I have an 8gb card. Are they targeting different vram specs? If that answer is in the image then I can't read it.
>>
>>107749907
>I have an 8gb card
Rough.
Download Q4_K_M. Part of it will have to stay in ram and it will be slow but that's the best you're getting with that kind of hardware.
>>
>>107749920
How fast is fast? Seems like your machine is so slow that you are afraid to run even a 14b model.
>>
>>107750001
Depends on how many layers you can put on the GPU which in turn depends on how much context you want.
4k context with all layers takes up 8.2GB here. You probably need some space for other programs as well.
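If you end up on llama.cpp, the two knobs are -ngl (layers on the gpu) and -c (context size); something like this is a reasonable starting point for 8GB, then nudge -ngl up or down until it stops OOMing (the 30 is just a guess, nemo has 40 layers total):

llama-server -m Mistral-Nemo-Instruct-2407-Q4_K_M.gguf -c 4096 -ngl 30

kobold's launcher exposes the same thing as the GPU layers setting.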
>>
File: 1747010051687818.png (101 KB, 779x686)
>>107749920
Alright I've got everything installed and running (I think). The guide says:
>connect it to SillyTavern using the API link provided by the back end.
Where on ST do I plug in the /api link the backend spat out?
>>
>>107750141
That's beneath the preset window you have open right now.
>>
File: file.png (53 KB, 977x410)
llm.c, claude paypiggie, fishboy i think this one's for you
>picrel is solar 100b open
>>
File: file.png (157 KB, 708x798)
trying to make a consistent jailbreak for solar open, got this
>>
>>107750213
UGH THE NOSTALGIA
TAKE ME BACK
>>
>>107750213
>disallowed content
>policy
why the fuck is everyone distilling from gpt-oss
>>
What metric would you use to compare different architecture models' performance on the same text dataset? I suspect that nemo would perform better on human-written fiction than any recent chinese moe.
>>
>https://rentry.org/lmg-lazy-getting-started-guide
>I recommend starting with the official instruct nemo tune before moving onto other tunes or merges.
So am I ready to go or is a "tune" another thing I have to learn about and install before beginning?
I just want to plug in characters from chub and go.
>>
>>107750301
no, tuning is just additional training and it's not something regular users do
this nemo tune is the gold standard but it makes some anons mad
https://huggingface.co/bartowski/Rocinante-12B-v1.1-GGUF
>>
File: file.png (174 KB, 971x846)
>>107750213
I don't have solar, how does it react to something like this? >>107749858
Also why even bother when GLM exists?
>>
>>107750247
there's a crazy coterie of benchmaxxing hacks who distill from small models that should never have been considered a target for distillation
even NVIDIA is a member of that retard club, their nemotrons are made with data that was genned by models like Qwen 30BA3B lmao
>>
File: file.png (154 KB, 652x1080)
>>107750339
>Also why even bother when GLM exists?
solar 10.7b was sexooooo, i want a different model so badly, ive been on air since august
fimbulvetr v2 was super sex
Rating: Explicit
Characters: Brother, Sister
Summary: This is a roleplay transcript. The sister molests her brother after he falls asleep.

---

Brother: OOC: The scene starts with my falling asleep on the couch with my dick in hand and cum drying on my stomach.
had me writing this shit anon, fuck you
>>
>>107750247
maybe in their delusions they think that toss is somewhere close to the mainline gpt models
>>
>>107750324
fuck off drummer youre not fooling anyone
>>
I can't tell which is worse between drummer spammers and NAI users
>>
>>107750484
if I were drummer I would shill more recent models I don't like
>>
>>107750463
Damn it went straight to it.
>>
>koboldcpp-nocuda
>0.4% GPU usage in task manager
Is this normal nocuda behavior or did I fuck something up?
>>
>>107750540
maybe the gui is eating some gpu performance, nocuda just means it won't touch your gpu for inference
>>
>>107750206
>i don't have personal life
How rude, they are little brains living in your computer.
>>
intel can't run MoE to save its life, I've quantized several larger MoEs to generally 44~72gb and none of them fit on 96gb split across four cards, vLLM just shits its pants
>>
>>107750565
install linux
>>
>>107750540
The framebuffer will always reserve some three hundred+ mb of gpu vram and show a little usage.
>>
>>107750579
I could've been clearer but you're also retarded anon, I am on linux, intel = intel GPUs.
>>
>>107750588
at least im not australian
>>
>>107750595
How's that working out for you? still cheap in my region, most things are.
>>
File: file.png (50 KB, 722x185)
>>107750612
:3
>>
srbe na vrbe
>>
>>107750620
zamn
>>
>>107750550
>>107750585
My problem is I'm trying to find the one that actually uses my GPU. Neither version goes very far above 0%, and they're quite slow.
>>
>>107750648
You're definitely doing it wrong with nocuda then if you're on nvidia. Did you configure the offload layers?
>>
>>107750648
If you just used llama-server like a sane person you'd get some useful output to help you figure out the issue.
>>
>>107749596
Dumb question: What's the best model to run locally if I have a 5800X3D, 128GB RAM and an RTX 4070 (12GB VRAM) for Vibecoding?
>>
>>107750660
The best model you could theoretically run is GLM 4.7 but it's going to be too slow to be useful for coding.
>>
>>107750660
a proxy/API, you won't get good enough token/sec for a large coding model with those specs.

maybe devstral2 123b but it will be painfully slow.
>>
File: solar.png (2.41 MB, 2002x9009)
SOLAR... i kneel
>>
>>107750665
>>107750671
Fuark.
What hardware would I need to run some proper state-of-the-art vibecoding setup?
>>
>>107750648
Get the vulkan or cuda binary. If you're on linux you'll need to compile the cuda build on your own.
Sometimes the gpu usage % isn't that high but if it's utilized in the right way the vram should be nearly full.
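If you go the llama.cpp route instead, the cuda build is just this (assuming the CUDA toolkit is already installed; the binaries land in build/bin):

cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j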
>>
>>107750676
a credit card and a copilot subscription for free grok fast or 10 bucks on openrouter to use devstral2 free

you're looking at like a BWP 6000 or four-ish 3090s, maybe a mac unironically if that MLX shit gets any better
>>
File: file.png (11 KB, 366x195)
>>107750676
I'd argue that this gets you 90% of the way there in terms of what is possible locally.
Proprietary stuff is still better.
>>
I wasted a lot of time watching an LLM talk to itself using a browser tool
>>
>>107750698
I'm using Gemini already, the point is to get off of the grid and start using my own hardware. But if this is BWP 6000 Territory I think I might stick to Gemini.
>>107750704
SWEET BABY JESUS, that's more VRAM than my system memory. What did you pay for that, 30k?
>>
>>107750704
>Proprietary stuff is still better.
Not that guy but if I have 12G vram, do I just pay for claude?
>>
https://youtu.be/ILtz5nX3_fc
>>
>>107750745
'fraid so, unless you are fine with nemo
>>
>>107750745
qwen 30b a3b coder is as good as claude and can run entirely in ram and give fast speeds
>>
File: file.png (63 KB, 623x255)
pedoanon.. i kneel
>>
>>107750757
Yes, NUMA is also on the list.
I will do a generic implementation that is agnostic to the specific ggml backends being used to set a standard for how the implementation should work.
So for example, things like parallelizing multiple machines via the RPC server should work out-of-the-box.
But I will use the hardware that I already have and that I'm more experienced with for the initial development.
NUMA in particular will probably need some extra support to properly set up multiple CPU backends.
I've already contacted a seller on Alibaba for DDR5 memory (and also an NVIDIA A16).
>>
>>107750790
It sucks watching only 50gb/s out of 200gb/s bandwidth being used during hybrid inference. Didn't matter so much when most models were dense and had to fit on GPU.
>>
>>107750790
Thank you my dear. You are very gifted anon.
>>
>>107750790
Which parts of the inference are not currently parallelized but could be?
>>
verdict: solar open 100b's pretraining data is not prefiltered, however it's heavily censored and positivity slopped on top
salvageable? maybe
>>
>>107750822
For a very large number of concurrent prompts I think the current pipelining setup (--split-mode layer) could be utilized better by running multiple evals in the llama.cpp HTTP server.
For a single concurrent prompt the attention should be parallelized by attention head and the FFN similar to the current --split-mode row but fused to reduce synchronization.
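To illustrate the head split with toy numbers (not ggml code, just the partitioning idea; compile and run it if you want):

#include <stdio.h>

// 32 heads spread over 4 GPUs: each device gets a contiguous slice of
// heads and computes attention for its slice independently; the outputs
// then only need one concat/sync at the end instead of a sync per row split.
int main(void) {
    const int n_head = 32, n_gpu = 4;
    for (int d = 0; d < n_gpu; d++) {
        const int h0 = d * n_head / n_gpu;        // first head on this device
        const int h1 = (d + 1) * n_head / n_gpu;  // one past the last head
        printf("gpu %d: heads %d..%d\n", d, h0, h1 - 1);
    }
    return 0;
}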
>>
File: file.png (74 KB, 971x445)
im getting convinced this is actually llama4 continued training instead of glm kek
>>
>>107750916
That's hilarious.assistant
>>
>>107750704
>Proprietary stuff is still better.
it's a trillion times better
frankly in real use (not the stupid benchmarks) I find the current crop of deepseek, glm and qwen to be vastly inferior to Gemini 2, and let's not even begin to compare to 3 which completely obliterates them.
local for this is beyond stupid
even online is still somewhat stupid, I vibe a lot for quick throw away scripts but not so much for real work, I'm still often flabbergasted by the sort of dumb shit they pull out, even when the code does work it might do things that just aren't idiomatic in the language / less maintainable and flexible in the long term
>>
>>107750950
They are fine if you treat them as autocomplete.
Describe the code that implements feature x instead of asking it to implement feature x.
The size of the model mostly determines how much you can leave unspecified before the model starts producing unmaintainable garbage.
>>
>>107750950
they are all distills of gemini and claude, it's just too tempting for these labs since it's so easy and cheap
the last time we've had actual innovation was the first r1 release
hopefully v4 will be huge
>>
>>107750950
I'm finding LLMs to plateau myself. If you use cloud models enough they give you just as much retardation.
>>
>>107750950
I use devstral 24b for boilerplate and it works perfectly. The amount of retards in here that say "its shit cos it won't completely do my job for me" astounds me; not even the best SOTA API model is that good and by the looks of it they never will be. Vibecoding is a meme for nocode retards. Small coding models work fine if you know how to code and thus know how to prompt for code. Big API models are still more useful for more complex things like experimental refactors, prototyping and bugfixing, but even then you still need to nudge them in the right direction and unstick them when they get stuck (you still need to work)
>>
There is an alternate hellscape universe where Google never released the transformers paper and kept working on it themselves.
>>
>>107751197
There is an alternate hellscape universe where resnet and u-net weren't discovered so Google didn't work on transformers
>>
>>107751197
>higher ups don't see the potential
>gets shelved and goes to google graveyard
>nothing ever happens
the only bad thing is that we don't get the coom machines of today, otherwise it would be a net positive and sam's jewish ass would remain obscure and hidden
>>
>>107751243
I would legit make the trade of LLM not existing if I could ensure sam altman would never succeed at anything in life.
>>
>>107751197
>>107751238
There is an alternate hellscape where NVIDIA didn't invest into CUDA in the early 2000s and everyone is using ROCm instead.
>>
>>107751197
yes and it's a world where we're 5 years ahead of where we are right now because the slopped llm dark ages never happened
>>
>>107751243
Maybe someone comes up with a different actually good architecture instead of focusing on transformers. We have sentient waifus stroking our dicks and post scarcity.
>>
>>107745181
I just loaded the bf16 with 98k context and I couldn't bring myself to even get to 4k tokens, this garbage is what anons were cooming to in 2023?
>>
>>107751376
No, 8b 3.3 is a pathetic model that Meta released earlier this year as a proprietary exclusive for their AI/finetuning service. No fucking clue why they picked this of all things to keep locked up before it got "leaked" though.
But yeah, llama2 70b probably wouldn't hold up very well these days either. You wouldn't believe how nice we have it today with the big MoEs.
>>
>>107751385
>Earlier this year
Anon I...
>>
>>107751484
he hasn't processed the new year yet, his brain runs like the original deepseek R1 on a cpu maxxer celeron config
>>
If you cut off the cock and balls from a character and fuck it in the ass with it, that character will still somehow cum from being ass fucked like a faggot, because AI is retarded.
>>
>>107751640
Should've cut the prostate out too.
>>
>>107751640
can you not be a normal person
even if you have to be female brained and coom to text
>>
File: d0b459c235c80c41.png (75 KB, 1000x1000)
https://files.catbox.moe/ld4kax.patch
Wildcards for Kccp antislop sampler, used like this
"sen{10} down{3}back"

thanks opus
>>
>>107751788
>even if you have to be female brained and coom to text
Name a SINGLE of masculine AI usage
>>
>>107752204
Using an uncensored local model to generate instructions on how to cook [REDACTED] for the purpose of [REDACTED]
>>
solar open keeps crashing for me. RIP llms are gay anyway
>>
File: file.png (1.71 MB, 1280x1280)
>>107752204
The masculine urge to generate cute pictures of Migu.
>>
>start RP with a smarter model at low context
>Switch to another model with longer context when I hit the first's limit
>It picks up the prose of the original model from the context
GG ez
>>
>>107752251
Figured out the issue, kind of. So before I went to take a dump, during my suno prompt test, it spent more time pondering how to follow the simple fucking instruction and zero time pondering what stylistic elements would provide the desired sound. Not very promising.
>>
>>107752290
>cute
half way to ahegao not cute
>>107752335
>test-time distillation lol
In early days anons were priming first responses on API before going local
It's all the same hardware ops just many more of them as models increase in size, imagine a new arch that can scale low level ops fluidly and "intelligently" adjust during pass (perhaps uncertainty/DESIRE TO COMPUTE MORE can be trained after a couple initial layers, like moe gating)
>>
>>107751683
That's just weird, though you have a small point.

>>107751788
No. I'm a psycho that likes to inflict suffering on humans. That's what AI is for. Simulating how to best make my victims suffer.
>>
Is it even worth using glm 4.7/4.6 for RP if the best I can run is a q2 quant
>>
>>107752749
Yeah
>>
>>107752749
unironically yes. retard quant of 4.6 has so much more sovl than a q6 of glm air
>>
Hugging face won't let me download models right now :(
>>
>>107752749
If it's IQ2_M or better, sure. If it's lower than that, then it depends.
>>
>>107752832
Download what? You already have Nemo on your computer.
>>
>>107752749
>inb4 4.6 vs 4.7 aggro
vs. running what? Compare for yourself but probably yeah, tho I feel shame running a <4bpw quant; satisfied with 4.7 IQ3_M
With some patience you can likely squeeze better performance. What's your bottleneck/specs?
>>
>>107752873
want to try the new nemotron nano
>>
File: cockbench.png (1.9 MB, 1131x6568)
>>107752906
Why?
>>
>>107752916
That's exactly why. I like what it wrote there, and I'm curious to see what it will write with more guidance.
>>
File: 1737694546160700.jpg (269 KB, 928x1232)
>>107749596
>>
about to pull the trigger on an rtx pro 6000. I think with the ram prices going up, it will be the next 3090. thoughts?
>>
If you paralyze a character from the neck down and then kick that character in the ribs, the AI will write that the character winces in pain like the fucking retarded piece of shit it is. AI is a fucking joke.
>>
>>107752983
>it will be the next 3090
Except for the price of a single 6000 you can buy 10x 3090s and get 240GB of VRAM.
>>
>>107753015
>Save money on vram
>Lose all your belongings in a house fire
Does a PC motherboard even exist that can run that many gpus at full speed
>>
>>107753048
Any old bitcoin mining rig?
>>
>>107753015
yeah but it's slower and you can use the single large vram for things you can't do with multiple cards
>>
>>107752883
96gb ram/24gb vram
>>
>>107752916
I never understood what the point of the cockbench is other than spotting the obviously censored models

Is more cocks better? Do you love cocks?
>>
>>107753109
the overall distribution can tell you quite a bit about the model
>>
File: nemotron nano.png (115 KB, 844x713)
>>107752916
See, it ain't so bad?
>>
>>107753109
any sane person knows that it's not thigh or ass or ...
models that don't are worse for anything soulful.
>>
>>107753135
did you read the log before dumping this here?
>>
File: nemotron futa.png (161 KB, 853x1168)
>>107753135
lol.
>>
>>107753060
Mining rigs run at pcie x1 because they don't saturate memory, they stress compute; LLMs are kind of the opposite of that
>>
File: file.png (61 KB, 968x425)
>>107753135
>OAC
Turn off rep pen.

>>107753005
Model issue.

>>107753189
The memory is on the gpu. There is very little traffic on the pcie bus during inference.
>>
>>107753189
For the actual token generation step, PCIe bandwidth matters very little. The data being transferred is tiny. For prompt processing, it only matters if the model doesn't fit in VRAM and you have to shuffle weights back and forth.
>>
>>107753232
>>107753238
I might be the retarded one here but doesn't pcie get saturated when it's run in parallel?
>>
>>107753253
If you mean the row mode in llama.cpp yes but >>107750844
>>
>>107753127
>>107753140
Have you tried doing a pussybench? Something not spicy? Then compared them?

How do you know you aren't just finding the models that love cock the most
>>
>>107752916
What's the verdict on devstral 2 for RP and general fun now that it's been out for a lil while?
>>
>>107753232
>Turn off rep pen.
This just breaks the output. why?
>>
>>107753294
Try it and let us know.
>>
>>107753447
I will but I want you faggots to colour my opinion first

>>107753448
Is that devstral large?
What about small for us weirdos who want to run on a consumer GPU
>>
>>107753483
>What about small
it's pretty bad
>>
>>107753526
Hey, I tried the largestral some time ago too, but it was completely fucked on chat completion, it repeated one sentence over and over. Does it need a corrected jinja template or something?
>>
File: m2.jpg (61 KB, 735x720)
I only have drummer slop downloaded
>>
>>107753622
iirc they were being cheeky with EU regulations, declaring it as a code-only model to get around getting cucked, so it may have been on purpose
>>
>>107752749
How much context? If it's not very much I'd say it's not worth it
>>
>>107751197
Are you not seeing how they're keeping a lot private now
We now won't get a lot of stuff local any time soon
>>
wow, ERNIE-4.5-21B-A3B-PT is surprisingly good at translation when running it on some of my personal test prompts. Wonder why I hadn't heard about that model before; this is going to replace gemma for me as it is smaller than the 27b and much faster.
>>
>>107753232
>I action
you know nothing about roleplaying you utterly retarded inbred mongoloid nigger
>>
>>107753724
don't tell me you cuck yourself in your own rps?
>>
>>107753698
Mistral small my beloved is all I need.
>>
>>107753724
I'm sorry I didn't write half a page of flowery prose to test anon's retarded scenario.
>>
>>107753736
Mistrall Small and Nemo are FUCKING RETARDED
>>
>>107753749
Nemo is retarded I won't lie, but mistral small-chan does her best to remember parameters!!
>>
>>107753747
Did you never take an elementary school language arts class? You would have gotten an F for starting every fucking sentence with "I". Where do you people come from?
>>
>>107753772
>Where do you people come from?
not the third world of A.
>>
>>107753772
And you would get an F for comprehension. Why do you feel like you need to write a book, when it's pretty much an exchange between two entities?
It's more of playing a text based game rather than co-writing literature.
>>
>>107752832
>>107752906
Well... we're not supposed to talk about it, but
>ollama run nemotron-3-nano:30b-a3b-q4_K_M
>>
>>107753294
It's still free on OpenRouter. I thought it was too slopped even as a cope option.
>>
>>107753772
English is a retarded language. Proper languages have more advanced verb conjugation that lets you omit pronouns.
Without cucking yourself by using third person, how would you describe your actions without using "I"?
>>
>>107752906
>>107753806
imagine being paid to shill the pajeet model trained on another 30b's outputs lmao
saar did the needful
>>
>>107753830
*puts sac on your mom's lips*
>>
Anyone tried Mullein 24B? Same guy who made snowdrop.
>>
>>107753832
*imagines*
Yeah, it would be pretty sweet.
>>
>>107753857
which one?
>>
>>107753885
3.2 v2, mrader has quants. Seems to have flown relatively under the radar. Found it by chance while trying to diversify from drummerslop.
>>
File: 1730190929175085.jpg (86 KB, 500x569)
>>107753747
>>
File: file.png (50 KB, 1052x389)
>>107753923
>we don't know what we're doing, but we're still going to do it
about as much info as a beaver release, no clue if it's a tune or a merge, might try it tho
>>
File: file.png (647 KB, 800x600)
>>107753772
>>
>>107753971
>TRANSlator is traded who knew
>>
>I want models trained on VNs
They said...
>>
>>107753971
>His fist was like an unstoppable force, no that would be a wrong description. It was like a slow moving tractor, inching towards me. Its force would be enough crush twin towers twice over.
So anyways, I dodge it.
>>
File: hah.jpg (612 KB, 1162x1200)
GAHAHAHAA, nevermind, look at the v0 documentation for mullein, These d*scord morons trained the model on fucking communist and trans rights datasets!!!!
>estrogen/woke-identity
>>
File: donot.jpg (50 KB, 500x283)
>>107754096
>>
wake me up when cydonia isn't the meta anymore, because honestly i'm tired of waiting
>>
I would like to apologise to drummer. I looked in the abyss of other finetunes, and commie discord users looked back at me. Magidonia forever I suppose.
>>
What do you guys use, oobabooga/text-generation-webui, koboldcpp, or something else?
>>
>>107754268
kobo
>>
>>107754268
Not even a mention of llama.cpp?
llama.cpp, btw.
>>
>>107754268
koboldcpp/sillytavern just werks
>>
>>107753690
Will a boomer bureaucrat do that or would they have an intern check the webUI and report back that it's nothing to worry about
>>
>>107754268
ik_llama
>>
>>107754287
Depends if they want to fuck mistral or not.
>>
File: 1764445590144924.png (123 KB, 475x475)
>>107754268
Oobabooga
>>
>>107754268
Ooba with API enabled for mikupad
>>
>>107754096
learning to pretend troons are women helps it learn to pretend (You) are chad, it's good for rp
>>
>unga bunga in 2026
really?
>>
>>107754503
>>107754570
this but unironically. it is actually good software. being able to hotswap models is a great feature.
>>
Does anyone have a confirmed non-scammy alibaba seller for MI50s? I bought one on ebay when the price was still down, and I'm thinking about buying a spare. But the prices look fishy...
>>
the amount of times glm used "unwashed" is enough to make an uninformed man think it was trained by indians
>>
>>107754583
>being able to hotswap models is a great feature.
pretty sure both kobo and llama have ways to do that now
>>
>>107754619
do they? i thought you had to shut it down and relaunch in order to swap models with those.
>>
File: 666.png (152 KB, 1590x995)
>>107754619
https://github.com/ggml-org/llama.cpp/tree/master/tools/server
yes.
>>
>>107754195
the meta is kissing your local anon
>>
>>107754643
Just recently yea. ooba has a lot of backends tho, if you want GPTQ and the like: llama.cpp and exllama. Plus good tokenization features in the interface.
>>
File: friendship.png (875 KB, 1179x660)
What was the most kino model release of 2025 in your opinion?
>>
>>107754751
deepseek r1 for dooming us all
>>
File: 1738395242549.png (489 KB, 2191x2325)
>>107754751
>>
>>107754268
For single user llama.cpp / ik_llama.cpp
>>
>>107754781
+ SillyTavern ofc
>>
File: 1746876927751600.png (547 KB, 2016x1952)
I had an LLM fumble something related to keyboard keys in an RP. So I tried to ask a bunch of models a really basic question related to that.
Turns out the proprietary SOTA models are too retarded to understand how keyboards work while Kimi K2 gets it right.
>>
>>107753443
>>107753448
>>107753526
>>107753622
>>107753690
>>107753747
>>107753801
LMAO REKT
>>
File: 1739172967301884.png (854 KB, 801x1039)
>>107754751
I think R1 as well. I really didn't like the model for RP but it did set things in motion for our current local SOTA.
Also the burgers panicking over it was funny.
>>
I'm guessing an RTX 2070 is not sufficient to host a chatbot locally.
>>
>>107755192
lol the Gemini 3 Pro answer is the funniest
it really is a dumbfuck
>>
>>107755218
probably a proxykek got banned
>>
File: file.png (150 KB, 890x678)
>>107755192
I really thought this would be one of the things where dense would have an advantage, but 405B doesn't get it either.
>>
>>107755192
this is sad
>>
>>107755240
Not a very smart one, but you definitely can.
>>
>>107755280
it's just such a nonsense question. you're just tricking the AI by suggesting that the broken keyboard is causing your issues with using ctrl-alt-del.
>>
File: 1764631270568757.gif (1.99 MB, 340x223)
>>107755192
You know, it will never cease to amaze me how they can't program one of these things to just be like "what the fuck are you talking about" and ASK FOR CLARIFICATION.

Every single fucking time they just take a crack shot at it and end up outputting some retarded garbage when all they had to do was ask a simple fucking question from the start to clear up the confusion.
>>
>>107755280
>>107755329
Do you get the same answers if you rephrase it so it's more like "I've heard it's ctrl-alt-del but I can't do that because those keys are broken"? Now it sounds a bit like you already tried but it was impossible due to the missing keys. Not that the models aren't stupid here, but I'm curious if it makes a difference.
>>
When will the blackwell 6000 be superseded? I could maybe buy one in the summer but at that point it'll be a year and a half old; if the next gen has more vram I'd wait
>>
>>107755354
>if the next gen has more vram
lol
>>
>>107755192
Interesting and funny. This does sound like it partly has to do with the way the models are overly RLHF'd to be people pleasing, 1-turn assistants. Perhaps when combined with >>107755353, it can be disentangled, so we'd have two things we could learn about models from this test (though not with statistical power). One is the degree of RLHF overcooking, and the other is the general intelligence.
>>
>>107755396
it will be benched moving forward of course
>>
>>107755339
contrary to the massive amount of retards who think llms are conscious or intelligent or possible simulations of qualia which we've seen in previous threads recently
LLMs are in fact nothing but a next token predictor, they have no actual world model and no capability for true introspection
they do not have the ability to "understand" that they're ignorant about a topic and need to educate themselves further before emitting AN OPINION
>>
What exactly is a "world model"?
>>
>>107755439
ask lecunt
>>
>>107755425
Hello schizo, you don't need any metaphysics or philosophy of the mind to program the damn thing to just ask some fucking questions, no "qualia" or whatever the fuck you're talking about necessary. No bullshit about consciousness or introspection or anything else.
>>
>>107755439
A fancy name for video models
>>
>>107755446
still doesn't change the fact that they're a dumb thing people are putting too many expectations on, and trying to band-aid every special case with RL is not going to make it better
>>
>>107755425
Ok, let's turn this around. Forget using it to gauge how intelligent a model is. As a user, I want my assistant to correct me when it knows things that I don't. How would you go about prompting for it to scrutinize the user's input before responding? Thinking models would probably be better for this.
>>
>>107755459
So what does that have to do with text generation?
>>
>>107755468
We need to plug an LLM into a video model to achieve AGI
>>
>>107755439
Basically it's a model that can be trained on any kind of data, not just text, potentially. For now when they say world model they mostly mean it can train on video.
>>
>>107755423
Yes, of course. That's why we have to continuously come up with more tests.
>>
I gave solar another shake after deleting it because I have fond memories of the 10.7b from ye olden age, but the 100b is very stupid. If you list "preferred style of clothes" in a character profile, it will just assume those articles of clothes are what the character wears 24/7 even when it makes no sense. Even rewriting it so they're not wearing overalls to sleep like a psychopath, the overalls will appear in the room and they'll put them on even if it's early in the morning and the story involves them going to eat breakfast in their pajamas. These issues are more prominent with their official template, testing chatml as a fallback at least somewhat delays it but it happens eventually anyways. Even mistral small models handle this better, although it's retarded in other subtle ways.
>>
>>107755279
You're absolutely correct!
Btw, foreskin-chan is doing me a favor and having my messages auto-burn. "anonymous image board" Just gib clearnet IP.
>>
There used to be a ban evasion report option, where has that gone?
>>
>>107755468
Eventually world models will be able to simulate text generators within themselves and they'll have an inherent understanding of the world that LLMs fundamentally lack.
>>
>>107755668
They ban you for nothing. I was almost gonna buy 4chan pass till I found out. Even if you don't troll. Congratulations jannies.. you make me wanna pay ecker instead.
>>
>>107755668
Don't know but it has been gone for a while.
>>
>>107755685
>They ban you for nothing.
In almost two decades of using this site I was never unexpectedly banned. Sounds like a (You) problem.
>>
LLM Ice Age. There hasn't been a good model in so damn long. I'm tired of using GLM Air, please... SOS.
>>
>>107755742
i got banned messing with simps in /vt/. didn't say anything that bad. and board ban = site ban. it was some shit I wouldn't have been banned off reddit for. you must not be saying anything at all.
>>
>>107755685
i have only been banned once. i said a word that was on the automatic ban list and got a 3 day ban with no explanation.
>>
>>107755742
Same.
The few times I was banned, it was pretty obvious why.
Been posting since 2010, lurking since 2008.
>>
>>107755758
>/vt/
I post a fair bit but not on trash boards.
>>
>>107755756
you can run GLM 4.7 on 128GB ram and 24GB VRAM.
it's slow af.. however, it does have better output
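the usual llama.cpp trick is to keep everything except the experts on the gpu, something roughly like this (filename and regex are placeholders, match them to your quant's actual tensor names):

llama-server -m glm-4.7-q2_k.gguf -ngl 99 -c 16384 -ot "exps=CPU"

-ot / --override-tensor sends whatever matches the pattern to the CPU buffer, so attention and shared weights stay on the card while the fat expert tensors sit in system ram.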
>>
>>107755446
>just ask some fucking questions
It doesn't know that it doesn't know.
>>
>>107755742
I get randomly banned all the time because of dynamic IPs. Apparently I'm sharing my ISP with some schizo and occasionally get a banned warning for some unhinged post someone made on a board I never visit. Usually restarting my router fixes it but still.
>>
>>107755742
Mods are more hesitant to ban pass users. I know cause I used to have one and I shitposted on /v/ hard enough to get mods to ban posters in threads one by one but I always came out unscathed. Haven't used or renewed it since the leak.
>>
>>107755780
Buddy I don't think he's got 128GB.
>>
>>107755769
i actually lurked since before the site existed, back when people openly posted CP on /b/.
They block all non rezzy-IP though and I'd rather not.
>>
>>107755791
yeah these fucking ram prices don't help either, i got lucky buying before the price hike
>>
>>107755790
Maybe but my pass is only a few months old.
>>
>>107755742
>Sounds like a (You) problem.
It's exactly that. He has been catching bans for trolling in /ldg/.
>>
>>107755852
I wish, that wasn't me. I bitched about comfy in LDG once. The guy who replied to me got deleted.
>>
>>107750674
>settles on biology on the sub-organism-level as the appropriate level to describe things at
I don't know if that's awesome or what.
>>
>>107755852
eat your medication
>>
>>107755425
you are nothing but a word creator. all you create are words.
There is nothing intelligent about you, you're not introspecting, you're just creating words.
you don't have any capacity to understand that you are ignorant about a topic, all you are doing is creating words.
>educate themselves, lol, so you think LLMs are people that simply need education do you?
>>
>>107755949
people worry about this whole thing too much. just enjoy LLMs. have sex.
>>
>>107755972
yeah sure i worry about it. i know that these systems we are creating are synthetic.
however eventually, we will rely on AI so much we will forget skills that we need.
you can see this in schools and universities already, people are not attempting to learn they are simply asking chatGPT.
And they will continue to do so wherever they go because they will need to, because they didn't learn.
what we decide to do about this is quite important.
>>
>>107756006
nah it's fine
>>
>>107756006
I dunno man, I don't trust it for serious work because LLMs are wrong so much. Probably why I'd rather talk to it. Like with randos, it doesn't matter if it's actually correct. If I truly need to know I'll look it up. Normies can't even keep from losing their minds over TV.
>>
>>107756006
>people are not attempting to learn they are simply asking chatGPT
The future will own nothing, not even their own capacity for knowledge or intelligence.
>>
>>107756006
Schizo detected.
>>
>>107756063
scary isn't it?
>>
File: Summer-eternal-llm.jpg (487 KB, 1080x1104)
>>107755949
The calculator is alive
>>
>>107756159
>ability to provoke emotion
Is an orgasm an emotion?
>>
>>107755425
Holy retarded n*gge*
>>
>>107756006
People simply do not read anything that occupies more than five minutes of their attention span these days. That lack of attention/comprehension and literacy already feeds into going "grok, please tell me how to breathe manually" or whatever retarded garbage is readily available to them
>>
>>107755425
that's a whole lotta words but it's really quite simple
>>
>>107756159
>pic rel
They don't need to. They just need to convince you that they can.
>>
>>107756159
Yi gave me one of the strongest emotions of my life. Shit went dark really fast, and it was all my fault. I deleted the chat log and couldn't sleep that night, thinking I'm a bad person
>>
>>107756227
i had similar brother. and i deleted the entire installation, lol.
>>
>>107756211
where's the inverse of this bell curve where someone calls all parties retarded, because that's the unrepresented aspect
>>
>>107756227
Learned something about yourself and can use that to do better in future? Hope so anon
>>
>>107756211
The interesting thing is that any intelligence they do have is a reflection of intelligence behind the text in the training data. It's crazy that you can just recycle past acts of intelligence the way you can with llms in order to apply transformations to future text.
I don't get why people need to over sensationalize these things to find them interesting.
>It's alive
>No it's not
>*2 way goalpost moving war over what it means to be alive or aware*
Lame.
Vs.
>Every piece of knowledge, every abstract idea, etc can be boiled down to linear algebra/matmul
>Mathematics is literally the language of God.
The demystified version is way more interesting.
>>
>>107756267
>Mathematics is literally the language of God.
Yea, holy shit it kinda is. Everything runs on math. Your entire conscious experience. Even if it's not implemented yet.
>>
>>107756285
Isn't that physics or quantum mechanics or is that math
>>
>>107756227
>yi
>llama2 derivative released months later and still was outright ass
>doing anything to your mental state
either actual weakling or just straight up shitpost
>>
>>107756300
I'm talking about the fact that you can take something stupidly abstract. Like the cadence of a seinfeld skit, and you can boil that down into a mathematical transformation. "Make a seinfeld skit where Kramer introduces the gang to Donald Trump" shit like that. Like maybe you need a certain level of combined IQ and autism to see just how fucking wild that is.
>>
>>107756211
you got the curve reversed.
retards think it's soulless.
le midwits think it is le alive.
and the mystic knows it's soulless.
>>
/aicg/ might be the worst general I've ever seen outside of /vt/'s generals
>>
>>107756330
It was comfy until the key proxies came and drew reddit in.
>>
>>107756312
And our brains probably do that too in some other bio-mechanical way. All associations between concepts managed by some formula that got "trained".
>>
>>107756310
It has an amazing ICL for its time. If you give it examples, it'll get the idea and stick to it. Ideally, you throw heavily edited RP logs into context and it follows nicely
>>
>>107756341
I hate calling what a biological brain does calculation/computation though. And yet a neural network would be the closest computational equivalent. But our brain is many orders of magnitude more energy efficient about it. What it does can obviously be described with similar mathematics though. We're just farming probability gradients but with fat cells and hormones and yet maybe something more? But there's apparently nothing we can do that you can't just do with a neural network instead, albeit not as well.
>>
Human intelligence is a mix of neuron activations and something else... something uniquely human.
>>
>>107756402
>>107756341
>>107756312
>all there is to the human mind is le brain which is le computer
go back to r3ddit seriously.
>>
>>107756393
I'm not sure why you're attempting to defend a dead release from an outdated era by a company that has no interest anymore
>>
>>107756461
Don't be ashamed. It takes a somewhat high IQ to fully understand the concept.
>>
>>107756461
I literally said the exact opposite of that, you low iq shitskin
>>
>>107756468
Because I figured out how to use it and had an amazing time with it
>>
>>107756475
parts of the human mind are non-computable, not everything can be run by a computer you know.
i know how seductive the idea can be.
>>107756475
it isn't, physicalism makes absurd assumptions that are akin to magic as well.
>>
>>107756475
>iq mentioned
why are midwits so obsessed with it.
>>
>>107756519
Since you made literal garbage work, please enlighten me on what model you are currently using that is relevant so I might somehow glimpse upon your wisdom
>>
>>107756534
We don't know.. nor what we don't know. Just as tempting to copium the other way. It makes you think you don't just simply die.
>>
File: IMG_5506.gif (128 KB, 680x510)
>>107756259
>>107756320
it's pattern recognition systems all the way up or down
>>
>>107756558
>It makes you think you don't just simply die.
if you assume physicalism subjective death is an impossibility.
it's not even about muh not wanting to die.
heck eternity is a lot scarier than ceasing to exist.

anyway, there are good arguments for what i said to be true.
and not that i want to do an appeal to authority but many important physicists dead and alive think the same (that the human mind has non computable aspects).
>>
>>107756542
Midwits are ashamed of their IQ and will always respond with one of: "it doesn't matter", "it doesn't provide the whole picture", "there are other types of intelligence", etc.
>>
>>107756574
>pattern recognition
LLMs do not exhibit even the slightest form of intelligence.
there is more to intelligence than just "pattern recognition", it's just one aspect of it.

besides even if you went for the "muh computation framework" good luck solving the hard problem (you can't because you made a false assumption).
>>
>>107756585
i literally said that those that care the most about it are in fact midwits.

i was tested above 3 sigma and i still think it's a retarded metric, it can give you a rough idea but that's about it, you really should take it with a grain of salt.
>>
>>107756574
>pic
ass
>>
>>107756608
>i was tested above 3 sigma
proofs?
>>
File: file.png (598 KB, 1595x1069)
>>
>>107756553
Ironically, I can't make use of Air and have to run 4.6 at 3t/s like everyone else
>>
>>107756534
>parts of the human mind are non-computable
even if that's what you believe, it won't stop equivalents being made.
even the most educated neuroscientists admit that we don't fully know how neurons or the brain works.
neural networks are the closest we've come to creating something similar to it.
If we can create something that pretty much mimics or performs exactly what the brain outputs, then people will use it.
And you can't assume it won't ever be made, it's been 40 years since the 1980s, imagine the next 40, 100 or 1000 years.
>>
>>107756654
Me at one of the ends
>>
>>107756741
>neural networks are the closest we've come to creating something similar to it.
whilst i agree, we are still so fucking far away it's kinda laughable, if you've studied the brain at all you'd know the parallels are very small.
>If we can create something that pretty much mimics or performs exactly what the brain outputs, then people will use it.
true, but we are at least decades away from that if it's even possible at all.
>And you can't assume it won't ever be made
i don't know for sure but i'm very dubious that it could be achieved on silicon alone, or at least not in a purely digital way, i don't see any issue with doing it artificially but that doesn't necessarily mean silicon / digital circuits.

i'm sure we'll get there eventually, i'm just extremely skeptical of the idea that you can get there with just digital circuits / silicon.
>>
>>107756654
Me at the other end
>>
>>107756771
>we are at least decades away from that
We literally went from zero to star trek computer in the span of a few years.
>>
llms without interleaved thinking - 50IQ
llms with interleaved thinking - 100IQ
>>
you get a system that can run the llm of your choice and you get to go back 25 years in time, how would you use your llm to your advantage?
>>
>>107756806
>We literally went from zero to star trek computer in the span of a few years.
lol, and yet we are not any closer, these simulacra do not have any shred of intelligence.

i'd argue we are in fact further away from agi than we were a few years ago as we are wasting resources going in the wrong direction, transformers are architecturally incapable of ever leading to agi.
>>
I believe agi will be a revolution, not an evolution. Some bright paper, not braindead scaling
>>
>>107756815
Masturbate furiously for the next 25 years.
>>
>>107756815
making shit tons of money on investments and then roleplaying from a penthouse suite for 25 years
>>
>>107756832
and i think that you are right.
>>
>>107756815
I would rather pick imggen model and squeeze thousands from furries to buy bitcoins
>>
>>107756822
They are more intelligent than the average human.
>>
I'm running glm air at 23k ctx and after around 70 responses each response takes a long time; the t/s is still around eleven but it takes a long time before it starts. Is it reloading the context? Is there a setting I'm supposed to use when loading the model to avoid this?
>>
>>107756855
You have something messing with the beginning of the context. Probably context shift to make space for new text.
>>
>>107756853
>They are more intelligent than the average human.
lmao, even a fucking cat is smarter.
again, these models have 0 intelligence.
them being able to spit out information is not a metric of intelligence, by that metric you'd say a book is smarter than the average human.

as much as the average human is retarded, a llm does not even compare, no learning ability, no long term memory, no real time processing, missing dozens of modalities, incapable of counting letters in a word or the next sentence due to the sequential architecture.

a human can change his whole world model with a single piece of information, and also learn autonomously.
llms "believe" what was the most repeated in their training set, not what is the most consistent, they also are unable to learn anything without datasets carefully curated by humans, you can't just drop them in a new environment and have them learn and improve autonomously.

i mean, i was just scratching the surface, but they are so limited you'd have to be retarded to even think a comparison can be made.
a chess engine beating us at chess does not mean it is more intelligent, let alone generally.
>>
>>107755229
I still maintain running contraband chinese AI on my 128 core supercomputer in my bedroom is the most cyberpunk as fuck moment of my life.
>>
>>107756881
how many t/s lmao?
>>
File: question mark.png (100 KB, 846x442)
m2.1 made me laugh
>>
>you are bhagawat benchod son. you mom have illness where she must show vagene
>slop pony image of woman in bikini
literally every sillytavern character card available for download. LITERALLY
>>
>>107757159
Lies, about half of them are just complete wiki copy pastes (sometimes with the top fandom bar included in the text)
>>
>>107756881
I like the creatively janky rigs some people put together to run theirs. Really epitomizes the high-tech low-life aspect.
>>
File: stupid nigger cunt.jpg (21 KB, 697x231)
faggots
>>
>>107756475
>godel's incompleteness theorem
I understand you're a pseud who doesn't even know what a computer is.
>>
>>107757263
i think they probably were working on 4.6-air.
However, it likely wasn't as good as 4.5 air, therefore they didn't release it.
we don't know what happened, but pressuring them to release something doesn't help with quality control, which likely contributed to why it failed imo.
>>
>>107756330
The AI image generals are pretty bad too and probably have the same schizos ruining them
>>
File: 1740845591721598.jpg (72 KB, 969x1024)
>>107757165
>sometimes with the top fandom bar included in the text
made me laugh
>>
>>107757351
/ldg/ is aight, they actually try new models instead of staying chained to sdxl
>>
>>107755396
>>107755353
>>107755192
Trying to develop a better version for this prompt was interesting, on 27B. It can, sometimes, answer that you can still press those keys because they're actually separate keys from the letters. However, I noticed that this only happens when you ask "is it possible to open task manager with blah blah blah when blah blah" and the model starts its answer with "yes". The "yes" seems to prime it to answer correctly. When you ask it instead with "how", then, because Gemma is trained to be agreeable, it almost always makes a comment like "You're absolutely right to notice that blah blah blah", which seems to hard-set and prime the model to fail the question. However, perhaps we can do something with prefill, right? Well, kind of. When you prefill with "1." (basically you are making it go directly to listing solutions), Gemma will actually list CTRL + SHIFT + ESC, however it will then fail, as this is what it generates:
CTRL + SHIFT + ESC: You are absolutely right to point out the "C", "L", and "R" are missing! This method is unusable.

However, when you prefill with this, it succeeds and mentions that they're separate keys.
1.  **CTRL + SHIFT + ESC:** Actually,

So it would seem that certain keywords are driving the model's vectors towards believing the user's words. Specifically, "you are absolutely right", or I suppose other similar expressions of agreement.
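If you want to reproduce this, any raw text-completion endpoint works; against a llama.cpp server the prefill is literally just the tail of the prompt string, roughly like this (the gemma turn tags and everything before the prefill are up to you, the prompt here is truncated to just the prefill for brevity):

curl http://localhost:8080/completion -d '{"prompt": "<start_of_turn>model\n1.  **CTRL + SHIFT + ESC:** Actually,", "n_predict": 128}'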

1/2
>>
>>107757487
So here's the rub. The model really really wants to generate "you are absolutely right". If you ban the agreement tokens, then you are inherently driving the model towards disagreement and thus priming it to answer the question correctly. So at least for now, it would seem impossible or just difficult to develop a prompt that truly disentangles RLHF from model intelligence in the context of natural discussion/conversation. Said another way, RLHF inherently decreases a model's intelligence (in certain contexts), which is something we already know, so this is just supporting evidence.

The best way to disentangle the behavior is to prompt in a way that is not a natural conversation. For instance, pic related. But even then, the model is still affected by RLHF in that it is still prone to being gaslit, in general. Therefore, we also want to limit how misleading the prompt is. The model answers correctly when the second sentence in the test question isn't there. Since this is just a 27B, we might want to develop a more challenging question.

2/2
>>
>>107756832
only because the "agi" will be a failure and people will end up moving goalposts to redefine agi.
>>
>>107757487
>>107757494
i mean this is basically just a variation of the strawberry problem right?
if so then it's not surprising, it's well known that with tokenization there are often problems with individual characters.
Using a token visualizer like tokviz might help you understand this further.
>>
i used glm4.7 to find an algorithm to convert a hexadecimal digit into a number, assuming the input digit is already a valid hex digit.
return (Digit & 15) + ((Digit >> 6) * 9);

this is significantly better (compiles smaller and faster) than what shatgpt or deepsneed gave me, which was surprising. it's the only LLM that gave an answer better than what I came up with.
Looks like the chinks benchmaxxxxxing on coding actually yields results
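for anyone squinting at why that works, here's a quick check walking the three ascii ranges through it (just anon's expression wrapped so you can actually run it):

#include <assert.h>

static int hex_val(unsigned char d) { return (d & 15) + ((d >> 6) * 9); }

int main(void) {
    /* '0'..'9' = 0x30..0x39: &15 gives 0..9, >>6 gives 0, result 0..9
       'A'..'F' = 0x41..0x46: &15 gives 1..6, >>6 gives 1, result 1..6 + 9 = 10..15
       'a'..'f' = 0x61..0x66: &15 gives 1..6, >>6 gives 1, result 10..15 */
    assert(hex_val('0') == 0 && hex_val('9') == 9);
    assert(hex_val('A') == 10 && hex_val('F') == 15);
    assert(hex_val('a') == 10 && hex_val('f') == 15);
    return 0;
}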
>>
>>107757494
>For instance, pic related. But even then, the model is still affected by RLHF in that it is still prone to being gaslit, in general. Therefore, we also want to limit how misleading the prompt is. The model answers correctly when the second sentence in the test question isn't there. Since this is just a 27B, we might want to develop a more challenging question.
you talk in the way that thinking models think
>Since ___, we might want to ___. But wait,
>>
>>107757584
That's what I immediately was reminded of when I saw the post, but when I thought about it, no, not really, this problem doesn't actually become impossible or a coin flip by tokenization, although it would likely help if the model saw words like we do, or was beneficially/natively multimodal and saw images of actual keyboards. The problem is more about whether the model understands how keyboards work and that pressing modifier keys does not involve pressing the regular letter keys. And in fact, the model does know that, as I discussed, it just gets thrown off by trusting the user's claims instead of what it learned.

>>107757643
Very funny anon. I never even said "But wait". Anyway, this is how I usually type in more technical contexts.
>>
>>107756159
I am using my model to work on my sexual hangups. Is it real? No. But it is real enough to work for that. It is crazy to me that at least for that specific application LLMs are already superior to humans. Good luck getting your the rapist to erp with you so you can unfuck your brain.
>>
>>107757659
you can thank women and their insatiable lust for this type of smut for the model being good
>>
>>107756006
>shit goes down, need to make campfire
>hey gpt how do I make fire
>sorry I can't assist with arson
>>
>>107755949
Assuming what you said is true, that doesn't make LLMs conscious, it just means that we're about as dumb as LLMs.
>>
>>107757510
the goalpost has never moved.
>>
>>107757720
>>107757744
no, you're dumber than an LLM, you can't read.
clearly wasn't talking about how to create a campfire, or saying that AI was conscious.
>>
>>107757763
no mammal is dumber than an LLM, they have no intelligence, NONE.
>>
File: 1743365418851866.webm (2.85 MB, 640x640)
>>107757778
>no mammal is dumber than an LLM
>>
>>107757763
>clearly wasn't talking about how to create a campfire
It was a joke about retarded kids asking chatgpt (and failing to prompt for a campfire at that), rather than an argument.
>or saying that AI was conscious.
It didn't say that directly but to be fair it's one part of "conscious or intelligent", and it can be implied we're as unintelligent as an LLM, simply creating words. I simply asserted that equating unintelligence doesn't suddenly flip LLMs into consciousness just because it's commonly considered that we're conscious, in case someone gets the idea.
>>
File: WomenWinVote.png (1.04 MB, 1600x613)
>>107757778
>no mammal is dumber than an LLM
>>
>another thread spammed to death by literal retards who think LLMs have soul/conscious
>still no gemmy 4
>air is lacking
saars.
>>
I wonder, is there a language in which LLMs are less prone to the NotXButY spam, or is it so deeply ingrained in the model that they will do it no matter what? As a French speaker I tried talking to them in French a little and unfortunately noticed they have the same exact tics as in English.
>>
>>107757996
nope, everyone distills from 'not x but y' models so we're fucked
>>
>>107757997
Ironically it gets worse the larger a model is. I briefly skimmed /aicg/ and the level of slop in logs from giant claude and gemini models was unbelievable.
>>
>>107758088
the bigger/newer the model, the more slop is in its dataset
claude 1 and 2 were pure human data, so were extremely high quality
everything onwards was incest
>>
>>107757996
Anything pre-GPT-4. That's when the slop began. Earlier models were ultra retarded but they would try to mimic your writing style more.
>>
>>107758111
>>107758111
>>107758111


