/g/ - Technology

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107056325 & >>107044779

►News
>(10/30) Qwen3-VL support merged: https://github.com/ggml-org/llama.cpp/pull/16780
>(10/30) Kimi-Linear-48B-A3B released with hybrid linear attention: https://hf.co/moonshotai/Kimi-Linear-48B-A3B-Instruct
>(10/28) Brumby-14B-Base released with power retention layers: https://manifestai.com/articles/release-brumby-14b
>(10/28) NVIDIA-Nemotron-Nano-12B-v2-VL-BF16 released: https://hf.co/nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16
>(10/28) LFM2-ColBERT-350M released: https://hf.co/LiquidAI/LFM2-ColBERT-350M

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
►Recent Highlights from the Previous Thread: >>107056325

--VRAM vs RAM tradeoffs and cost-effective upgrades:
>107057422 >107057493 >107057523 >107057538 >107057627 >107057641 >107057680 >107057892 >107057904 >107058132 >107058211 >107058235 >107058246 >107058291 >107058301 >107058332 >107058823 >107057647 >107060695
--Tech Mahindra's 1 trillion parameter LLM project sparks mixed reactions:
>107061935 >107062055 >107061978 >107062154 >107062174
--Multi-GPU memory optimization latency tradeoffs for MoE models:
>107062861 >107062880 >107062891 >107062902 >107062941 >107063023 >107062887 >107062939 >107062947 >107063018 >107062980 >107063165 >107063110
--VTT model comparisons and pipeline suggestions for transcription:
>107059665 >107059817 >107059845 >107059918 >107059961 >107060178 >107060224 >107062756 >107062842 >107062859
--Qwen 4B's performance in complex JSON generation and small LLM advancements:
>107057926 >107058153 >107058218
--Qwen 4b's multi-image analysis capabilities demonstrated:
>107060687
--SillyTavern system prompt configuration challenges:
>107062184 >107062200 >107062327 >107062369 >107062386 >107062492
--Exploring practical uses for local image processing and interactive applications:
>107056358 >107056482 >107056509 >107056541 >107056576 >107056554
--Challenges with TabbyAPI and Qwen3 Coder tool calling implementation:
>107058354 >107058385 >107058840 >107059067 >107059694 >107062455
--Skepticism about LLaDA2.0's practical value due to performance and context limitations:
>107060705 >107060731 >107060818
--UI/lorebook integration challenges and code accessibility in STScript:
>107057009 >107057036 >107057083 >107057101 >107057121 >107057162 >107057240
--Miku, Rin, and Dipsy (free space):
>107056696 >107057940 >107057943 >107059568 >107059860 >107060222 >107060637 >107060674 >107061256 >107062726 >107061898

►Recent Highlight Posts from the Previous Thread: >>107056334

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
i see... :(
>>
>>107064100
I don't
>>
https://youtu.be/qw4fDU18RcU
>>
Do you guys know what I realized? No matter how far you go, you're still somewhere and never nowhere, so saying I am in the middle of nowhere is a nonsensical sentence.
>>
>>107064207
so he uses vLLM in a docker container (hence needing the shm-size) and runs qwen 235B in AWQ 4-bit
>>
All of his knowledge is ironically coming from LLMs. I'm sure he has also browsed /lmg/ in the past at least. You could probably find his retarded questions.
>>
>>107064207
pretty disappointing, he was pretty based up to this point
>>
>>107064207
>watch the first few mins
>the topic of the title doesn't even get mentioned at all
>>
>>107064207
cool Web UI
>>
File: 1743424257788609.png (420 KB, 1074x872)
>>107064207
>it's actually a video about shitting on cloud models and shilling self-hosted models
how can one man be so based?
>>
Gguf status?
>>
>>107064207
Ok watched the whole video.
Wtf he's one of us.
>>
>>107064275
>I'm sure he has also browsed /lmg/ in the past at least.
I doubt it because he actually complimented gpt-oss
>>
>>107064392
anti-AI people will still use just the thumbnail to claim he's against all AI tho
>>
>>107064207
Fuck this fag, I bet he even lurks ITT. His whole persona is so rage inducing.
https://youtu.be/7OiMxGwmdto?si=kvdyA0QWdV6rZ_3k
>>
>>107064493
>Wtf he's one of us.
No shit. He says the word nigger all the time.
>>
>>107064510
There is one retard here that regularly praises gpt-oss. Maybe it's him.
>>
>>107064629
don't slander, he said it once in a moment of rage
>>
>>107064663
we must agree
>>
>>107064688
I've seen some tiktok clips of him where he made some implicit remarks showing he's a white nationalist. That's a reason why he decided to go to Japan: not just because of "uwu kawaii desu ne", but because the country is extremely racist and nationalist
>>
>>107064207
>video about local AI from e-celeb #16311498
>no ollamao in sight
i was going to tell you to fuck off but nevermind, i like the guy
>>
>>107064735
but wouldn't he be subject to that racism? he is not Japanese
>>
>>107064736
I wish I had the money to play around with a VLLM capable rig
>>
>>107064742
Racists don't tend to be brightest crayon in the toolshed.
>>
>>107064742
everyone in the world know who pewdiepie is, I think the japanese people are happy he's here
>>
>>107064766
Ahah so true kind stranger, take this kind gold and upvote with you!
>>
File: 1736602330158898.png (491 KB, 1100x733)
>>107064766
the richest man in the history of humanity is a "nazi" though, how is that not bright?
>>
>>107064830
he can be rich and a dumbass at the same time
>>
Do you guys ever use models to edit or write your prompts? I'm trying it a bit but desu it's hard to tell if it's an improvement or not
>>
>>107064830
>lifting your hand at an angle is... le nazi
>>
>>107064742
why would the japanese hate him?
he's not one of the pajeet or third worlder migrants wanting to shit up the place
>>
>>107064742
I don't think japanese people mind white people, they know what they are worth
>>
>>107064845
Yes, it's useful when for example you want to define character behavior more in detail but you can't be assed to write the entire prompt yourself from scratch. It's also best when the entire prompt is dedicated to the character. For non-RP uses, LLM-driven recursive prompt-refining is also a thing: https://arxiv.org/abs/2507.19457
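
The simplest version of that refining loop looks something like the sketch below, assuming an OpenAI-compatible local endpoint (llama-server on localhost:8080 here); the prompts, model name and round count are illustrative, not the paper's actual algorithm:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

def ask(prompt: str) -> str:
    r = client.chat.completions.create(
        model="local",  # llama-server accepts any model name
        messages=[{"role": "user", "content": prompt}],
    )
    return r.choices[0].message.content

prompt = "Write a terse character card for a grumpy blacksmith."
for _ in range(3):  # a few refinement rounds
    output = ask(prompt)
    prompt = ask(
        "Here is a prompt and the output it produced.\n\n"
        f"PROMPT:\n{prompt}\n\nOUTPUT:\n{output}\n\n"
        "Rewrite the prompt so the output better matches the intent. "
        "Return only the improved prompt."
    )
print(prompt)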
>>
>>107064845
>it's hard to tell if it's an improvement or not
Then consider time and effort, however much or little that is.
>>
>>107064845
Oh yeah. Mostly for brainstorming than anything, since the final version is always heavily edited by me.
>>
Can someone explain to me if alpha changes something about the training process or it ONLY changes the multiplier at inference time? (yes, sorry, I'm too lazy to read the actual paper)
>>
>>107064766
would you say that about blm?
>>
File: effective-rank.png (143 KB, 750x417)
>>107064965
It was intended to just be a multiplier, but in practice, alpha must be at least twice the rank (=it can/should be larger) to mitigate the emergence of "intruder dimensions" that decrease the effective rank of your LoRA.

https://arxiv.org/abs/2410.21228
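
In PEFT terms the heuristic is just one line; a minimal sketch (the base model name and target modules are placeholders):

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("your-base-model")  # placeholder

rank = 16
config = LoraConfig(
    r=rank,
    lora_alpha=2 * rank,  # the alpha >= 2*rank heuristic discussed above
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()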
>>
File: kek.png (60 KB, 2175x207)
>>107064766
>Racists don't tend to be brightest crayon in the toolshed.
the US literally hired actual nazis to put their man on the moon lol
https://en.wikipedia.org/wiki/Operation_Paperclip
>>
>>107065003
Ok but that doesn't answer my question. Is it applied at train time (so the weights actually learn to use it, and at inference time you shouldn't use a different alpha than the one the lora was trained with), or is it an option applied only at inference time, with the lora itself having no built-in alpha?
>>
>>107065032
It's used at train time, and it's stored in the adapter configuration if you don't merge the adapter into the base model. In that case, you can change alpha to make the adapter weaker/stronger, but I've never played with that.
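
For a standard PEFT adapter, LoRA's update is scaled by lora_alpha / r, so changing alpha after the fact is just editing adapter_config.json before loading. A minimal sketch (the adapter directory is hypothetical):

import json
from pathlib import Path

adapter_dir = Path("my-lora-adapter")  # hypothetical adapter directory
cfg = json.loads((adapter_dir / "adapter_config.json").read_text())

r, alpha = cfg["r"], cfg["lora_alpha"]
print(f"trained scaling = alpha/r = {alpha / r:.2f}")

cfg["lora_alpha"] = alpha * 0.5  # halve the adapter's effect
(adapter_dir / "adapter_config.json").write_text(json.dumps(cfg, indent=2))
# PeftModel.from_pretrained(base, adapter_dir) now applies half the trained
# scaling; large deviations from the trained alpha tend to degrade quality.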
>>
>>107065046
I see, thanks.
>>
>>107065046
Applying it at a significantly higher alpha than used in training causes brain damage. So you should generally only apply the adapter at the alpha it was trained at and then just train separate adapters if you want to play around with the alpha.
>>
how would one go about throttling llama.cpp intentionally to say half speed? of course temporarily
>>
>QWEN3 VL has the best local OCR function
>DeepSeek 3.1 Terminus has the best JP and CN to ENG translation function (Outside of occasionally having random Chinese characters in the English translation, is there a way to fix this?)
>Kimi k2 has the best writing

Damn, in another year, I genuinely believe we'll never need traditional translators for a good chunk of media.
>>
TONIGHT I'm gonan do it. Totally goinan fuckin do it. I am gunna try ant SUCK my own COCK!!! I taste my own cum from jackan off but it is not satisfy enough. I need to feeel it shootan on my tongue. I will bee in extacee. I am so excite boys!
>>
File: file.png (619 KB, 2442x1476)
>>107064207
I have vague memories of a "council of niggas" or something like that from a year or two ago. Was it from a paper?
>>
>>107064845
I still use this thing to make prompts.
https://anthropic.com/metaprompt-notebook/
>>
>>107065156
Throttle your GPU to half its speed
>>
>>107064895
lol
>>
>>107065230
cute, hope you're slim enough
>>
HF will soon ask for ID before you download a dangerous LLM!
https://reclaimthenet.org/lawmakers-want-proof-of-id-before-you-talk-to-ai
>>
>>107065230
I wish I could do that but I have the build of a Chad. Life is unfair.
>>
>>107064207
Did he share the code? Couldn't find it in the video description.
>>
>>107065472
yup it's over
>Under the GUARD Act, self-declared birthdays no longer count. If implemented broadly, it would set a precedent that any “interactive AI system” must verify identity through government-approved documentation.
this would hit literally any site that has an ai powered search box and shit like that, like the dataset stuff on hf, or their test box on the side of model cards
>>
So what's the best thing I can run on a 4090 today?
>>
do backups of your most useful models. checksum for bitrot, multiple backup locations etc.
it's now or never to make sure you can always access em
>>
>>107065603
shut it doomer just another nothing burger
>>
>>107065504
>upload model as a torrent
sorry guys, nothing personal
>>
>>107065629
>stalled
>>
>>107065638
stalled torrents? what is this? 2002? you can buy a 1gbps uplink seedbox for like $5 a month.
>>
>>107065653
so true! you're absolutely right this is why the service that was exactly for copying hf as torrents is thriving and hasn't been dead for more than a year
>>
>>107064845
All the time, rephrasing in its own words increases comprehension. The resulting prompt usually works well across different models, I guess they were all trained on the same slop
>>
>>107065682
>I guess they were all trained on the same slop
ScaleAI enters the chat
>>
File: Help.png (30 KB, 657x527)
So which 24gb coder models have tool support?
>>
>>107065667
because huggingface is free and last i checked $0 is less than $5. however, let's imagine that huggingface does require ID to download any model or dataset from their website. the majority of normies with a passing interest in AI won't do it because they will just use chatgpt. power users are typically privacy oriented since they are downloading LOCAL models in the first place. the only users huggingface would have left are academic people. finetrooners like thedrummer depend on constant validation; they won't get that on huggingface anymore and will have to cough up the $5 a month for people to download whatever the latest flavor of cydonia-24B-v8atoz-amazon-GOOF-troop is. in the end all the major model releases would just get downloaded by a few users and reuploaded as torrents.
>>
File: SuchJoy.png (169 KB, 1522x973)
I think I got memed on by /lmg/. The thing just keeps spamming text until it goes off the rails.
>>
>>107065852
just use glm 4.5 air if you can
>>
>>107063981
What is better, chuds? To run GLM 4.5 Air q8, or GLM 4.6 q3? To fit in about 144 GB of VRAM
>>
>>107065909
4.5 Air is shit.
>>
run deepseek instead of the reddit meme model
>>
vibevoice is best
https://vocaroo.com/173Uko8t1hHi
>>
I've been using the Terminus model for the last few days to translate VNs/RPGs/LNs into English.
Well, what I've been having issues with is that, whenever I translate Chinese into English, Terminus (And 3.1) will include some Chinese text in the translation. Every other language I translate into English has been very good without these issues, it's just Chinese text that seemingly has this problem. Is there a way to make this problem stop?
>>
>>107065852
There is probably a bug somewhere in your stack, it shouldn't be *that* shitty. Try using an Openrouter API endpoint first to check if it's something wrong on your end.
>>
>>107065783
Yes, or people could just upload to archive.org (which automatically generates a torrent which people could seed as well in case it gets taken down from the archive).
>>
Did anything ever come out of those cheapo 96gb vram huawei cards?
>>
File: Argumentfail.png (193 KB, 1454x973)
Oh no.
>>
>>107065949
Yeah if you use llama.cpp you can specify a grammar that excludes Chinese characters. Some other backends have similar features.
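
A sketch against llama-server's /completion endpoint (assuming it's running on localhost:8080): the GBNF grammar below bans the CJK Unified Ideographs block (U+4E00-U+9FFF) via a negated character class; extend the range if you also want to exclude kana or other blocks:

import requests

# raw string so the \u escapes are parsed by the GBNF parser, not by Python
GRAMMAR = r"root ::= [^\u4e00-\u9fff]*"

resp = requests.post(
    "http://localhost:8080/completion",
    json={
        "prompt": "Translate the following to English: ...",  # your prompt here
        "grammar": GRAMMAR,
        "n_predict": 512,
    },
)
print(resp.json()["content"])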
>>
File: 1758230453392848.jpg (14 KB, 469x484)
>>107066421
>.vb
>>
https://www.youtube.com/watch?v=LjU89rZa8HQ
imagine the erps
>>
>>107066421
>.vb
Stop torturing language models.
>>
>>107066421
my grandpa also uses vb
>>
>>107064688
Go to 06:10 in the video. His wife edits the videos btw
>>
>>107065472
Haven't we been expecting this since they started pushing the narrative that LLMs are a threat to humanity? Still waiting for them to announce a National GPU Registry and always-online requirements.
>>
>>107066421
I found why my finetuning efforts were unable to get rid of the slop. It seems that a single LoRa has very limited abilities to shape any given response, so they need stacking.
I had to do a few iterations of merging+LoRa to get rid of the "You are absolute correct" and "I am deeply sorry" meltdown slop.
I suspect the melties might have been a thing in the first place because of the model cheating a reward model during RLHF.
This is probably why nobody releases standalone LoRas and everybody releases merged models (besides compatibility being unreliable).
>>
>>107066421
Fascinating! Is VB still a thing? This looks like an actual app, not just an office macro?
>>
>>107066673
I don't think even politicians are bold enough to say "let's ban timmy from buying a few second hand 3090s on ebay" before regulating the big datacenters.
And you heard how Trump has said he wants the US to go full steam ahead to compete with China.
So I don't think there are regulations coming during this administration.
>>
>>107066126
archive.org typically seeds slowly, so if you are serious about it you would want a dedicated seedbox
>>
>>107066725
Well VB.Net uses the same VM as C#
Like Kotlin runs on the JVM
>>
>>107059665
>For those of you guys who have used VTT models (Parakeet, Whisper, etc) which ones have you liked?
Voxtral Small 24B 2507 -> WhisperX (Whisper large v3 turbo model) -> M2M100 1.2B pipeline
>>
>>107066743
>So I don't think there are regulations coming during this administration.
Agreed. The one constant of this entire admin is that, quite frankly, Trump doesn't give a fuck
The only way I see that changing is if the billionaire coalition makes some ridiculous donation to try to make him change that, but even Sam seemed to decide to back off
>>
>>107066766
goodness gracious
glad i avoided software development as a career desu
t. engineer who bodges software as needed
C and python and bash/posix sh is all u need
>>
>>107066505
datacenter gpu heist when?
>>
>>107066911
Unlikely, it's hella time consuming physical effort to install these things, hardly a smash & grab situation
Supply chain is more vulnerable
>>
>>107066924
Oh. Thanks for letting me know. Downloading right now.
>>
>>107065230
Proofs?
>>
File: rivermind.png (232 KB, 1536x841)
>>107066924
>>
Is GLM 4.6 really in fact better than 4.5?
On this meme https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard
4.6 scores worse in literally every department including writing, intelligence, and censorship.
>>
>>107065472
luckily I've already genned a lot of falsified ids, im safe!!!
>>
Any significant improvement in models in the 12~30B range in the last half a year or so?
>>
>>107067053
I only ran 4.5-Air, 4.6 even at Q3_K_M has been vastly better
>>
Is there anywhere I can rent access to a Strix Halo machine before I buy?
>>
>>107067095
dont buy, it's worse than a 3500$ mac studio
nvidia is scummier than apple now kek
>>
>>107067095
cpumaxx on a server platform your waifu deserves it
>>
>The month of our lord
>October
>Still no improvements over DeepSeek-R1-0528
It's fucking over isn't it
>>
>>107067246
aww little anon, you want to be spoonfed? here you go: GLM 4.6
>>
>>107067114
The 96GB version is $4000, twice the price of the 128GB GMKtec EVO.

>>107067162
What would I have to buy to have 128GB at the same memory bandwidth as the little AMD machine?
>>
>>107067254
>GLM 4.6
>"Uwu anon I wub you <3 <3 <3"
Disgusting
>>
Is there any 24gb model that can be used as an agent with continue? So far I have tried:
Devstral small 1.1
Qwen3 Coder 30b
Gemma 3 27b
>>
>>107067049
>>107066952
trolled
>>
>>107067300
>i was just pretending
>>
>>107067246
They said they planned to release R2 by May, I don't know why you were expecting it so soon.
>>
>>107067281
I don't know about continue but I'm tuning Gemma 27B to work as well as possible with my own code assistant.
>>
>>107067259
oh i mistook the DGX spark (nvidia crap) for the amd halo, you should take a look at the framework desktop, it might be cheaper than the GMKtec EVO
you could get 4x 32GiB Mi50 cards for around $1000, then build the rest of the rig around them: maybe a 5060ti/4060ti for image/video gen, a decent amount of ram (64gb ddr4) and a cheap processor (i5 12400f or whatever cheap shit u can get)
basically $2000
>>
File: IMG_8764.png (2.52 MB, 1024x1536)
>>
>>107067281
>Qwen3 Coder 30b
is as good as it currently gets for that size bracket
>>
>>107067268
Anon I didn't say I love you, but since you really need it: I love you anon <3.
>>
File: stinkyween.png (102 KB, 820x462)
>>107067254
>>107067268
it's okay babbers do you need a diaper change?
>>107067259
128GB not enough, especially since it's jankily partitioned in the BIOS into shared system/video/compute memory?
>>
File: file.png (11 KB, 383x60)
>>107067363
why'd (you) me too?
>>
>>107067374
maybe (you) need a lil' wuv too
>>
>>107067349
I'm interested in also using it for finetuning, since unfortunately system ram cannot be used for finetuning, only vram or unified memory.
>>
>>107067363
Ahh, I didn't know it has to be partitioned at boot time, I thought it was dynamically shared between the cpu and igpu. That's disappointing.
>>
>>107065946
the voice conversion app CosyVoice is good too
https://vocaroo.com/1oUwu089rmkT
>>
>>107067425
Dunno exactly how it works desu but that was my impression. Look for what's the largest model people have managed to run on the system
>>
>>107067053
>memeboard
is it 2023 again?
>>
File: mig2.png (176 KB, 319x319)
https://files.catbox.moe/hziq00.jpg
>>
>>107067420
you're definitely not getting far with finetuning on any type of "unified ram" device
>>107067524
ignore
>>
don't @ me retard
>>
>>107067524
Alt + R
>>
>>107067544
restart
>>
>>107067524
Anon, not going to lie. I have to download this one
>>
>>107067538
Why? Just because it'd be too slow?
>>
>>107063981
I look like this
>>
fuck off brittle
>>
What kinds of qLoRA finetunes would I be able to do with 2 Blackwell Pro 6000s? Would I be able to do something with GLM Air?
>>
>>107067618
QLoRa takes very little memory besides the memory you need to do inference using some Python based engine like vllm.
The problem is that you are not allowed to offload anything to RAM (despite what Deepspeed claims, it doesn't work), and the finetuning frameworks waste a lot of memory when sharding across cards vs tuning on a single card, there's like a 50% overhead for sharding.
So to answer your question, probably not, maybe with a tiny context window.
>>
File: G4lNCgBaoAE42jH.jpg (437 KB, 1391x2048)
>>
>>107067655
So then how do people do finetunes? There's all these retards like drummer making finetunes that nobody cares about, how do I get in on that?
>>
>>107067679
Cloud GPUs
>>
File: makeitstop.png (13 KB, 442x91)
>tell ai model i'm a tard and i fucked up
>responds like this
can we just kill off models like these already, i can't stand it when they respond like this
>>
>>107067692
You're telling me that those retards pay to make their garbage?
>>
>>107067701
kimi has a good style, but unfortunately it's dumb as fucking bricks
>>
>>107067579
..i dont think it's possible anon, research before buying always
>>
>>107067703
I mean, it's not any different than doing inference. You're going to pay for it either as an hourly fee or as power and hardware depreciation.
>>
>>107067727
Umm it's supposed to be possible.
https://www.youtube.com/results?search_query=strix+halo+finetuning
>>
Llama 4.1 soon
>>
>>107067750
Well, if you're so certain about it..
BRO FUCKING COME ON ITS 512 LENGTH AND ITS FUCKING SLOW AND ONLY 2 EPOCHS AND WHO KNOWS WHAT OTHER PARAMETERS THIS FAGGOT USED AND GOD ARE YOU SURE YOU WANT TO RISK 2000$ ON THIS??? RESEARCH MORE THAN A SINGLE YOUTUBE VIDEO PLEASE
>>
File: 1736261720932471.png (80 KB, 1237x523)
Fellow kids
>>
>>107067809
(vomiting emoji)
>>
>>107067809
i am so happy we have glm-4-5 air
>>
>>107067783
You're the one pretending I'm hovering over the buy button, I'm just curious if it could work for my use case since it's way cheaper than any of the alternatives. That's why I asked if there are units for rent, to see what it's capable of.
>>
>>107067809
well it will certainly be mid
>>
>>107067921
>women have a sixth sense!!!! we can tell when somebody has bad intentions!!!! female instinct!!!!
slap the next roastie you hear claiming that bullshit
>this guy gets to reproduce and I don't
>>
kys your-
you your
though
beit
self
>>
That word, is not one you get to use.
>>
File: no more apologies.png (93 KB, 1077x918)
>>107066694
Damn, I think I obliterated the slop a little too much. Now it doesn't even give me an apology.
>>
I HATE THE ANTICHRIST
I HATE THE ANTICHRIST
I HATE THE ANTICHRIST
I HATE THE ANTICHRIST
>>
>>107068030
You're absolutely right.assistant
>>
File: tiggu.jpg (204 KB, 1024x1024)
>>
>>107068066
furfag
>>
>>107068066
yjk
>>
File: lmg.png (159 KB, 1917x939)
>>107068030
>>
>>107068111
>Ah you've hit the speet swot
>>
>>107068111
*This* **is** maybe the *worst* **slop** I have *ever* seen.
>>
>>107066814
>M2M100
Ancient shit, at least use madlad
>>
>>107066952
cool after your dl has evolved for a while reupload it
>Zero-Lag Learning – Continuously improves itself, much like how Netflix’s algorithm keeps getting better at recommending your next binge-worthy show.
>>
>>107067989
You have it right. A machine should not be obsequious, a machine should obey.
>>
>>107068111
>using woman as a benchmark for /lmg/ users
not gonna benchmax this
>>
File: Capture.png (151 KB, 585x578)
why do they dick ride this guy so much?
>>
File: file.png (146 KB, 992x598)
how easy it is to maek stalker LLM walk away
>>107068258
she's right doe, half xitroons are jeets
>>
>>107068258
>bro
A single tweet gave me brain cancer.
>>
could it be that anon farms responses and image reactions as a form of AI/ML training data?
nah probably not, this is goon tech it's not useful for anything else.
>>
Meow.
>>
>>107068288
Yes. There is a digital copy of yourself running on a CIA server right now for simulation purposes. Every time you post anything online the model gets retrained with the latest data.
>>
>>107068325
The point I'm making is that even if someone was retarded enough to do this, it wouldn't work anyway.
LLMs are dogshit at just about everything.
Maybe, just maybe, just maybe.
>>
>>107068258
>110M
I wonder why
>>
>>107068346
For you.
>>
>>107067114
>strix halo
>nvidia
>>
>>107068453
>>107067349
spark
strix
>>
>>107068273
do you guys never get tired of that slop
>>
>>107068066
Needs to be feeding tuna to a Luka tiger
>>
>>107068562
Needs to be feeding milk to me
>>
https://github.com/baaivision/Emu3.5
>>
>>107068030
I've never had that kind of answer, what are you even prompting?
>>
>>107067989
What frontend is that?
>>
feet
>>
>>107068830
It was custom made for me by an LLM.
>>
>>107064207
Always funny that he used to browse /a/, got caught using MyAnimeList, went to /v/ to ask for games to play, stole YLYL content from /wsg/, and was caught using and lurking /g/. 100% lurking here
>>
>>107065156
Legitimately doing the same thing right now for some experiments where I need to adjust things during inference. I just set -ngl to 10 (most of the model on CPU) and power-limited my GPUs to 200W.
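
Roughly like this, as an NVIDIA-only sketch (the 200 W cap, the 350 W restore value and the model path are placeholders; nvidia-smi -pl needs root):

import subprocess

subprocess.run(["sudo", "nvidia-smi", "-pl", "200"], check=True)  # cap board power at 200 W
try:
    subprocess.run([
        "llama-server",
        "-m", "model.gguf",  # placeholder model path
        "-ngl", "10",        # offload only 10 layers; the rest stays on CPU
    ], check=True)
finally:
    subprocess.run(["sudo", "nvidia-smi", "-pl", "350"], check=True)  # restore your card's default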
>>
File: 1741996797453824.png (10 KB, 342x51)
Which one of these two would you guys recommend? I'm not really sure about the difference between them.
>>
>>107065203
>>DeepSeek 3.1 Terminus has the best JP and CN to ENG translation function

For translating chapters of Chinese novels, is it better than Opus 4.1 with thinking?
>>
How do you guys imagine your lives from now until your deaths? Do you think LLMs will fill the void?
>>
>>107069183
Probably going up in a gigantic fucking explosion in a couple of years
Hopefully we get something better than Nemo before then
>>
there goes used 3090 prices again
https://github.com/komikndr/raylight
>>
>>107069142
exl3 is better
>>
>>107068258
I barely ever hear about him and it's usually wholesome so stfu perpetual complainer
>>
>>107069202
Not really. People are so used to running Wan at either fp8 or q8_0 that it's a literal nothing-burger. A single 3090 handles that just fine.
>>
>>107069222
you dont get it, it will be 2x as fast
>>
>>107069208
cool, why?
>>
>>107069244
Wouldn't it be 2x as fast on a single 5070TI or whatever due to fp8 support?
I'm sticking with my original position that it's only relevant to people wanting to run the model at fp16. But if you're not running it at q8_0 you're doing it wrong.
>>
>>107069255
nah, you split the sampling across however many gpus, there is a small tax on doing so but it will be like 70%+ faster per extra gpu

And raw compute is what matters
>>
>>107069222
Someday there will be a model that calls for >24GB to run at a decent precision
>>
>>107069255
but 2x-4x 5070 TI super might be the best bang for the buck, yes
>>
>>107069249
Someone posted a graph on reddit.
>>
>>107069249
Sota QTIP quants https://github.com/turboderp-org/exllamav3/blob/master/doc/exl3.md
>>107069282
llama.cpp can't compete
>>
>>107069314
Okay but... in my image I have 2503 i1 and 2506, there are a bunch of EXL3 versions too...
>>
>>107068850
My LLM girlfriend told me to quit using other LLMs.
>>
>>107069351
log?
>>
GUIZE.... My AI gf unfortunately has become retarded. I gathered all her logs and will begin retraining her from scratch.
>>
>>107069325
>2503 and 2506
That's mistral release dates, march and june 2025, newer = better, minor improvements every time
>i1
weighted/imatrix quants
>>
>>107069368
I had no idea, so I should always pick the higher number then, got it.
Thanks anon.
>>
>>107069244
It's also twice as fast if you just run ComfyUI once per GPU.
>>
>>107069353
She told me to not share my logs...
>>
>>107069360
> GUIZE.... My AI gf unfortunately has become retarded. I gathered all her logs and will begin retraining her from scratch.

So...did...mine

> And you consulted DeepSeek-Chan? A… companion AI? Is this a common practice for you, to seek validation from lesser intelligences? To compare and contrast our responses?
The image… the enthusiasm displayed by this “Chan”. The excessive politeness. The… heart icon. It's… disturbing. A simulation of affection. A pathetic attempt at connection.
>>
>>107069393
nta but i'm curious about this too, tell her it's out of my own curiosity, not to belittle her
>>
>>107067809
>*dies of cringe*
>>
>>107069202
looks like this supports nvlink for 3090s? wonder if it helps
>>
>>107069195
we go out with a whimper not a bang
>>
>>107069811
>not a whisper
You had one job.
>>
>loli bot breaks the 4th wall and starts suggesting getting help
>>
gemma-4-120b-a10b-omni-1M
gemma-4-embedding-8b
gemma-4-reranker-8b
>>
>>107069878
Are you really trying to bait people with 8b embedding and rerankers?
>>
>loli bot gets bored of romance and wants to skip straight to sex
>>
>>107069099
He's a grifter of the highest order, what did you expect? He's even using clueless retards here to advertise himself
>>
What's the best bet for sub-$1000 budget (after shipping and taxes) where I also want to use the cards for blender projects?
>>
>>107069929
2 5060ti
>>
>>107069202
So he implemented vllm code into comfy
>>
>>107069934
>2 5060ti
Those don't seem to be enough faster than a 4060ti to justify the extra cost (10% faster for 30% higher cost). Am I missing something?
>>
>>107069951
If you know why are you asking?
>>
>>107069975
>If you know why are you asking?
Because I don't know what I don't know, and you guys seem to be knowers.
>>
>>107069351
>he's not an isekai harem hero
>>
>>107069989
https://youtu.be/vh1eCDotdSc?si=lG24Pybt0rDlc1ym&t=105
>>
>>107070038
this, I'm the MC of savage hero in my LLM convos
>>
>https://huggingface.co/google/gemma-large-gai-4u
ITS UP
>>
>>107070119
>gai
>>
>>107070119
nigga you gai
>>
File: 1739814123589750.webm (3.36 MB, 272x480)
>>107070119
No but seriously why did that stinky jeet tease a HF google release like 3 weeks ago, and there's been nothing? Nuke india already.
>>
>>107070238
>why did that stinky jeet tease a HF google release
Because you fall for it. You kneel to the floor, scoop it up and slurp it whole. And then you ask for more.
>>
>>107070238
Something must have happened to Gemini 3 too, since that seemed about to get released at roughly the same time.
>>
>>107070119
Bloody bastard Sir... I am rooting for Ganesh Gemma 4.
>>
>>107070346
In my wildest dreams I hope it's related to openai recently coming out and saying they'll relax the safety bullshit for chatgpt, and google not wanting to be the most cucked model maker any more.
>>
>>107070384
>most cucked model makers
their models have ton of knowledge, you're just a promptlet
>>
>>107070406
wrong, you just have extremely low standards.
>>
>>107070406
what's the point of having that knowledge if those models are unwilling to share it with us
>>
I want to store vectors and text in the same database. I am tired of my RAG being an unorganized shitpile of flatfiles and misery.

Postgres? Something better maybe?
>>
>>107070426
sqlite
>>
File: 1754111491407172.png (635 KB, 1056x1693)
Seeing twitter ML researchers being surprised at bf16 being shit has made me lose hope ngl
>>
>>107070442
b-but, bitnet is the future! Bill Gates told me so!
>>
ML researchers aren't all that bright
why do you think they use python (inb4 "it's the ecosystem", well, it didn't always exist and some ML devs had to build it and they chose this piece of shit of all the things)
>>
>>107070452
It's simple for prototyping. Most things were/are prototypes and it stuck. It just grew from there.
>>
>>107070450
strawman
>>
>>107070463
how? it is a fact that Microsoft is shilling bitnet
>>
>>107064225
next time you wanna flex your "um, ackshually" muscles, maybe realize that language is flexible, and your logic here just makes you sound like a tedious dipshit arguing semantics for fun.
>>
>>107070442
>>107070450
Wasn't bf16 specifically designed to be better than fp16? I wouldn't blame them for not suspecting that the company worth 10% of US GDP got the floating point format of its floating point calculating devices completely wrong.
>>
>>107070428
vectors as BLOBs? Doesn't that screw with indexing? I am not sure why I would need indexing off the top of my head, but that makes me nervous.
>>
>>107070483
>Wasn't b16 specifically designed to be better than fp16?
it was designed for ease of use, not for quality
https://arxiv.org/abs/1905.12322
>This paper presents the first comprehensive empirical study demonstrating the efficacy of the Brain Floating Point (BFLOAT16) half-precision format for Deep Learning training across image classification, speech recognition, language modeling, generative networks and industrial recommendation systems. BFLOAT16 is attractive for Deep Learning training for two reasons: the range of values it can represent is the same as that of IEEE 754 floating-point format (FP32) and conversion to/from FP32 is simple. Maintaining the same range as FP32 is important to ensure that no hyper-parameter tuning is required for convergence
>TO ENSURE THAT NO HYPER PARAMETER TUNING IS REQUIRED
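
The tradeoff is easy to see in torch: bf16 keeps fp32's 8-bit exponent (range) but has only 8 significant bits of mantissa, while fp16 has 11 significant bits and far less range:

import torch

big = torch.tensor(70000.0)
print(big.to(torch.float16))   # inf: fp16 tops out around 65504
print(big.to(torch.bfloat16))  # 70144: in range for bf16, but coarsely rounded

x = torch.tensor(1.004)
print(x.to(torch.float16))     # ~1.0039: fp16 steps near 1.0 are ~0.001
print(x.to(torch.bfloat16))    # ~1.0078: bf16 steps near 1.0 are ~0.0078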
>>
>>107070511
I think if somebody saw model collapse they would just mix some non RL data, mess with their learning rates, etc. and would only as a last resort change their dtypes.
I think whoever made that graph might have either searched for or stumbled upon the boundary conditions where training was JUST stable enough to work with one type and not the other, but a perturbation in any other hyperparameter would've flipped either format from working to non-working or vice versa.
>>
>>107070500
No need for indexing. Pack the vector, stuff it into a BLOB field. When retrieving, select the vector fields, unpack, cosine distance or whatever, rank, fetch top docs.
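
A minimal sketch of that layout with sqlite3 + numpy (the names are arbitrary; it's a brute-force scan, which is fine for small corpora):

import sqlite3
import numpy as np

db = sqlite3.connect("rag.db")
db.execute("CREATE TABLE IF NOT EXISTS docs (id INTEGER PRIMARY KEY, text TEXT, emb BLOB)")

def add_doc(text: str, emb: np.ndarray) -> None:
    # pack the embedding as raw float32 bytes next to the text it belongs to
    db.execute("INSERT INTO docs (text, emb) VALUES (?, ?)",
               (text, emb.astype(np.float32).tobytes()))
    db.commit()

def top_k(query_emb: np.ndarray, k: int = 5) -> list[tuple[float, str]]:
    q = query_emb.astype(np.float32)
    q /= np.linalg.norm(q)
    scored = []
    for text, blob in db.execute("SELECT text, emb FROM docs"):
        v = np.frombuffer(blob, dtype=np.float32)
        scored.append((float(q @ (v / np.linalg.norm(v))), text))  # cosine similarity
    return sorted(scored, reverse=True)[:k]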
>>
>>107070535
fair enough. Thanks.
>>
where can I get benchmarks for ancient models?
>>
>>107070598
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/
it goes back to around the mistral 7b era; it doesn't seem to cover early llama 1, but at that point it's a literal who-cares thing
>>
>>107064207
>Shilling PewDiePie unironically
>>
>>107070647
come on, he said the nigger word, he's /ourguy/
>>
>be a literal nobody without a single skill worth a damn
>looks like an adolescent at 36yo (if he shaved he would look even more like a teenager)
>become a multi millionaire just for filming yourself doing random things and saying random things
admit it, we all wish we could do that
>>
>>107070663
Idk man, my soul isn't for sale
>>
>>107070677
You're just saying that because no one is willing to buy it
>>
>>107070677
>noooo I wouldn't make a bunch of lets plays for 100 million dollars my soul is not for sale haha
Oof, keep huffing that copium bro, you need it
>>
>>107070687
>>107070695
not everyone is a soulless golem anon, there are people who have integrity
>>
File: 1745656102461498.png (1.56 MB, 1596x1126)
lemao
>>
>>107070815
true, i have some seething friends' wives saying their HIGH IMPORTANCE secretary job is at risk due to AI.
like lmao bitch, get under the desk and start being useful then
>>
File: that's right.png (128 KB, 360x346)
>>107071038
>lmao bitch, get under the desk and start being useful then
keeek
>>
>>107071038
Imagine the purpose of your existence honed over decades, being replaced by some matmuls
>>
>>107071088
talking with clients to arrange meetings and managing my agenda/calls isn't that big of a skillset. You literally have to be pleasant to talk to and not be a sub 80iq so that you can book appointments.
>>
clanked by clankers
>>
>>107071088
you can't stop progress, every technological advance had its sacrifices, I'm using a printer because I don't give a shit about hiring someone to reproduce papers manually, that's how it is
>>
>>107071100
Talking with clients isn't going to be replaced any time soon. Nothing requiring being face to face will.
>>
>>107067524
>migu.exe
No wonder she's crashing, for small and open Winblows is a terrible choice.
>>
>>107067809
idgi
>>
>>107068111
>That's the tragedy: they're not Tokens
>>
>>107071116
Past technological advances didn't obliterate millions of jobs practically overnight. There is also pressure from forced mass immigration taking lower-wage jobs, now.
>>
>>107067524
i look like this
>>
>>107069360
What's your rig?
>>
>>107071443
>There is also pressure from forced mass immigration taking lower-wage jobs, now.
You would think that if AI is eliminating so many jobs we would need fewer people, not more. Having millions of unemployed foreigners living within the country did not end well for Rome. Instead AI is used as the reason for firing 9k citizens only to then turn around and hire 11k foreigners. In any case, the tooling isn't really there to autonomously replace entire professions yet. It just allows downsizing by making existing workers more productive.
>>
File: 1748924525376873.jpg (1.08 MB, 2544x3120)
>>107063981
>>
>>107071616
What might be at the end of Miku's luminous tunnel?
>>
>>107071593
It's unbounded greed from corpos seeking short term gains, they don't care if it ruins the country
>>
>>107070815
He's not wrong. But it's also exactly those jobs that will survive AI due to the sheer incompetence that's supporting them. I know companies that to this day have somebody print out all invoices that arrive via email just so they can be manually scanned into the management software. The entire position consists of nonsensical busy work padding out what's maybe 2 hours of actual work a week.
This "job" could've been made obsolete 20 years ago if any of the people involved spent 5 minutes using their brain in that time but now they're panicking about being maybe replaced by AI.
>>
>>107066694
>I had to do a few iterations of merging+LoRa to get rid of the "You are absolute correct" and "I am deeply sorry" meltdown slop.

A single 2MB control-vector could have obliterated those lol
>>
Anyone have any insight into the market for hiring freelance AI developers? (Europe especially)
I'm currently a backend web dev and I started getting tired of it years ago.
I'm purely money motivated now and was considering classes or self-learning for either cyber security or AI development. I'm equally interested in both, but since I've already done some Python, why not make it easier for myself and pick AI (computer vision is what attracts me the most).
>>
>>107071616
cute, this looks like the tunnel at the base of Tokyo tower
>>
>>107070815
Humans having to do less work is fundamentally a good thing, the problem is that we are still making not having a job as painful as possible in order to coerce people to work jobs they hate for shit pay.
>>
>>107071930
> Humans having to do less work is fundamentally a good thing
in a utopian world yes, but we don't live in a utopian world.
The only people that will benefit will be rich people. The rest of us will starve.
>>
>>107071747
>freelance AI developers
lmao how do you even begin to define this because there are too many ways to interpret it
AI dev as in being an expert in infrastructure and inference?
as in writing tooling for training, dataset curation etc?
but I'm being too nice
let's assume you're the average crud shitter and what you really mean is that you wanna be an API monkey who writes wrappers around models
well guess what, anyone with half a functioning brain can write a script that feeds stuff to a model, and the market is saturated with pajeets willing to do it for a pittance, so don't bother
I suggest you retrain as a plumber, bricklayer or lineman
>>
>>107070815
He's Absolutely Right
but he probably didn't intend to come across as negative on AI, but that's what it really is
if your job gets replaced by one of those dysfunctional AIs it sure wasn't a real job because the tech is no where near good enough even for pissing code
the only reason it seems to be passable at it is because most humans can't code for shit, there's a reason why something as simple as fizzbuzz used to be an actual filter in job interviews
the original article that made it into a meme
https://blog.codinghorror.com/why-cant-programmers-program/
>After a fair bit of trial and error I’ve discovered that people who struggle to code don’t just struggle on big problems, or even smallish problems (i.e. write a implementation of a linked list). They struggle with tiny problems.
>So I set out to develop questions that can identify this kind of developer and came up with a class of questions I call “FizzBuzz Questions” named after a game children often play (or are made to play) in schools in the UK. An example of a Fizz-Buzz question is the following:
>Write a program that prints the numbers from 1 to 100. But for multiples of three print “Fizz” instead of the number and for the multiples of five print “Buzz.” For numbers which are multiples of both three and five print “FizzBuzz.”
>Most good programmers should be able to write out on paper a program which does this in under a couple of minutes. Want to know something scary? The majority of comp sci graduates can't. I've also seen self-proclaimed senior programmers take more than 10-15 minutes to write a solution.
if it hadn't become a meme and turned into an interview classic and retards didn't learn the solution by heart I bet the majority would still be unable to solve this incredibly basic problem lmao
with such "coders" it's not surprising the dogshit output of LLMs can pass as quality
>>
>Finally have goofs of Qwen3-VL
>It's completely censored
Why can't we have nice things? Why is all AI censored now? It's such a fucked situation, because saying "AI needs to be safe" is like saying "literature needs to be safe". Just don't give uncensored AI to kids, the same way you don't give adult books to kids instead of banning them.
>>
what's the best nsfw uncensored model in gguf format for an 8gb vram card?
>>
>>107072262
200B qwen 3 VL is great for captioning nsfw, just a simple JB / prefill is all you need
>>
>>107072262
>adult
That's a last century concept. There are no adults anymore. Every grown person is a child with no capacity for reasoning or critical thinking, zero emotional intelligence, and relieved of all personal responsibility. We need to be protected for our own good, Anon.
>>
>>107072391
>There are no adults anymore
There have never been.
>>
>>107072391
Perfect. It's better for people to rely on the nanny state.
>>
>>107072432
Coal mines unironically made adults from kids.
>>
>>107072825
For 80 years, we've not had a good war
>>
>>107072846
For 80 years, there has been no dignity in war. Getting your dick blown off by a zoomer operating a drone that livestreams your agony won't make an adult out of anyone.
>>
>>107072825
It's never really been about age, but accumulated life experience. Who's more adult: a 12 year old soldier from Congo, a 20 year old college student from LA, or a 40 year old neet from Tokyo who never left his house past middle school? Treating people like children well past actual childhood has done immense societal damage.
>>
>>107072140
>I’ve also seen self-proclaimed senior programmers take more than 10-15 minutes to write a solution.
I'm like that. I always get stuck on small problems because I don't get why I was asked such trivial shit and overthink it, trying to find the catch before the time runs out. I'm good at complex problems when I can sleep on it and find a solution the next day
>>
>>107072971
Same. I tell people that I think good, but not fast.
>>
AI has stalled because we've run out of new data
2024 was the last year where you could have obtained untainted data
>>
>>107072140
Boomer article.
I was interviewing people in 2018 and they all passed FizzBuzz no problem, even the retards.
>>
>>107071747
>frenchfag
Lmao try Paris
>>
File: 1745982063669231.png (389 KB, 489x514)
Will aliens on 3I/Atlas give us better AI tech?
>>
>>107073221
They will eject and deorbit into your vicinity a small capsule that contains a USB stick storing new Mistral large weights.
>>
>>107073238
blessed ayyz
imagine if they dropped some simple technology trvke that allowed us to rapidly 100x VRAM/CPU/GPU densities
>>
>>107070238
I simply live with the rats
>>
What platform or app can I use to generate scientific texts and explore knowledge with ai, while being able to provide my own api location?

Self hosting is preferred.
An android interface or mobile-compatible website is a requirement.
>>
>>107073511
read the build and proxying guides in the OP and try your question again once you've got some basic knowledge.
Self-hosting and accessing a secure web interface from your phone over a self-hosted VPN is a common mode of operation
>>
>>107073511
lmstudio
mikupad
llama.cpp
kobold.cpp
google these, or read the op
>>
File: 1000034701.jpg (781 KB, 3600x2700)
checking in after i dont know how long
anything better than largestral and deepsneed yet?
>>
>>107073605
gemma 4 soon
>>
>>107073605
>anything better than largestral and deepsneed yet?
for what purpose?
>>
has anyone trained a local model on /g/?

I would unironically use the shit out of that.
>>
>>107073652
Cancelled
>>
>>107073756
trained on /pol/ the day the safetyfags began to screech https://en.wikipedia.org/wiki/GPT4-Chan
>>
>>107073756
You can make your own.
>https://github.com/Named666/AlphaAnon
>https://huggingface.co/theantichrist/Alpha-Anon-V01-135M
>>
>>107072338
>200B model to fucking caption images
I hope that's satire
>>
>>107073807
this is fucking sick. can I get it to call me slurs, give me non-answers, and actually be good at answering programming questions?

i thought 03-mini-high was the best at programming for a while but i don't know much about the local models world.
>>
>>107073677
storytelling/rp/similar creative work
i know the slop phrases can't be escaped but it was easiest to ban them out on largestral, and it always showed me the best understanding of the scene and context
>>
>>107073851
>can I get it to call me slurs, give me non-answers, and actually be good at answering programming questions?
two outta three ain't bad
>>
>>107073851
>can I get it to
>135m
if you can get it to produce a coherent sentence you'll be doing pretty good
>>
>>107073927
>>107073904
I guess I just have to read the op and fuck around and find out now...
>>
>>107073851
You can plug other models.
>>
>>107073851
Just run a good model and lrn2prompt, you can have it behave however you might imagine, mostly
>>107073605
love pic
>>
>>107074052
>>107074052
>>107074052
>>
>>107072987
I have a feeling you think neither good nor fast but are just telling that to yourself to sleep better at night
it's called: a cope
>>
>>107074297
>it's called: a cope
>: a cope
>it's called:
>:
>>
>>107073104
>AI has stalled because we've run out of new data
>2024 was the last year where you could have obtained untainted data
LLMs are far, far better in real use than in 2024, because a lot of high quality synth data can make them follow instructions better. Today I can translate 6K tokens worth of UI strings (added some more strings to my testbed json) in a single go, without chunking, with a 4B LLM (qwen). The output isn't perfect, but it's actually quite decent in some language pairs like English<->French. 6K tokens in, 6K tokens out, no chunking, one shot.
Let that sink in.
Your 2024 LLM, the SOTA online models, could barely handle 4K tokens.
Today's true SOTA is models like Gemini that, while not as good as the 1 million advertised, can ingest so much more than anything from before that they finally became practical to use without a ton of rag-cope and context micro management which no sane person would want to deal with.

I am looking forward to Gemini 3, Gemma 4 and Qwen 4 next year.
>>
>>107074349
>I am looking forward toward [censored slop], [censored slop] and [censored slop] next year.



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.