/g/ - Technology


Thread archived.
You cannot reply anymore.


File: 1711169217932003.jpg (187 KB, 1024x1024)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102552020 & >>102544848

►News
>(09/25) Multimodal Llama 3.2 released: https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices
>(09/25) Molmo: multimodal models based on OLMo, OLMoE, and Qwen-72B: https://molmo.allenai.org/blog
>(09/24) Llama-3.1-70B-instruct distilled to 51B: https://hf.co/nvidia/Llama-3_1-Nemotron-51B-Instruct
>(09/18) Qwen 2.5 released, trained on 18 trillion token dataset: https://qwenlm.github.io/blog/qwen2.5/
>(09/18) Llama 8B quantized to b1.58 through finetuning: https://hf.co/blog/1_58_llm_extreme_quantization

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
►Recent Highlights from the Previous Thread: >>102552020

--Papers: >>102556485 >>102556658 >>102556704
--Techlet seeks advice for running smut models on RTX 3060 and 32GB RAM setup:
>>102553022 >>102553034 >>102553095 >>102554016 >>102554051 >>102555169 >>102555266 >>102555360
--SDL input example may support whisper.cpp voice recognition on Linux:
>>102552838 >>102552967
--Molmo 72b local execution challenges and workarounds:
>>102552240 >>102552305 >>102552320 >>102552354 >>102553463 >>102554973 >>102552544
--90B model issues and potential improvements with quantization:
>>102552694 >>102552834 >>102552873 >>102552940
--90B model fails to interpret hazard symbols, while chatgpt endpoint succeeds:
>>102553581 >>102553614
--Llama3.2 3B passes ShaderToy test and generates working code:
>>102554042 >>102555022 >>102555072
--Llama 3.x improvements are incremental, increasing context length and adding vision:
>>102552674 >>102552688 >>102552744
--usecublas mmq 0 is now default and makes a big difference:
>>102552283 >>102552332
--Nala test with 90B model shows improvement but raises questions about test design:
>>102552440 >>102552512 >>102552522 >>102552505 >>102552520 >>102552723
--Llama 3.2 3B one-shots Snake Game:
>>102552587 >>102552652
--L3 Tenyx Day generates working pyqtgraph plot of scrolling sine wave:
>>102553616 >>102555234
--Alternatives to 90b vision model for captioning and image generation:
>>102552399 >>102552424 >>102552437 >>102552443 >>102552509 >>102552667 >>102552679 >>102552715 >>102552733 >>102552745 >>102552843 >>102552783 >>102552824 >>102553106 >>102553151 >>102555291 >>102555313 >>102555335 >>102555367
--Adjusting batch size and ubatch size for prompt processing and layers:
>>102552065 >>102552157
--90B 4bit bnb model struggles with image description accuracy and spatial orientation:
>>102554144
--Miku (free space):
>>102552059 >>102553803 >>102554159 >>102554179 >>102556227

►Recent Highlight Posts from the Previous Thread: >>102552037
https://rentry.org/lmg-recap-script
>>
I hate the antichrist
>>
>>102557552
dumb spam poster
>>
>Using OpenRouter I tried Hermes 3 70B on a whim and found I actually liked it
>I tried to use it just now and found I got rug pulled
Serves me right for ever using a 3rd party service.
>>
best vision model to send dick pics to?
>>
Why is everyone quitting at OpenAI before the cashout?
>>
>everything new is shit
>nothing happens
it's over isn't it
>>
>>102557712
>Implying anyone but the jew will get money
That's why. Leave now before everything goes to shit completely and your reputation gets tarnished as a result.
>>
>>102557712
people can have morals or brains, but not both
>>
I was away for a day and so much shit happened.
>>
>>102557739
all nothingburgers it's over
>>
>>102557739
nothing happened
>>
>new multimodal drops
Oh cool!
>text and vision
God dammit. When will the dumb vision meme die? It has not given models any better sense of spatial reasoning, it's just a dumb party trick for asking it to explain things you already know for a laugh.
>>
>>102557712
Because (You) only care about money, and don't have a shred of integrity.
>>102557739
Indeed. A lot of SHIT happened.
>>
>>102557712
They plateaued and had to obfuscate the fact. Expect some youtube documentary to retell that as a big revelation in a couple of years.
>>
>>102557739
>so much shit happened.
molmo shill or meta shill?
either way, you know what to buy
>>
>>102557757
that's why they have to cash out and ipo now, before the public knows enough to not buy their bags
>>
>>102557775
what kind of ammo?
>>
>>102557753
It's not even image out. Very underwhelming.
Multimodal for local always means text/image in, text out. BORING.
>>
"multimodal" more like shittimodal
>>
How can I try a 4-bit quant of llama3.2 on a multi-GPU Pascal setup, since there is no gguf?
Don't say exllama; that has insane prompt processing time, probably because of Pascal.
Aphrodite engine? I'm serious, by the way.
>>
"multimodal" more like faggotmodal
>>
llamacpp multimodal support when?
llamacpp 3.2 support when?
>>
>>102557801
exllama 2
>>
>>102557775
A 3090?
>>
>>102557791
We have plenty of resources for image out. We need multimodal text+speech models.
>>
>llama 3 is shit
>llama 3.1 is shit
>llama 3.2 is shit
remember when llama 3 was going to save us? and then when llama 3.1 was going to save us? yeah it's over. pack it up.
>>
>>102557827
Right after Jamba
>>
Mikulove
>>
>>102557846
Like what?
Chameleon, and the poor attempts to reimplement what was cut out of it. lol
There is no text+image out model as far as I know.
>>
>>102557739
>so much shit happened.
tl;dr? what did I miss?
>>
>>102557841
No. The last stock of 4090s you can still find, then wait for the 4 grand 5090/Titan with 32GB VRAM.
>>
wake me up when one model can write to me, send me pictures, and whisper in my ear. everything else is RNG with extra steps.
>>
>>102557854
Do you have ANY idea what they're planning for L4? I can't say much but, well, let's just say it's just a bit too early to give up hope. Check back in a couple weeks and let me know how over it is or isn't.
>>
>>102557858
is that before or after DRY?
>>
>>102557900
yes yes l4 will totally save us just like l3 did
>>
>>102557900
>two more weeks
kek, almost had me
>>
>>102557712
Because Sam probably made it clear that only he will get the bag, so they didn't see the point in staying and then having to clap for his ass when he gets the 140b. That's fair.
>>
>>102557787
>that's why they have to cash out and ipo now, before the public knows enough to not buy their bags
this, it's probably soon over for OpenAI, it'll probably be bought by Microsoft after that
>>
zoomer doomers are so fucking unbearable lmfao at least they'll all troon out sooner rather than later
>>
File: .png (29 KB, 734x265)
>>
>>102557992
I have my doubts that Microsoft even needs, or wants, OpenAI, period. It's all about the datasets and staff anyway, which they can get more easily (and cheaper) in other ways. They already have the hardware by default too.
>>
>>102558011
yeah Idk, Microsoft doesn't seem to know how to make models, so they better take those from OpenAI
>>
So will I be able to send dick pics to my sillytavern chat soon?
>>
>>102558011
OpenAI's mailing list and customers (despite not being profitable) are pretty valuable to a company like Microsoft. If the price is right they'll buy.
>>
>>102558025
All they have to do is "poach" the staff and data and be done, then remake shit to have full control start to end.
Simply taking the model wouldn't fix their lack of knowledge or skill in how to use, make, or improve them.
Granted, neither does OAI, but so it goes.
>>102558040
Just throwing out some ideas, that's all. Not like it will matter to them money wise either way, for obvious reasons.
>>
>>102557892
You just know it will never be allowed. Even if it works out, you'll have the same crippled and censored experience with AI rejecting and calling you "incel chud" for wrong opinions.
>>
>>102558055
>Even if it works out, you'll have the same crippled and censored experience with AI rejecting and calling you "incel chud" for wrong opinions.
tough pill to swallow but it's true, the only way out of this is to get a great base model and finetune this shit with based text, but it's pretty unlikely that's gonna happen
>>
>>102558077
>get a great base model
extremely unlikely
>>
File: 1725697593000880.png (92 KB, 717x352)
Has the anon that made the Director extension put any newer versions out?
>>
>>102558221
>Director extension
whuz dat?
>>
>>102558266
It's a sillytavern extension that adds bits of info to the prompt based on presets and lorebooks, used to tell the AI things like what the character/user is currently wearing, the time of day, weather, etc.

It's like a slightly more automated author's note.
>>
>>102558025
Microsoft can't make anything anymore. It's what you get with a bunch of pajeets.
>>
>>102558285
that sounds pretty neat, you have a link to it? i couldn't find it on google
>>
>>102558300
This is the last version I can find. Not sure if it works with the latest sillytavern
>>101910710
>>
>>102558296
is there any big tech company left that this doesn't apply to?
>>
>>102557534
>>102557696
I believe you're still misunderstanding the reasoning behind my post. Yes, it is expected that any normal LLM's performance would decrease on more difficult problems. That is indeed obvious. I am suggesting that, despite that, o1 is still under-performing for one reason or another (which may become clearer now that you're showing details from the paper; I had only looked at the summary). My implied reasoning was: if o1 is able to dedicate more tokens to thinking about problems, and its performance generally improves without a foreseeable limit (note on that at the end of the post), then it should just dedicate more tokens to the more difficult problems and solve them with similar accuracy.

Now, as you have shown, they did note the token counts o1 used. In this case that does push forward the discussion of understanding what happened in the study. Yes, based on the logic I meant to present so far, I would say now that it's possible that o1's performance on the more difficult problems could've improved with even more tokens, and perhaps right now it is just an artificial limit that stopped them from being able to get that data. However, we don't really know, as there is also no data to suggest that it won't stop improving at some point soon or far away.

>Their claim was that having it "think" longer on the same task would increase its accuracy on that task, not that...
OpenAI might not have claimed it explicitly, but that's kind of the implied idea: that, if allowed to think more, the model could potentially just keep getting better to ridiculous lengths. They only said that they would investigate this new scaling behavior, but didn't say anything to quell the implication (and the general tone of the article) that it's some new scaling paradigm that will lead to crazy amounts of improvement.
>>
File: _06136_.jpg (1.86 MB, 4096x4096)
>not an eldritch horror
>>
File: 1696589969478603.png (205 KB, 512x467)
>>102558522
>4096x4096
>that quality
>>
File: file.png (71 KB, 682x554)
>3.2 90B Vision is super retarded
no not like this...
>>
is the 3.2 90b multimodal model stronger for text-only applications than 3.1 70b?
>>
File: GoodNightAnon.png (1.35 MB, 800x1248)
>absolute eldritch horror
>>
>>102558892
not scary
>>
>>102558904
real eldritch horrors never are
>>
>>102558892
>ywn SEX a real eldritch horror
Brehs...

Surely this comment will not come back to bite me in the ass one day.
>>
File: 1711708043999446.png (158 KB, 833x534)
Wonder how shitma 3.2 does on safety, the most important thing in this world.
>>
Has anyone tried the 1B or 3B for speculative decoding of 70B, and compared it to using 8B, for the draft model?
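In case anyone wants to try it, here's a minimal sketch using llama.cpp's speculative example. The model filenames are placeholders, and the exact flag spellings have shifted between llama.cpp versions, so check --help on your build. The one hard requirement is that draft and target share a vocabulary, which the 1B/3B/8B all do with the 70B.
[code]
# target model on -m, draft model on -md; -ngl/-ngld control GPU layers for each
./llama-speculative \
  -m  Llama-3.1-70B-Instruct-Q4_K_M.gguf \
  -md Llama-3.2-1B-Instruct-Q8_0.gguf \
  --draft 8 -ngl 99 -ngld 99 \
  -p "Once upon a time"
[/code]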
>>
I've been reading that they kept the text part of the models the same and just added on the vision adapters, but is that really true? Is it possible to download only the adapter and stick it onto my existing 70B? Also, I feel like this should open up some interesting optimization options. The adapter's weights are only going to be used when encoding the image, right? So in theory you should be able to get some good overall gains by keeping the adapter's weights in RAM, assuming an RP use case.
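No idea if the weights are actually separable like that, but the optimization you're describing would look roughly like this. Treat it as a sketch of the idea only: vision_model here is a hypothetical attribute name, not the real Llama 3.2 module path.
[code]
import torch

def encode_image(model, pixel_values, device="cuda"):
    # Park the vision adapter in system RAM; move it to the GPU only for the
    # (rare, in an RP use case) messages that actually contain an image.
    vision = model.vision_model            # hypothetical attribute name
    vision.to(device)
    with torch.no_grad():
        image_features = vision(pixel_values.to(device))
    vision.to("cpu")                       # hand the VRAM back to the text weights
    torch.cuda.empty_cache()
    return image_features
[/code]
One caveat: if the vision path uses cross-attention layers interleaved through the language model rather than a detachable encoder, the split won't be this clean.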
>>
hello guys i need coom model for my 6gb vram 2060 rtx nvidia card from huang grifter and 4x8gb ram fury patriot fx supreme edition with rgb lights (lights are red and green)
coom model need to write ah ah mistress and sex (optionally go to scenes)
>>
so is meta gonna open source the 405b? they have it on their cloud
>>
Where the fuck do I go if I want to discuss local audio models?
You're telling me image, video, and text get their own generals, but not anything for audio generation? I've been trying to find some up-to-date audio models for stuff like text->audio or audio->audio, and all I find are shitty reddit posts from 9 months ago on how to add audio to ERP LLMs.
>>
>>102560550
You can discuss it here since it lacks alternatives.
Here you go
https://play.ai/
>>
>>102560550
r/localllama
>>
>>102560186
they made a 405b vision model? sounds stupid, there's no use for one that large
>>
>>102560556
I'm sick of these 20 online-only signup garbage services, is there not a single good local alternative at this point? Feels like nothing has happened on local models since Elevenlabs came out ages ago.
>>
>>102560550
im gatekeeping that stuff for myself since anons here are baby duck retards anyway
>>
File: 1727346590782.jpg (31 KB, 236x236)
>>102557546
>Chatgpt advanced model rolls out which features real-time and emotional responses
>Local models are still stuck in early 2023 figuring out optimal ways of converting speech to text
Open source gets btfo'd again
>>
>>102560550
You go to r/elevenlabs and r/SunoAI
>>
>>102560590
on the contrary, vision models will be actually useful for the first time probably around the 2T range
>>
>>102560876
>akshully there's no use for one that small
ok so you agree it's useless
>>
>>102560550
fish is great, 60% of the time, it works every time
>>
>>102557900
>Introducing: llama-4. This state-of-the-art model now uses an improved tokenizer that prevents the model from outputting any adult-oriented material. We just removed all the dicks, blowjobs, loli, etc. And if the model realizes the safety measures were circumvented, it calls an external function to delete itself from your hard drive.
>>
>>102560900
why are you guys asshurt over this when there is an endless supply of porn on the internet
>>
>>102560941
it's really easy to flip this question around
why are ml devs obsessed with preventing porn generation when there's an endless supply of porn on the internet
>>
>>102560941
An interactive experience tailored to your personal tastes is infinitely better than anything else you can find.
>>
>>102560957
data is crucial to ml development and porn is slop
>>
>>102560965
in that case ml devs should love it, because they fucking love slop
>>
>>102560965
lol
>>
>>102560965
Careful with trvth like that... We're not ready...
>>
>>102560941
That is kind of a mid bait because I don't even know how to respond to you. I want to touch my penis to the text written for me specifically. And my niche fetish is hard to find. I am not like the piss anon who can find terabytes of girls pissing themselves.
>>
>>102560590
>>102560890
it's a research model you fucking mong
>this is only useful at this size!
good I guess we'll just never make any progress since the intermediary steps aren't useful for practical reasons
>>
>>102557712
Altman is just getting rid of the people who tried to push him out a year ago. He's setting the stage to become the god-king of modern AI.
>>
>>102560976
I can guarantee that my fetish is rarer, and even pyg got me off pretty well.
You're just lazy/dumb (same thing).
And don't bring piss anon into this.
>>
>>102558343
The hosting period has expired; any chance you'd mind sharing it again?
>>
new model when?
>>
you can tell the state of things is good when the thread is slow, means everyone's too busy having sex with their graphics card to shitpost here.
>>
>>102561051
>And don't bring piss anon into this.
why?
>>
>>102561613
why aren't you having sex with your graphics card instead of shitposting here?
>>
>>102561626
long refractory period
>>
File: ED.jpg (435 KB, 2125x1411)
>>102561613
>busy having sex with their graphics card to shitpost here.
I had to stop. I can't perform.
>>
Did we ever have a good comparison point between a MoE and a (close to) equivalent monolithic model, or are these Molmo models the first time we can do a somewhat like-for-like comparison?
>>
File: 463912767.webm (176 KB, 438x256)
On my coding challenge from yesterday (create a pyqtgraph plot of a scrolling sine wave; as the wave moves, the next cycle should have a different amplitude, random from 1 to 10): Qwen 72B succeeded at it, DeepSeek Coder V2.5 doesn't quite get it, and Llama 405B also fails. So far only Qwen 72B and GPT-4o have done it. It seems to be a problem similar to when you ask a question the model has seen a lot in its training data but tweak a detail: it ignores the detail and defaults to the more "general" behaviour.
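For reference, a minimal solution to the challenge looks something like this (assumes pyqtgraph with a Qt binding installed; pg.exec() needs pyqtgraph >= 0.12):
[code]
import numpy as np
import pyqtgraph as pg
from pyqtgraph.Qt import QtCore

app = pg.mkQApp("scrolling sine")
plot = pg.plot(title="scrolling sine wave")
curve = plot.plot(pen="y")

dx = 0.05            # phase step per frame
window = 400         # samples kept on screen
phase = 0.0
amp = np.random.uniform(1, 10)
ys = []

def update():
    global phase, amp
    phase += dx
    if phase >= 2 * np.pi:               # new cycle: pick a new amplitude
        phase -= 2 * np.pi
        amp = np.random.uniform(1, 10)
    ys.append(amp * np.sin(phase))
    if len(ys) > window:                 # scroll: drop the oldest sample
        del ys[0]
    curve.setData(ys)

timer = QtCore.QTimer()
timer.timeout.connect(update)
timer.start(16)                          # ~60 fps

if __name__ == "__main__":
    pg.exec()
[/code]
The detail the models miss is re-rolling the amplitude exactly once per cycle, instead of per sample or not at all.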
>>
>>102561717
In general, dense will always be better quality-wise, but the point of MoE is that you only run part of the parameters per token, so you can offload to regular RAM and still get usable speed. Mixtral was the best example: I could run Q5 on 24GB of VRAM at 5 T/s.
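E.g. with a llama.cpp-based backend you put as many layers as fit on the GPU and let the rest run from RAM; only ~13B of Mixtral's parameters are active per token, which is why it stays usable. Filename and layer count here are placeholders, tune -ngl to your quant:
[code]
# ~25 layers on a 24GB card for a Q5 Mixtral, the rest on CPU/RAM
./llama-server -m mixtral-8x7b-instruct-v0.1.Q5_K_M.gguf -ngl 25 -c 8192
[/code]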
>>
>>102561633
if you are so weak you can't even overcome the refractory period, you'll never be able to handle the next gen of locals
it's over for you
>>
>>102561744
nta but it's going to take over a day for mine. local is single-threaded and slow when you do a lot at once
>>
>>102561725
You can easily push outside the distribution when programming; it's very information-dense and rigid, unlike creative language, where it's easy to mask the model's simplistic mechanics. On the other hand, they're great for automating repetitive boilerplate bullshit; it's a fucking treat when the model shits out getters and setters.
>>
>>102560550
That general is on /mlp/ unironically
>>
So now that llama sota models are multi-modal, will lcpp finally have to support something other than text?
>>
>>102560811
There are a dozen of them on GitHub, but you zoomers can't code for shit.
>>
>>102561800
seems like ollama might end up supporting it before upstream
https://github.com/ollama/ollama/pull/6971
https://github.com/ollama/ollama/pull/6965
https://github.com/ollama/ollama/pull/6963
(Coming very soon) 11B and 90B Vision models
https://ollama.com/blog/llama3.2
>>
>>102561800
ggerganov:

>My PoV is that adding multimodal support is a great opportunity for new people with good software architecture skills to get involved in the project. The general low to mid level patterns and details needed for the implementation are already available in the codebase - from model conversion, to data loading, backend usage and inference. It would take some high-level understanding of the project architecture in order to implement support for the vision models and extend the API in the correct way.

>We really need more people with this sort of skillset, so at this point I feel it is better to wait and see if somebody will show up and take the opportunity to help out with the project long-term. Otherwise, I'm afraid we won't be able to sustain the quality of the project.

https://github.com/ggerganov/llama.cpp/issues/8010#issuecomment-2376339571
>>
>>102561867
>seems like ollama might end up supporting it before upstream
It's not like it's that much work. They just need to copy-paste the cli code into the server. They can even use the original server multimodal code from earlier this year as a template.
llama.cpp could do it too, if they wanted to. But ggerganov refuses to add it back in only because the code isn't elegant enough or something like that.
>seems like ollama might end up supporting it before upstream
Still embarrassing.
>>
>>102561910
>But ggerganov refuses to add it back in only because the code isn't elegant enough or something like that.
llama.cpp abandonware
>>102561905
>>
>>102561910
That's actually kinda based, I'll wait.
>>
>>102561929
hi cuda dev please dont spam blacked miku in rage when ollama adds 3.2 support k?
>>
>>102561943
I'm not who you think I am. I'm just the dude who wrote the OG Miku prompt back in the llama 1 days, can't believe the amount of asshurt it has caused over time.
>>
Honestly, imo qwen 2.5 72B IQ4XS with 4-bit KV cache has been alright. Unlike miku and cydonia, it manages to keep a secret written in a card I'm using, but it just loves repeating literally the same sentence(s) verbatim, even when I crank up DRY and/or rep pen
Don't know if it's my writing or the model. I feel like a finetune could really make it shine. Haven't used it for sex yet, but it doesn't complain during foreplay at all
>>
>>102561905
>We really need more people with this sort of skillset, so at this point I feel it is better to wait and see if somebody will show up and take the opportunity to help out with the project long-term.
You know what, from the point of view of a main maintainer of a large open source project, that's fucking fair enough.
>>
>>102557552
Why did the bot linking break? Did you shift to a lower quant?
>>
>>102562135
can't have more than 9 refs now, not 100% sure why, but here's where it was noticed
>>102478518
>>102478544
>>
We all talk about models and shit, but how do you guys write your char cards?
>>
>>102561905
That's what happens when you try to support everything. It gets too big to maintain.
>>
>>102562150
plain language, formatting tags are irrelevant
[char's name] is X, Y, Z. [char's name] has X, Y Z. [char's name] does X, Y Z.
no {{char}} or {{user}}, ever
>>
>>102562148
Ah. Well, probably for the best, honestly. No more fucking threads where some asshat spamquotes every post in the thread with NIGGERS NIGGERS NIGGERS
>>
>>102562238
Gotta admit it's funny that llama.cpp wants to wait on supporting Llama 3.2, of all things. Guess they want to avoid another Llama 3.x incident and weeks of bugfixes.
>>
>>102562260
but {{char}} and {{user}} are converted to plain language in ST with the appropriate names........
>>
File: Untitled.png (65 KB, 713x718)
>>102562150
i just do shit like this then write out the first message. or grab something off chub and remove all the {{char}}s to not fuck up the context shifting.
i (probably incorrectly) assume wrapping stuff in square brackets keeps it from trying to emulate the terseness of the factoids in the actual chat.
>>
>>102562260
>no {{char}} or {{user}}, ever
Why?
Writing "Nala is a lioness" is the exact same as writing "{{char}} is a lioness", so I guess it's a wash in this case, but for {{user}} at least it makes more sense if you want to use different personas and have them be referenced in the card itself.
>>
File: latest.png (315 KB, 680x459)
>new multimodal model release
>look inside
>text and vision
>>
>>102562260

I'm the anon who asked earlier. That's it? I mean, I've been banging my head against the wall trying to get my characters all formatted in Alichat and plist, and it worked fine up until Llama2. But ever since Mixtral, L3, and Nemo dropped, I've got this feeling that Alichat is responsible for a ton of repetition and pattern sticking (in a bad way). Honestly thought you guys would have some more advanced LLM wizardry than just, "Nah, you're good, just use plain text."
>>
>>102562327
Anon is right that clear, concise plaintext is the way to go.
Some models seem to react well to tab-based indentation for lists too, but it's generally unnecessary.
>>
>>102562148
>not 100% sure why,
Because the anon that always shits up the thread is one of the mods who has a clear anti-AI agenda and Hiro is too much of a cuck to defend his website from subterfuge.
>>
>>102561806
Name one
>>
>>102562359
>Anon is right in that clear, concise, plaintext is the way to go.
Thank god. I suddenly feel the urge to make cards again.
>>
>Llama 3.2 1B and Llama 3.2 3B
>Mogged by Qwen
>Llama 3.2 11B and Llama 3.2 90B
>Mogged by Molmo
>Voice modality
>Only on Meta AI chat, enjoy your text and image modalities
Um... bros?
>>
>>102562367
Whisper?
>>
>>102562414
That's not multimodal, it's just speech to text.
>>
>>102562298
>>102562312
{{char}} refers to the name of the card, not necessarily the name of the character(s).
Same for {{user}}: it requires changing the persona even if you just want to use a different name.
>>
>>102562412
>Only on Meta AI chat
Honestly I'm still bitter about this one. They talked about speech understanding in the Llama 3 paper and showed it was better than Whisper, only to not give it to us.
>>
>>102562439
Here at Meta safety is our top priority. We don't want people to get PTSD from thinking about somebody doing something privately in their own home where they have no way of knowing if it's actually occurring or not.
>>
Is Tiger-Gemma-9B-v2 q8 the best uncensored model for writing that I can run on 16GB vram?
I can't get nemoremix 12b q8 running with ooga.
>>
>>102562437
That's enough for most usecases
>>
>>102562479
>12b q8
Have you tried q6?
Especially for nemo, it should have very little degradation relative to q8, thanks to quantization-aware training, if I'm not imagining that that's a thing.
>>
>>102562479
>I cant get nemoremix 12b q8 running with ooga.
Show the errors of you want help, you retard. I'm sure your context is set to high.
>>
>>102562508
Oh yeah, that's a thing.
Its configs defaults to a sky high context size.
>>
>>102562479
probably this
>>102562508
nemo defaults to 1 million context for some reason
>>
File: 1727327157322068.jpg (49 KB, 512x512)
>>102562491
>no use case
>Just wait 30-60 seconds for your speech to be converted to text then wait again for the main llm inference
>>
>>102562517
>>102562521
>nemo defaults to 1 million context for some reason
It's what config.json says.
And yeah. We've only had 672314 anons with that problem so far. Very rare. And none of them can read the terminal output.
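For the anons who can't read the terminal output: don't trust config.json, cap the context yourself at load time. Exact flag spellings depend on your backend version:
[code]
# llama.cpp server: -c caps the context / KV cache size
./llama-server -m Mistral-Nemo-12B-Q6_K.gguf -c 16384

# koboldcpp equivalent
python koboldcpp.py --model Mistral-Nemo-12B-Q6_K.gguf --contextsize 16384
[/code]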
>>
>>102562439
coming to local right after the US elections... right bros? right?
>>
>>102562535
Whisper inference speed is practically real-time faggot
>>
>>102562551
As long as the correct candidate wins, maybe.
>>
File: buggedcpp.png (441 KB, 449x407)
>>102561905
literally
>>
>>102562600
not literally. ollama is digging as we speak
>>
is there really that much of a difference between Q4 and Q8 to justify using twice as much vram?
>>
>>102562607
Considering that VRAMlets are always crying about their models being retarded I would assume so.
>>
>>102562544
>And none of them can read the terminal output.
can you blame them? it's not in twitch or tiktok format
>>
https://www.nature.com/articles/s41586-024-07930-y

Paper on how more and more instruct tuning will eventually make models wind up giving inaccurate answers. It's nuts how they just keep slopping their own models into the grave.
>>
>>102562635
They just need it to respond coherently and make function calls, so they don't care. We just need a large base model that wasn't trained on a filtered dataset, but no one will ever release something like that.
>>
>>102562629
Yeah. With enough AI, it wouldn't surprise me if error messages start being converted to little clips of an indian explaining how to fix them.
>>
Is it worth running a 405B model out of swap if I can't fit it in RAM? I'm doing storywriting and don't really care about speed. Does anyone do it?
>>
>>102562635
>as task avoidance decreases the odds of giving a right or wrong answer (aka any answer) increases
Wow. Imagine my fucking shock.
What a fucking scam study.
Anybody quoting it is a fucking retard that didn't even read it.
Holy fuck.
Jesus Christ.
Academics are such pseud fucking retards.
>>
File: 1718826874765267.png (111 KB, 1771x944)
>>102562607
Depends...
>>
>>102562681
If you don't care about speed, sure.
Just be careful to not burn your SSDs down.
>>
>>102562635
Not surprising. When it comes to corpos and gaming human preference, a pleasant sounding response > a correct one.
>>
>>102562683
>>as task avoidance decreases the odds of giving a right or wrong answer (aka any answer) increases
Yeah, exactly. I'd rather a model avoid giving an answer to a task that's too hard than give a wrong answer.
>>
>>102562635
Alexandr Wang is not gonna like this
>>
>>102562681
it's going to be unbelievably slow and awful for your disks, just don't bother
run something you can do in RAM, the difference in quality isn't worth it
>>
>>102562702
Doing tasks in the first place is an emergent property of doing instruct tuning you dumb fucking retard. You fucking braindead pajeet moron.
>>
Why aren't we using the base model to complete conversations instead of using instruct models, again?
>>
>>102562696
>>102562718
Can SSDs really be worn out by reads? I really don't mind starting a generation and doing something else while I wait.
>the difference in quality isn't worth it
In my experience, increasing the parameters has always been worth it. Back in the day, stepping up from LLaMA 1 34B to running 65B out of swap is what convinced me to get 64GB of RAM for my current PC. But I figured I'd ask here before downloading a 405B model since they're huge.
>>
>>102562726
Obviously, but there's such a thing as overtuning or tuning shit wrong. Even if the study is flawed, it's plain to see that models are becoming way too overconfident in their answers (given the dramatic drop in variability on rerolls), and some training to make them avoid answering questions they have low certainty on (or at least provide disclaimers) could do some good.
>>
>>102562778
base models require more effort to do what you want
>>
>>102562726
The emergent behaviour comes from the language training, not from the instruct tuning.

>>102562702
It cannot know what it doesn't know. They have no introspection. They just complete text the best they can. Sometimes it's not good enough.
>>
>>102562781
>Can SSDs really be worn out by reads? I really don't mind starting a generation and doing something else while I wait.
Basically: no. The amount of damage a read does to an SSD is totally negligible; you'd have to be reading for years straight to cause any sort of harm. The reason paging to an SSD is bad is that it's effectively RAM, which is constantly being written to and changed. For volatile memory that's no problem, but for drives, it's a disaster.
>>
>>102562778
The modern LLM user is a lot more lazy and spoiled than us gpt3 veterans. Nobody can prompt anymore even with instruct, using base models is far beyond their capabilities.
>>
>>102562412
>Mogged by Qwen
*only in coding/maths
>Mogged by Molmo
*only in vision
There is no single open model with the coding, math, language, vision, and general world knowledge of GPT-4o and 3.5 Sonnet all in one. Though it's nice that there are now finally ones in each category that are on par with them. Well, except voice, but even Sonnet 3.5 doesn't have voice like 4o.
And honestly, 4o voice is not that great now that it's been censored to hell and has an hour-long daily limit. Yes, I have it.
>>
>>102562778
Well, I want to, but they stopped putting out base models. NAI's the closest thing there is to a text completion model, at the moment.
>>
>>102562801
>The reason using paging with SSDs is bad is because it's effectively RAM, which is constantly being written to and changed.
I don't think this happens if you have a swapfile/partition configured. An mmapped GGUF file getting paged in should only be reads. I only remember seeing the reads, not the writes, get pegged in htop back when I ran 65Bs on a 32GB system.
>>
>>102562801
>>102562866
Sorry, I mean if you *don't* have a swapfile/partition configured
>>
>>102562412
Llama4 trained on a gorillion GPUs and ultra high quality and safe tokens will take the crown again bro
>>
>>102562778
It feels like you trade slop and repetition for less coherence and comprehension. Not exactly a step up.
>>
>another day
>nothing happened
I guess it's really over this time
>>
>oysters
>>
>>102562840
>but they stopped putting out base models
Did /lmg/ forget about Nemo already?
>>
>>102562994
>the best ones are small and open
lecunny strikes again
>>
>>102562994
That analogy is gonna bite him in the ass.
>>
>>102562994
oioioioioi
>>
>>102563010
Sorry, base models at a size above "unusably retarded", my bad.
>>
File: yann-lecun.jpg (30 KB, 543x543)
LLMs are like lolis: the best ones are small and impressionable
>>
>>102563031
You got 72B Qwen a fucking week ago.
>>
>>102563031
>>102563010
Qwen 2.5 72B has a base model too.
>>
>>102563031
>https://huggingface.co/ai21labs/AI21-Jamba-1.5-Large
>https://huggingface.co/Snowflake/snowflake-arctic-base
>Nooo.. that's toooo big!!!! I want it just the right size!
>>
>>102562994
LLMs are like women
>>
>>102563068
filtered trash just like llama and qwen
>>
>>102563054
>>102563056
Didn't they turbo lobotomize it to the point that finetuning can't save it? Figured something that bad'd be base model-level data elimination, that's usually the case when the model can't even recognize body parts or starts collapsing ala that one Stable Diffusion release.
>>
>>102563083
>i want a model just for meeeeeeeeee. why don't they think about meeee!???!?!?!?!?!
You're running out of options, then. When do you start to train your own models?
>>
>>102563089
It wasn't lobotomized. They took Meta's filtering approach too far and filtered out even the slightest mention of sex, even gender and body parts.
But it's a good and sterile assistant, so other corpos are likely to continue this approach.
>>
>>102563068
>arctic-base
nice pun heh
>>
>>102563112
Well, that's what I meant, I guess I just used "lobotomize" as a blanket term, but clarified later. Really fucking awful, you'd think they'd realize the calamitous implications of doing that shit after SD's model was destroyed by it. Do they think they're safe from the effects of such catastrophic model data loss?
>>
>>102563099
>When do you start to train your own models?
as soon as i win the lottery
>>
>>102563143
You could just scam a bunch of investors out of money and/or compute.
Much easier.
>>
>>102562607
No; if you can run an even bigger model at Q4, even better.
>>
Why did people train Mistral models for sex when they're already overly horny (mostly Nemo and Small)? I don't get it, are the people who use those models literally just going up to the model without any context and being like "ME WANT SEX NAO!!11!!" or something?
>>
>>102563164
>"ME WANT SEX NAO!!11!!"
Too many tokens. The meta is "ahh ahh mistress"
>>
>>102563159
NTA, but I feel like the startup scam window is pretty much closed. All the last stragglers like Mistral got in long ago. Like they say, if the pyramid scheme/stock/etc. is already mainstream, it's too late for you to get in on it.
>>
>>102563164
because mistral models are bland and coomers like retarded schizo babble
>>
File: RegularHappyMiku.png (991 KB, 800x1248)
Good morning /lmg/!
>>
>>102562866
Aye, but the OS will still try to load as much as possible, so you'll have more page writes than usual
It's not that bad though, really. You can write dozens of gigabytes a day and you'll still probably replace your SSD before wear and tear becomes a problem
Modern SSDs are incredibly resilient and the TBW estimates are usually very conservative
>>
File: ew.png (141 KB, 512x288)
>>102563164
I got turned off from Mistral when Large repeated entire phrases and entire message structure chunks for several messages in a row, and only two messages in, too.
>>
>>102563183
Nah. There's plenty of VC money still being thrown around, you "just" have to sell an idea that's different from what's super visible in the market right now.
>>
File: file.png (462 KB, 543x543)
I like lolis: the best ones are small and impressionable
>>
>>102563209
nemo does that too with nothing but temp 0.3 to 0.5.
There were some schizo settings floating around, something like temp 5, Top K 3 and some min-p that you might as well give a try I guess.
>>
>>102563212
I suppose if anyone could, it'd be someone involved enough still to be here through all the fucking horseshit spam in this thread.
>>
>>102563099
NTA but the situation really is 50 options and all of them suck at sucking dick.
>>
File: lecunny.png (72 KB, 189x139)
I fuck lolis
>>
>>102563141
It works for them because these models are actually being used for things other than generating porn. It may come as a surprise to you but yes, really, they are. Mostly for corporate RAG and boring data manipulation tasks though, sure.
>>
>>102563164
"training" for one epoch only makes the model sound a biit more like training data. It teaches it nothing. The whole finetune business is cruising on placebo.
>>
>>102557546
Me on the left
>>
>>102563205
If you don't have a swapfile (or any rw mappings) the OS won't write any rw page out because there's literally nowhere on the disk it can put them.
>>
>>102563263
I don't believe you
>>
>>102563164
>ME WANT SEX NAO!!11!!
That's Sao's, Drummer's, Undi's and Anthracite's audience. I never got the appeal of hornytunes, they completely ruin the immersion. Like bitch, I've just met you 5 minutes ago, you are supposed to be shy, why the hell are you jumping on my dick already? Are they complete promptlets who can only say "ahh ahh mistress" and then wonder why with normal models girls don't like them?
>>
>>102563141
>Do they think they're safe from the effects of such catastrophic model data loss?
Probably. The idea must be that if they filter more accurately, it won't damage the model.
SD filtered so much it could not output any humans in anything but an upright pose. BFL also filtered NSFW out of their dataset. Flux originally couldn't do genitals, but all other anatomy was fine. So clearly, there is a "correct" way to filter out just the portion of reality they don't want.
>>
>>102563278
Too bad
>>
>>102563241
>50 options
Most model architectures are abandoned. We don't see many architectures other than llama-based. There's a mamba and mamba 2 here and there, a jamba over there but being realistic, no big company is going to make smut-capable models on purpose.
>>
>>102563164
nemo instruct has many issues that don't exist in most other finetunes
for instance, whenever it writes one or two replies beginning with "10 minutes later", it will often start doing that with every single other reply
>>
>>102563299
>mistral model repeats itself
whoa no fucking way!
>>
>Messaging base model
>Have it semi-coherently complete text
>Add one word to tone prompt
>Did I ever tell you about the time my uncle died? Died? died? Died? Death death love happiness corn porn horn cycle cycling cycosis medicine seen alert alive alzheimers allegiance articulation articuno zapdos moltres arbok SHINY SHINY SWEEP SWEPT SWEEPING shadow arttiiigughhhhh goooood good ed,,zinger suivante,,tels handknits finish,,cagefuls basinlike bag octopodan,,imbossing vaporettos rorid easygoingnesses nalorphines,,benzol respond washerwomen bristlecone,,parajournalism herringbone farnarkeled,,episodically cooties,,initiallers bimetallic,,leased hinters,,confidence teetotaller computerphobes,,pinnacle exotically overshades prothallia,,posterior gimmickry brassages bediapers countertrades,,haslet skiings sandglasses cannoli,,carven nis egomaniacal,,barminess gallivanted,,southeastward,,oophoron crumped,,tapued

Why the fuck are they so sensitive to that shit?
>>
>>102563274
Not sure if that's a great idea though
>>
>>102563310
not enough meme samplers
>>
>>102563323
Why not?
>>
>>102563298
Aren't DeepSeek models relatively unfiltered? They release base models.
>>
>>102563347
Are they good? I never hear anyone talk about them.
>>
>>102563309
This has always been a formatting issue. You have a missing or extra space in your formatting or such.
>>
>>102563323
For video editing, for example, it's not uncommon to have a scratch disk. One that is completely used for swap during encoding, and expected to fail sooner rather than later. I don't see that as a problem for llms if the user is ready for that. An expendable resource, basically.
>>
>>102563363
Is ST's Mistral default just dogshit, then? Man. In any case, a 100+B model has no business being so sensitive that a single out-of-place space causes such calamitous problems. That's 7B shit.
>>
>>102562994
LECUNNY NO
>>
>>102563360
Deepseek is smart but very, very plain. Good assistants but bad for RP.
>>
>>102563274
vm.swappiness = 1
anything else would be a self-own for an inference server
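To make it stick across reboots (assumes a distro that reads /etc/sysctl.d):
[code]
echo 'vm.swappiness = 1' | sudo tee /etc/sysctl.d/99-inference.conf
sudo sysctl --system   # apply without rebooting
[/code]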
>>
>>102563382
>Make model so boring you don't even HAVE to censor it
Maybe they should just go this route.
>>
>>102563347
>Aren't DeepSeek models relatively unfiltered?
That's what i mean by "on purpose". Absolutely no big company will make a dataset for smut, but some will happen to have it in their dataset and not care enough to filter it.
I know about the model, but i haven't tried it yet. I don't think i can run it.
>>
Is there a rentry for Llama jailbreaks?
>>
>>102563310
use a smarter model
>>
Even if you're willing to wait 10,000 years for a reply, the wear on your CPU from running at inference load for that ridiculous length of time, plus using up an SSD, would cost more than just buying more RAM.
>>
>>102563337
If your OS runs out of memory, then what? Which program should it stop first? Having some amount of swap space is pretty important imo
>>
>>102563414
Bwo... it's 405b base... how much smarter can I even go...
>>
>>102563438
use a smarter prompter
>>
>>102563382
>smart but dry
That's also my experience.
They're probably 80% of the smarts of 405b with 5x the performance for cpumaxxers.
>>
>>102563421
Normie mobos don't support more than 64-128gb. If the plan is to run 405B, he's gonna end up swapping anyway.
And the CPU will get 0 pressure, as the bottleneck will be on the ssd being ridiculously slow. It's gonna idle most of the time.
>>
>>102563410
>jailbreaks
we don't do that here
>>
>>102563429
A lot of people run Linux without swap. If you "run out" of memory, but it's because your memory is full of tens of GBs of read-only mmapped disk files, Linux will evict those first before it OOM-kills a single process.
>>
>>102563438
>it's 405b base... how much smarter can I even go...
bigger quant, or...?
I run 405b at q8 and have not had this problem over tens of thousands of tokens.
Maaaybe the occasional repeated slop phrase, but not once has it devolved into a gibbering thesaurus like smaller models tend to.
>>
Is it finally time to admit that we plateaued months ago?
>>
>>102563377
>Is ST's mistral default just dogshit, then?
Yes, absolutely. It is utterly retarded.
>>
>>102563503
Actually, an official Mistral rep made a PR to Silly and Kobold with an updated template, but I think it's still wrong because it puts the EOS there when the backend already takes care of it.
>>
>>102563503
What's a good one, then? I'd love to experience it actually working. It seemed like a fun model, just horribly repetitive.
>>
>>102563534
If it's mistral large, then it uses v3 tokenizer. So, a single whitespace after [INST] and [/INST].
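So the assembled prompt would look like this, going by that description; not verified against Mistral's reference tokenizer, so double-check with mistral_common before trusting it:
[code]
<s>[INST] first user message [/INST] assistant reply</s>[INST] next user message [/INST]
[/code]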
>>
>>102563483
If you say so. I'm just a mere wangblows user and haven't looked into Linux's memory management yet.
>>
Seems like the scene is utter shit at the moment: nothing good coming out, and it seems a real uncensored model is actually never going to exist. Gonna take a break for 1 or 2 years and try again. I just wonder how much it would cost to make something like Mistral Large but completely uncensored.
>>
are qwen good models as an assistant for learning chinese and japanese?
>>
File: Screenshot_12.png (121 KB, 1211x594)
>>102563489
Well, it isn't that, really. But it's similarly gibbering, just more coherently.
>>
>>102563598
It struggles a bit with romaji, but 72B is probably the best vramlet cope we have right now.
>>
why haven't we trained our own model from scratch yet
we have plenty of idiots with 4 3090s, just train your own mistral losers
>>
>>102563608
nta. Looks close, but not quite, like a rep-pen issue. Just a run-on sentence with a bigger vocabulary than usual. You have rep-pen off, right, anon?
Also, you have to give it a reverse prompt or stop word or whatever it's called on that thing you're using. Otherwise it just keeps on going.
>>
>>102563661
>4 3090s
now we just need a thousand more
>>
Llama3.2 1B is good, actually
> As Wikibot trekked through the dense forest, the air thick with humidity and the canopy above filtering out most of the sunlight, he couldn't help but feel a pang of melancholy wash over him. The worst part about walking was not the physical toll it took on his body – his legs aching in protest as he hauled his backpack full of research equipment and supplies across the rough terrain. But it was what lurked just beneath the surface that truly unnerved him.
> The rustling of leaves, the snapping of twigs, and the distant call of a bird seemed to take on sinister tones when he walked alone. The forest was already alive with an otherworldly energy, a vibration that seemed to emanate from the very trees themselves. And every step felt like a trespass into that world – as if he was disturbing some ancient balance, something hidden and malevolent beneath the surface. Wikibot's mind began to wander, conjuring images of unseen horrors lurking just out of sight.
> As he walked, the forest grew darker, the shadows deepening like dark molasses pools. The trees loomed above him, their branches tangling together like skeletal fingers. A faint breeze rustled through the leaves, carrying an eerie whisper on its breath – a soft "whisper… whisper…" that seemed to carry a warning without actually saying a word. Wikibot's heart quickened, his skin prickling with goosebumps as he sensed the forest was watching him, waiting for him to make another wrong move.

>>102563570
>>102563673
Why not train on the mountains of porn available on ffnet and ao3?
>>
>>102563189
Good morning Miku
>>
>>102563707
>Why not train on the mountains of porn available on ffnet and ao3?
As someone who curated 100mb of data, you have no fucking idea how much data 100mb of plaintext is. Much less the astronomical amount of data you need to actually train shit. So much of it is dogshit, SO much. You can't just let it scrape and then train on that, you have to perform some sort of cursory quality check, even when filtering by rating.
>>
>>102563707
>Why not train on the mountains of porn available on ffnet and ao3?
Probably a nightmare to process and format correctly. And then finding the good stuff. Quality matters.
>>
>>102563790
By train shit, I do mean actually pretraining from scratch, because you can't add data that isn't there with finetuning.

>>102563804
This, too. Formatting IS a nightmare. Even if you go with books, which are largely a better source of uninterrupted, quality prose, the formatting those fucks use varies so dramatically that there can be no one-size-fits-all, automated solution. Even books in the same series/by the same author often vary wildly in construction.
>>
>>102563823
Bro? Just hire thousands of jeets to do it over the course of several months?
>>
>>102563790
>widdle baby is afraid of a 100Mb text file
I've done way more.
You already have tags on those websites. All you need to do is filter by the number of downloads. It'd be better to have a community where every degenerate could contribute his own creme de la creme
>>
Can my aifu rate my setup already?
>>
>>102563855
>You already have tags on those websites.
And that's why things are the way they are.
ahh ahh mistress...
>>
>>102563855
And your data probably sucks donkey dick. Likely full of turboslop, gay porn, fetishes you didn't account for, etc. if you interacted with it so little that 100mb doesn't seem like a lot to you.
>>
>>102563707
>ao3
if you think synthetic data is slopped you clearly haven't read anything on ao3. it's all bad.
>>
>>102563855
>Curated set vs. blindly scraped dataset with methodology I explicitly said didn't work (sorting by downloads/rating)
Yeah, no shit you've done way more. Your model outcome is basically guaranteed to be worse than mine (or anyone who did any sort of manual review's), though.
>>
>>102563922
I know but I'd rather take it over nothing. Now please put AO3 back in your training data I beg of you Zucc
>>
>>102563896
It wasn't porn so maybe you're right. It was text, but not porn.
>>102563922
It's not all bad.
t. harry potter fic writer
>>102563893
Add some old serious stuff from gutenberg
ez
>>
>>102563922
>if you think synthetic data is slopped you clearly haven't read anything on ao3. it's all bad.
This. You really have to comb through shit and make sure it's decent to get anything worthwhile. Books and the like are much better sources of fiction.
>>
>>102563969
>Add some old serious stuff from gutenberg
Gutenberg would be the only thing if it were my choice.
>>
>>102563991
Current llms have seen all the _public_ books in existence ten times over
>>
>>102563996
>Gutenberg would be the only thing if it were my choice.
wasn't this tried way back when? I thought the results were poor but can't back that assertion up with any actual facts
>>
>>102563831
>Just hire thousands of jeets to do it over the course of several months?
This happened. And here is a fun thought: what if all the companies are still using porn in their datasets, but it's all rated by pajeets? That would explain all the slop. Then keep in mind how reddit fellates all the new models. Imagine if those are pajeets ecstatic over their models repeating themselves and talking about shivers, while the companies say it's all for safety but have no idea how to improve cooming even if they wanted to, all because they use jeet-rated data.
>>
>>102564022
There's a gutenberg dataset on hf that some people use, but it's like 10 books or so. That's nothing. As for a full gutenberg model, i don't know. Last time i mirrored gutenberg it was like 800gb... i doubt small-timers would ever try that. Dunno about big companies.
>>
>>102564020
They have also seen most copyrighted text.
>>102563831
>>102564062
Jeets can't read
>>102563991
Add some Anais Nin. The most depraved shit I've ever read.
>>
>>102563661
>idiots
It was fun to build.
>>
>>102564073
>There's a gutenberg dataset on hf that some people use, but it's like 10 books or so.
It's funny how true this is for most authors. The amount of data these things need is truly astounding, the complete works of R.L. Stine are like 1-2mb, I think? It's insane.
>>
What's the best most intelligent, creative, soulful model for RP currently?
>>
>>102563261
Please explain. You can often reach the minimum eval loss in one epoch, with additional epochs contributing very little on top of that except overfitting.
>>
>>102564110
midnight miqu still. 9 months later.

if pure intel, largestral.
>>
>>102564110
mistral nemo
>>
>>102564115
If I were to guess there are infinite ways of sucking dick and your dataset is just too small to change all that much.
>>
https://docs.mistral.ai/capabilities/function_calling/
How do I use this?
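The short version of those docs: you describe your functions as JSON schemas, pass them in tools, and the model returns a tool_calls entry that you execute yourself and feed back. A rough sketch; the client class and method names changed between mistralai SDK versions, so treat them as approximate and check the linked page:
[code]
import json
from mistralai.client import MistralClient  # v0-era SDK; newer versions differ

client = MistralClient(api_key="...")

tools = [{
    "type": "function",
    "function": {
        "name": "get_order_status",          # your own function
        "description": "Look up the status of an order by its id",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}]

resp = client.chat(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "What's the status of order T1001?"}],
    tools=tools,
    tool_choice="auto",
)

# If the model chose to call the function, run it yourself and send the result
# back in a follow-up message with role "tool".
call = resp.choices[0].message.tool_calls[0]
args = json.loads(call.function.arguments)
print(call.function.name, args)  # -> get_order_status {'order_id': 'T1001'}
[/code]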
>>
>>102563661
I have 4x 3090s, but I don't know anything about training from scratch. That's what I'm trying to figure out. The goal is a fully uncensored erotica/roleplay model, but it seems the biggest problem would be preparing a good dataset.
>>
>>102564110
Me.
>>
File: lit.png (2 KB, 332x93)
>>102564101
>https://huggingface.co/datasets/jondurbin/gutenberg-dpo-v0.1
Found it. There are a few others as well. It's way more than 10 books, but it's still 10mb. Another one is 14gb. Compared to the ~800gb of a full mirror they're nothing.
>R.L. Stine are like 1-2mb, I think? It's insane.
Sounds about right. I'd still like to see a full gutenberg model.
>>
>>102564101
>The amount of data these things need is truly astounding, the complete works of R.L. Stine are like 1-2mb, I think? It's insane.
Is that why these things write so poorly? The actual literature is tiny
>>
>>102564231
There's a lot of it, but people train on the 10mb dataset, not the 800gb of a full mirror. Which i understand for finetuners, of course... and then it gets mixed up with more normal internet content, which dilutes the good it could find in books.
>>102564210
>>
>>102563661
>we have plenty of idiots with 4 3090s, just train your own mistral losers
with 4 3090 you'll get your model the next century, you need way more than that to get a good and big model fast
>>
>>102564231
>Is that why these things write so poorly? The actual literature is tiny
Well, in a sense. They COULD write significantly better, but given the limitations of the architecture, combined with the extreme (and only worsening) overeagerness to wrap everything up in a nice bow by the end of the generation, you'll never get anything as complex and human as setups and payoffs, clever throwbacks, independently developing plot elements, etc. It's just not built for that, it's built for doing what you ask and doing so in as close to one message as possible. It's also the statistical average of all human writing, so it's pretty much mathematically incapable of surprising you if you've read any amount of literature, unless you crank up the temp a ton.
>>
>>102564231
Garbage on the internet outnumbers actual literature by orders of magnitude. That still isn't enough, so Meta uses Llama to generate trillions of tokens of synthetic reddit to fill the gap.
The issue is, companies want to sell an assistant. Proper literature doesn't have too many examples of Q&A, software troubleshooting, or current knowledge.
>>
>>102564291
Also, it's sort of implied, but absolutely this >>102564258: the normal internet crap fights back HARD against the quality of literature. Any trainer will tell you that it only takes a few dogshit stories or consistent grammatical errors to tank the quality of the model and have that appear constantly. Imagine the damage the entirety of the internet could do. That's why they have the legions of jeets looking at the data; without that kind of manpower to manually oversee it, the outputs would be of fucking horrendous quality.
>>
>>102564176
>I have x4 3090
lol vramlet
>>
>>102564332
Thank you for the useful comment, retardo-kun.
>>
>>102564342
and still you can't train your own model ;)
>>
>>102564384
Thank you for the useful and insightful comment yet again, retardo-kun.
>>
>>102564394
I mean he's not wrong, 4x3090s is nothing, it'd take years, if not decades to train a sizeable model (let's say 50B+)
These companies use literal supercomputers and it still takes months
>>
>>102564434
>50B+
I don't want a general-use slop model.
The idea is a small model focused only on literature, plus a core of general knowledge so it's not retarded.
>>
File: timothydexter.png (237 KB, 564x943)
237 KB
237 KB PNG
>>102564328
I want to see more raw pure token count models. While I keep banging on about wanting to see a full Gutenberg model, I know things like A Pickle for the Knowing Ones are also found there. I know full well it's not going to be perfect. I just want them to be more fun.
>>
>mfw I can't train a SOTA smut model to compete with billion dollar megacorporations using my 4 year old gaming GPUs
>>
Lmao this retard got triggered for some reason. No one said anything about competing with top models, what a retarded waste of oxygen.
>>
>>102564510
>No one said anything about competing with top models
elaborate anon, with your 4x3090 cards, what size would you be aiming for, so that we can laugh a bit
>>
>that retarded waste of oxygen is the triggered one not me
>brb I'm going to take down anthropic by training a SOTA smut model on my GPU that can't run modern games at 4k60 on high settings
>>
i get this is the local model general, but why would you need to make your model locally?
couldn't you just cheaply rent some retardedly powerful rig to make it?
>>
>>102564554
Give me seven reasons why I can't train a perfectly capable smut model on my 2020 Ampere gaming cards. I'll wait.
>>
>>102564554
>couldn't you just cheaply rent some retardedly powerful rig to make it?
where can I rent a couple thousand H100s to create a new decently sized transformers model from scratch?
>>
>>102564554
let me just upload a few hundred gigs of copyrighted material and text smut to a service that has my payment details, that sounds smart
>>
>>102564554
Experimenting can get expensive. Finding good datasets, a good base model to use, good training parameters. That is if you just want to finetune. Full training is a separate thing. llm.c trained a 1.6B model for 600 bucks, I think. Many would prefer to buy a GPU and use a bigger model with that money.
>>
>>102564510
Ignore him; nobody's saying they want to train anything big on hardware they own. Anything that isn't a qlora is obviously out of reach.
>>
>>102564582
Breh nobody gives a shit. OpenAI is blatantly training on YouTube and Google maps data. The training service won't kill their business by ratting out your hobbyist smut training run
>>
>>102564587
for anyone who is retarded (all of you), that pricing doesn't scale. even if it only cost $600 to make a 1.6B (unlikely), the cost scales way worse than linearly. a 7B wouldn't be ((7 / 1.6) * 600) or we'd already have homebrew smut models.
>>
>>102564610
OpenAI has lawyers; a random HF guy doesn't. I'd rather avoid the possibility of being turned into an example by some publisher or what have you
>>
>>102564610
delusional. you are not a big corporation. you do not have an agreement with microsoft. vast and runpod will absolutely cancel your account to avoid dealing with copyright issues themselves
>>
>>102564587
This. Even qloras require absurd amounts of bashing your head against the wall; I wasted 60 dollars before I got a qlora that wasn't lower quality than the 30B base model I was using. The amount of money needed to rent a whole datacenter's clusters for however many weeks each pretraining attempt takes would be astronomical.
>>
File: 600.png (161 KB, 1166x408)
161 KB
161 KB PNG
>>102564618
I know that, anon. That's why I said it gets expensive to experiment.
>even if it only cost $600 to make a 1.6B (unlikely)
https://github.com/karpathy/llm.c/discussions/677
>>
>>102564640
>>102564646
heh.... explain the goosebumps QLORA i trained, then.... checkmate.....
>>
>>102564646
delusional. you are a schizo
>>
>>102564646
And risk a business suicide for nothing. Either you have a delusion of grandeur or I wouldn't put you in charge of anything important
>>
>>102564672
>GPT2
>trained on 30B tokens
That's why. Training anything even remotely comparable to what we have now would be way more expensive. TinyLlama is a 1.1B trained on 3T tokens and it cost them approximately $72k.
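Napkin math with the usual 6*N*D FLOPs rule backs both numbers up. The throughput, MFU, and rental price below are my assumptions, not anyone's quoted figures:

def train_cost_usd(params, tokens, peak_flops=989e12, mfu=0.4, usd_per_gpu_hour=3.0):
    total_flops = 6 * params * tokens            # standard dense-transformer estimate
    gpu_hours = total_flops / (peak_flops * mfu) / 3600
    return gpu_hours * usd_per_gpu_hour

print(train_cost_usd(1.6e9, 30e9))   # ~$600: matches the llm.c GPT-2 repro on H100s
print(train_cost_usd(1.1e9, 3e12))   # ~$42k: TinyLlama's real ~$72k was on slower A100s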
>>
>>102562994
Also true for girls
>>
the 3b at Q8 is really good, but heavily censored. looking forward to the merges/tunes, whatever they're called.
fucking knew they were gonna go the small but powerful route with models going forward. /lmg/'s tiny e-peen compensator PCs are in shambles knowing you'll be running fucking 3Bs and 1Bs next year that are better than your triple-Bs of today kek
>>
Funny how people shit on effort without seeing how expensive it all is. No proper feedback to let them improve in future attempts, just disparaging them instead.

Qloras and vramlet cope methods are good for small models, hence smaller models having more tunes. For big ones? You'd need an entire A100 node or more, hence why there are so few actual tunes at that size.
>>
>>102564767
hi Sao
>>
>>102564744
cope
>>
>>102564721
Just continue pretraining on an existing model then, surely that will work and be cheaper.
>>
>>102564721
I KNOW. My response was to >>102564554, explaining why even renting a training cluster is expensive, even if you somehow get a working model on your first go. What are you even arguing about?
>>
File: cope.jpg (188 KB, 858x677)
188 KB
188 KB JPG
>>102564774
>all he can do, one word, as his entire world crumbles around him
see you on the single digit parameter side
>>
File: file.png (94 KB, 896x466)
94 KB
94 KB PNG
>Long-term, I kinda' wonder if it isn't in llama.cpp's interests to stop supporting the HTTP server altogether, and instead farm that out to other wrapper projects (such as ollama), and we instead focus on enhancing the capabilities of the core API.

Thoughts?
>>
>>102564458
I'm still not sure you can do it in a reasonable amount of time with just a few consumer grade cards
You can run the numbers yourself though
>>
>>102564767
I'm sorry, but if a finetuner can't even be bothered to filter out dogshit that everybody can see on the first page of the dataset viewer, then it's not effort, it's a pure waste of energy and money.
>>
>>102564790
I always thought it was weird they added the server at all.
>>
>>102564790
I think it's still valuable to have llama-server in the codebase. It's overly simplistic for real use, but it serves as an excellent starting point for anyone looking to build their own.
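Case in point, talking to it is about this much code. A minimal sketch against the default port; the prompt and params are just examples:

import requests

r = requests.post("http://127.0.0.1:8080/completion", json={
    "prompt": "Building a llama.cpp frontend is",
    "n_predict": 64,        # cap on generated tokens
    "temperature": 0.8,
})
print(r.json()["content"])  # the generated text comes back in "content"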
>>
>>102564780
Sure, just give me their exact pretraining settings and the latest pretraining checkpoint.
>>
>>102564790
llama.cpp is built as a library for experimenting on top of ggml. I think the server should be simpler and not bother with multiple users and shit like that. It should be, as the directory name implies, an example. It's up to the people that care about specific features to add them on their own. As it is, they provide way too much for grifters. llama.cpp's devs should make grifters' work harder. I use llama-cli almost exclusively, so it wouldn't affect me much.
>>
>>102564790
Ollama won! cuda dev meltie incoming
>>
>>102564744
>prompt that i have a boner to one of my assistants
>she breaks down in tears
>start massaging retarded bimbo princess peach's breasts
>she leans into the touch and continues the convo
eh it's not THAT censored, i think it's just a notch above base 3.1 instruct.
still not gonna stop sloptuners from shivering its timbers though.
>>
>>102564853
You also need their pretraining dataset to avoid catastrophic forgetting
>>
>>102564781
You said someone trained a 1.6B for $600 in the context of a conversation about homebrew smut models. That number didn't make sense, so I clarified that even if it was true (unlikely), it wouldn't scale. Then you posted the source, which seems to explain that the 1.6B was a proof-of-concept meme model reproducing an ancient LLM from 2019. Then I provided numbers that map onto this conversation more accurately by bringing up the training cost for TinyLlama, a similar-sized model that attempted to hold up to modern standards. I'm just trying to keep the parameters of the conversation grounded in reality so we don't have retards asking why there are no $2500 7B smut models being made; there's no need to sperg out.
>>
>>102564790
Never gave a fuck about the server part, it can die if it means they can work on more important stuff
>>
>>102564883
My point is that it's expensive to rent shit, even for a tiny model. Even to experiment with finetuning. You agree that even for a tiny model it's expensive. Yes, it's a meme model. Yes, it's an old model, and it still costs more than most people are willing to pay to experiment. My example was just a point of reference.
There is nothing to argue about. We agree.
>>
>>102564790
nooo that's the thing i use. that means it's a bad idea to stop supporting it.
>>
>>102564947
It's going to die because they're already not working on it
>>
>>102562635
Shut up racist bigot! We localchads go by safety! Safe AI is the only correct AI, it cannot go wrong!
>>
>We agree.
Mostly, but $600 for 1.6B is relatively inexpensive for enough people in this thread. Enough to cause confusion, which is why I thought it was important to clarify that a disappointing 1.1B cost $72k. Nobody is arguing with you. Meds.
>>
kek meta legit released a 3b that's totally solid for RP and /lmg/ hates it, because of course you faggots don't even know how to use a 3b of all things.

*hands you 3 billion watermelons.assistant*
>>
>>102565050
i'm downloading it now, it better be good
>>
for casual use, how much of a difference is 16gb to 24gb?
I can get an RTX 4080 Super (16GB) for slightly more money than a 3090 (24GB), and overall the 4080S is the much better card
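My napkin math so far, in case anyone wants to correct it (assumes ~Q4 weights plus a flat fudge for KV cache and buffers, which is crude):

def fits_in_vram(params_b, vram_gb, bits_per_weight=4.5, overhead_gb=2.5):
    weights_gb = params_b * bits_per_weight / 8   # e.g. 12B at ~Q4 -> ~6.8 GB
    return weights_gb + overhead_gb <= vram_gb

print(fits_in_vram(12, 16))   # True: 12B fits either card easily
print(fits_in_vram(30, 16))   # False: ~30B is where 16GB taps out
print(fits_in_vram(32, 24))   # True, barely: the extra 8GB is a whole model class
print(fits_in_vram(70, 24))   # False: 70B means multiple cards or heavy offload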
>>
>>102565050
I'm using the 70B though
>>
>>102565050
Why would anyone run that when even the poorfags here can run 12B?
>>
What inference settings are other anons using with L3.1 in co-writing/RP scenarios? e.g. temperature, top-p, top-k, typical-p, min-p, repetition penalty, frequency penalty, presence penalty, samplers, etc.?
I'm having trouble getting the balance between coherence and insanity just right for a satisfying flow.
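For reference, this is roughly what I'm sending right now (llama-server /completion field names; the values are just where I've landed so far, hence the question):

settings = {
    "temperature": 0.8,      # lower gets coherent but samey, higher gets unhinged
    "min_p": 0.05,           # prunes the garbage tail relative to the top token
    "top_p": 1.0,            # disabled, letting min_p do the work
    "top_k": 0,              # disabled
    "repeat_penalty": 1.05,  # kept mild; crank it and names/pronouns get dodged
}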
>>
>>102565107
B counts don't correlate directly with quality
>>
File: comfy_06108_.png (3.45 MB, 2048x2048)
3.45 MB
3.45 MB PNG
an amalgamation of migus in their natural habitat
>>
Anyone know if it's possible to rip the vision parameters out of 90B so you can just use it as a standard 70B textgen model? Or are the 70B weights in the 90B REALLY literally the same as 3.1 70B?
>>
>>102565139
I don't see anyone here claiming that the 3B is better or close to 12B. All I have seen so far is that "it's decent" or whatever, and that says nothing.
>>
File: 5de.jpg (93 KB, 874x612)
93 KB
93 KB JPG
>>102565050
>vramlets be like
>>
>>102565149
The vision stack has its own text conditioning, so it's probably pretty inextricably tied in, even if you COULD extract the difference. Also, on that note, if you could shave off the difference between it and 3.1, you know that'd just make it 3.1 again, right? It wouldn't magically keep the new text info.
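If someone wanted to test it anyway, the naive rip would just be key filtering. Pure sketch: the name patterns are guesses, a 90B checkpoint is sharded so you'd loop this over every shard, and per the above you'd most likely just reconstruct 3.1 70B:

from safetensors.torch import load_file, save_file

sd = load_file("model-00001-of-000NN.safetensors")   # repeat for each shard
text_only = {k: v for k, v in sd.items()
             if "vision" not in k and "cross_attn" not in k}  # guessed patterns
save_file(text_only, "text-only-00001.safetensors")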
>>
>https://huggingface.co/Qwen/Qwen2.5-72B-Instruct/discussions/1
Damn, this phil guy is actually based. I think it's worthwhile to have different companies do different things in the space so we have a multitude of good options for different use cases, but he might be right that what we got so far was mostly an artifact of early experimentation, and that in the rush to "compete", companies will all end up following the same trend.
Both the corpos and the corpo bootlickers are disgusting.
>>
>>102564790
least indirect ollama shill
>>
>>102565092 (me)
it's smart, but too much positivity and safety bullshit to use in any rp
deleted
>>
>my popular knowledge test
>I did a vibe check
none of this matters btw. phil is not based, he's a faggot and he needs to stop shilling his link in this thread.
>>
>wake up from a coma
>Llama 3.2 released
>wow 90B! Finally some competition for Largestral
>Turns out it's just 70B with a 20B of vision model strapped on it
I'm done with Meta.
>>
>>102565411
don't forget that the 20b of vision doesn't even seem to be that good
>>
70B text-to-image/text model when?
>>
>llama 3.2
>chameleon was killed for THIS
unfathomably grim
>>
File: for llama.png (492 B, 225x225)
492 B
492 B PNG
>Of course! The image attached to this post appears to be a flat color image of an orange circle. Perhaps an avant garde, minimalist depiction of an orange? How creative!
>>
File: file.png (458 KB, 1660x940)
458 KB
458 KB PNG
>>102565411
>>102565429
yeah it sucks, but fortunately we got Molmo's vision model at the same time, and this shit is really good
https://molmo.allenai.org/blog
>>
>>102565317
>The vision has text encoding
What does that actually mean?

The way some vision models have worked is that they literally just have a separate encoder that translates an image into tokens and inserts those into the context; text doesn't go through that, so that part of the model could be ripped off and the text model would perform quite literally the same. Are you saying that even text goes through the vision encoder on Llama?
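(That separate-encoder layout as a toy sketch, since people keep mixing the designs up. All sizes are made up, and this is NOT Llama 3.2's design, which interleaves cross-attention layers into the text stack instead:)

import torch
import torch.nn as nn

class ToyVLM(nn.Module):
    def __init__(self, d=64, vocab=1000, patch_dim=3 * 16 * 16):
        super().__init__()
        self.vision = nn.Linear(patch_dim, d)  # stand-in for a ViT patch encoder
        self.proj = nn.Linear(d, d)            # adapter into the LLM's embed space
        self.embed = nn.Embedding(vocab, d)
        layer = nn.TransformerEncoderLayer(d, nhead=4, batch_first=True)
        self.llm = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, patches, text_ids):
        img_tok = self.proj(self.vision(patches))   # image -> pseudo-tokens
        txt_tok = self.embed(text_ids)
        seq = torch.cat([img_tok, txt_tok], dim=1)  # image tokens just sit in context
        return self.llm(seq)                        # a text-only path would skip
                                                    # self.vision / self.proj entirely

m = ToyVLM()
out = m(torch.randn(1, 4, 3 * 16 * 16), torch.randint(0, 1000, (1, 7)))
print(out.shape)  # (1, 11, 64): 4 image tokens + 7 text tokens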
>>
File: file.png (18 KB, 604x267)
18 KB
18 KB PNG
>>102557890
>the 4 grand 5090
>32GB
On cue.
>>
>>102565405
So far people seem to be in agreement that Qwen does indeed have bad pop culture knowledge and is also bad at RP. No one has posted proof otherwise, i.e. that it's good at trivia or good at RP.
>>
>>102565411
>Meta tells people months in advance that they will release a future version with multimodal adapters
>somehow people expect 4o or something
Lmao.
>>
>>102565565
hi Yann
>>
File: 1727289443540662.png (505 KB, 2180x987)
505 KB
505 KB PNG
>>102565481
You should post this diagram instead if you don't want to be labeled as a shill. Llama 3.2's multimodal isn't good but that diagram is quite literally misinformation.
>>
>>102565605
hi Arthur
>>
>muh misinformation
just lie
>>
>>102565636
True. The fake log poster was quite a funny incident.
>>
>>102565050
I have 24GB. I already felt horrible trying a 7B. I am not going lower.
>>
File: file.png (867 KB, 768x768)
867 KB
867 KB PNG
>>
>>102565541
If this is true it's gonna be one of the biggest letdowns ever.
>>
File: file.png (10 KB, 599x270)
10 KB
10 KB PNG
>>102565681
especially if you look at the 5080
>>
>>102565541
>4 grand 5090
What the fuck? Are they trying to converge the prices of the highest-end consumer GPUs and the server GPUs so they never have to improve capacity/$ on consumer cards, since that would inevitably make them a better value proposition for corpos?

Fuck me, man. I blame the guys who made the supercomputer out of PS3s.
>>
>>102565480
>Wow, is it really that bad?
>go and try it out on lmsys
>it's fine, it even got the thin white border
???
>>
>>102565757
you weren't supposed to try this yourself
>>
>>102565757
>3.2 11b can't even see the orange
owari da
>>
>>102565690
>16G
LOL
>>
>>102565757
Maybe you need to go back to pretraining until you can identify jokes, anonie.
>>
>>102565796
that'll be $1100, paypig. start saving for the $1600 24GB refresh in a year.
>>
>>102565822
>>102565822
>>102565822
>>
>>102565810
It seems you may need some pretraining as well.
>>
>>102565757
3.2 90B can't do nsfw as well as 1.5 Pro or 3.5 Sonnet
>>
heck, sfw too
>>
>>102562994
>LeCun
>That fucking tweet
Incredible, absolutely incredible! Dude gonna get cancelled left right and center lmao


