/g/ - Technology




File: cyborgku.png (1.54 MB, 1024x1024)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102535977 & >>102524339

►News
>(09/24) Llama-3.1-70B-instruct distilled to 51B: https://hf.co/nvidia/Llama-3_1-Nemotron-51B-Instruct
>(09/18) Qwen 2.5 released, trained on 18 trillion token dataset: https://qwenlm.github.io/blog/qwen2.5/
>(09/18) Llama 8B quantized to b1.58 through finetuning: https://hf.co/blog/1_58_llm_extreme_quantization
>(09/17) Mistral releases new 22B with 128k context and function calling: https://mistral.ai/news/september-24-release/
>(09/12) DataGemma with DataCommons retrieval: https://blog.google/technology/ai/google-datagemma-ai-llm

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: threadrecap.png (1.48 MB, 1536x1536)
►Recent Highlights from the Previous Thread: >>102535977

--Paper: Improving code generation in large language models:
>102541159 >102541640
--Papers:
>102542180 >102542337 >102542532
--Experimenting with high dropout rates and fine-tuning:
>102542430 >102542447 >102542469 >102542583
--Finetune data requirements discussion:
>102536472 >102536511 >102536539
--Enabling Model scopes for Vector Storage did not regenerate vectors:
>102537930
--Discussion on the potential and performance of distilling 70B-instruct to 51B:
>102536804 >102536822 >102536823 >102536885 >102537261
--BUERgence script for optimizing llama.cpp inference parameters:
>102542933
--Approaching 16k context limit and potential solutions:
>102542492 >102542636 >102542733 >102542746 >102542830 >102542851 >102542865 >102543206 >102543451 >102543037 >102543295 >102543426 >102544540 >102544569 >102544739
--Advanced Voice rolling out to ChatGPT Plus and Team users:
>102537776
--45 minute daily limit on advanced voice features, including silence:
>102541866 >102543472 >102543491 >102543519 >102543677 >102543759
--Miku (free space):
>102535995 >102536036 >102536082 >102536179 >102536277 >102537490 >102538991 >102539843 >102540239 >102540324 >102540465 >102543667 >102544553 >102544704

►Recent Highlight Posts from the Previous Thread: >>102535999
>>
Do people here use agents why/why not?
>>
>>102545041
"agents" aren't a coherent concept and everybody who talks about them sounds like one of the crypto grifters who pivoted to ai
>>
>>102544853
most useless post itt
>>
>>102544853
seriously, where's a script to fix these backlinks?
>>
>>102545041
Generation and especially prompt processing are too slow for building agents, while smaller models require fine-tuning to function reliably. There are only a few tasks where time isn't a factor.
>>
>>102545060
I understand "agents" as any indirect use of LLM output, or more specifically, instances where output results from additional automated prompts.
>>
>>102545116
Errors snowball if you use one LLM's output as input for another LLM.
>>
>>102545060
I mostly meant stuff like this
https://github.com/Agent-Tools/awesome-autonomous-web
>>
File: 1727226087946704.png (489 KB, 512x768)
>>102545137
Just like any LLM response, since each following token depends on the preceding ones. Agents offer a way to enhance answers by focusing on specific tasks. For instance, you could employ an extra prompt to track a character's clothes or outline a broad narrative plan beforehand. It outperforms a single prompt, but waiting for the subsequent prompts to finish before the final answer even starts streaming is not worth it.
>>
>>102545307
You can do the agent shit while the human is writing out their response. Assuming that order is acceptable.
Human response
AI response
Agent1 gen: {{char}} clothing state
Agent2 gen: {{user}} clothing state

(Eventually)
Human response (suffix the state info above before the AI response, ideally wrapped into a fancy prompt)
(Optional: do judge rounds on the above answers to improve quality, if the human is taking his time responding)
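A minimal sketch of that flow in python, assuming an OpenAI-compatible local endpoint (llama-server style; the URL, field names and prompts here are made up, adjust to your stack):

import threading
import requests

API = "http://127.0.0.1:8080/v1/chat/completions"  # assumed llama-server style endpoint

def gen(prompt):
    # one blocking completion against the local server
    r = requests.post(API, json={
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    })
    return r.json()["choices"][0]["message"]["content"]

def track_state(history, state):
    # agent gens run in the background while the human is still typing
    state["char_clothes"] = gen("From this chat, state {{char}}'s current clothing:\n" + history)
    state["user_clothes"] = gen("From this chat, state {{user}}'s current clothing:\n" + history)

history = "anon: hey\nchar: *waves* hi anon"
state = {}
worker = threading.Thread(target=track_state, args=(history, state))
worker.start()                # kick off the agent gens
human_reply = input("> ")     # human takes their time here
worker.join()                 # state info is ready by the time they hit enter
print(gen(f"[State: {state}]\n{history}\nanon: {human_reply}\nchar:"))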
>>
File: 1415322611803.png (5 KB, 208x208)
I love LLMs so much, bros. When I tried AID2 back then, I had a clear vision of where it would go for entertainment, and I feel like I'm living in that vision now that I've finally achieved 70 beak models locally, specifically Q4 or higher. The ability to follow rules leads to so many opportunities - text adventure, games featuring {{roll:d20}} against {{roll:d20}}, roleplaying with wild traits mentioned as just minor notes, and so much more. And things I hadn't even considered, like an offline personal wikipedia or browser search for information, recipes, questions, or a writing assistant for professional papers. The only front I feel it's still lacking in is fanfiction. It's certainly more coherent trying to simulate established characters, but 70B or modern finetuning is still too rough to excel at it.

This tech has been a novelty for me for years, but with that last hardware upgrade it became integral to my life. Local is definitely the future, and that future feels bright. I can't wait to see what the next jump 5 years from now is like.
>>
>>102545367
buy a muffin because you deserve one
>>
>>102545307
Maybe you could get away with a small dumb tinyllama for that
>>
File: 16lines.png (120 KB, 685x374)
i had an idea tonight about using ai training to store data. the idea is simple: create small images of hex text files, then square-crop the image and use it to train ai. the gpt can already convert hex values but that's not what i want.

i guess the desire is for the hex charts to be stored as training data. then recalling them with a GAN would make a new text file? this could be used for raw data eventually? ai generated bitmaps?

pic related, the concept would crop off the text and just have a square of hex.
>>
>>102545367
>The only front I feel it's still lacking in is fanfiction. It's certainly more coherent trying to simulate established characters, but 70B or modern finetuning is still too rough to excel at it.
You should try to use a lorebook for world-building and these fine details.
>I can't wait to see what the next jump 5 years from now is like.
I don't.
>>
>>102545442
>using ai training to store data
Retarded idea.
>>
>>102544853
where is the recap image?
>>
>>102545503
It didn't pass our safety checks
>>
>>102545440
You may achieve good results with proper task-specific fine-tuning.
>>
>>102545486
>nooo don't use a compression algorithm for compression
Calm down silly anon.
>>
>>102545442
>GAN
you can probably get away with a simple deconvolutional NN with the image ID encoded as a vector as its input
>>
File: hughjackman.png (244 KB, 422x506)
So is Mistral Large 2 still the go-to model for around 100B? Is 4 bpw enough or is 5 bpw necessary for it not to shit itself?
>>
>>102545307
>subsequent prompts to finish before final answer even start streaming is not worth it.
I was thinking about it a few threads ago but never said it out loud? Does batching need more vram? Cause you could generate initial draft and then do agent stuff in parallel? Originally my idea was not related to agents but just running multiple gens for the same prompt with different seeds. Not worth it for 7B etc but if you are running a 70B having it write 5 different drafts at the same time with same speed would be nice?
>>
File: recap-102544848.jpg (2.67 MB, 1253x7330)
>>102545078
https://rentry.org/lmg-recap-script
>>102545503
I figured I would stop with the images now that we have the script.
>>
File: 16lines_crop.png (81 KB, 420x323)
>>102545538
yes this is how the thought had started, could ai train'd data take up less space @ scale (thousands of hex images), since it's redundant and based on calculus weights, gradient descent etc.

>>102545592
i'll keep that in mind if i can try it, but it's vaporware imo since i don't know how to script any models except for easy-diffusion, which is already trained on ten gigs of images.

>>102545486
say it was trained on the kjv bible in hex blocks, then ask for an output image of what was learned, then run an ocr to get the hex values back, and decode to text.. what would u get?
> in heaven there is no beer, that's why we drink it hear !
who knows what might come out, assuming the hex couplets were lost into a mass of scribbles. we take OCR training for granted after all...
>>
>>102545694
>https://rentry.org/lmg-recap-script
thanks
now how do I make them work with 4chanx previews
>>
>>102545735
>now how do I make them work with 4chanx previews
4chanx previews work with the user script.
>>
>>102545760
they definitely don't
I'm also not getting any (cross-thread), (you), or (dead)
>>
File: 20240925_181923.png (106 KB, 1344x938)
>let's have the AI think for 3 months and cure cancer bro
>>
all models below 70B are shit. I wanted to create an RPG game that uses an LLM, but most local models below 70B are not suitable: they often don't follow instructions or start to get lost. Few people can run 70B locally on a computer, and APIs such as openrouter/groq are expensive or very limited for a simple game. So games using LLMs are still a thing of the future. Unless BitNet models (or another revolution) start appearing, such games will not be common, short of special 8B models trained specifically for use in RPG games.
>>
>>102545841
You people said this exact same thing a year ago when the best we had was llama2 70b which is worse than llama3-8b.
Admit it, you only care about justifying the money wasted on gpus.
>>
>>102545878
>You people said this exact same thing a year ago when the best we had was llama2 70b which is worse than llama3-8b.
Google Gemini is currently the best 8B on the market and it beats all old 70Bs, so yes, small models improve rapidly, but so do big models.

>Admit it, you only care about justifying the money wasted on gpus.
No, we don't. It's about getting accustomed to the best thing. Models just feel dumber when you go down. Once you taste the best, you don't want to use anything less.
>>
Ok, L3.1-70B-Hanami-x1 is the first good 3.1 tune imo. Follows instructions better than anything I've ever used before and writes extremely well.
>>
>>102545878
Oy vey :O
>>
>>102546005
Hi, Hanami.
>>
>>102546005
good morning saao
>>
>>102545442
I don't get it. You want to train on the image of the hex? The raw text would be just 256 bytes (plus some EOT token or something), while the image would take at least 12288 bytes assuming a 6x8 character font. Even at 1 bit per pixel you still have 1536 bytes to feed to the thing. You're increasing the training pressure by 6x at least.
If you want to compress data there's already llama-zip and there's a similar thing as a PR on llama.cpp already. A classical compression algo would be faster and more reliable.
If you want to add noise to the generation to get something like
> in heaven there is no beer, that's why we drink it hear !
You use token masking during training. You can even train to include typos.
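Sanity-checking the numbers in python, assuming one 6x8 glyph rendered per stored byte:

raw = 256                  # bytes of raw text
pixels = raw * 6 * 8       # 12288 pixels to render them as glyphs
print(pixels)              # 12288 bytes at 8bpp grayscale
print(pixels // 8)         # 1536 bytes even at 1 bit per pixel
print(pixels // 8 // raw)  # 6x blowup before training even starts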
>>
>>102545878
>llama2 70b which is worse than llama3-8b
Would you prefer l2 or l3 to suck your dick?
>>
>download mistral nemo 12b
>chats no longer devolving to "YES MARK ME AS YOUR ____"
>get into a nice chat with ai, just a normal interaction
>things getting kissy
>"by the way, [user], before we do anything i have to tell you something
>"im from a different dimension and blah blah"
where the fug did that come from? card mentions nothing about secret past or hidden origins and nothing in the chat was magical/otherworldly/interdimensional. i tried regenerate and it got weirder, one time she said she was a demon kicked out of hell, another try she said she could "travel through time and manipulate energy" lel

or is that crossing the context limit (16k)
>>
>>102545841
>APIs such as openrouter/groq are expensive
70b is like $.40/Mtok
most of the bigger models are more expensive but 70b is dirt cheap relatively
>>
>>102546781
Honestly anon
I think you're just cursed
>>
>>102546781
maybe a temperature thing
i think recommended temp on non-merge nemo is something ridiculously low like 0.3
>>
>>102546809
i'm starting to think so as well
>>102546814
ah i'll have to try that
>>
>>102546781
Have you posted your settings yet?
Because that sounds like a case of bad settings + some weird shit somewhere in the context.
>>
>>102546781
nemo instruct is very sensitive to things like whitespaces in the instruct template, I switch between two templates depending on whether I want creativity or accurate instruction following
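for reference, the difference is literally just whitespace, something like this (from memory, check your tokenizer config):

[INST] {prompt} [/INST] {response}

vs

[INST]{prompt}[/INST]{response}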
>>
>>102545728
>could ai train'd data take up less space
This is as retarded as people thinking PiFS is viable. Spoiler alert: it's not. The offsets will, on average, take more space than the shit you're trying to "store" in it.
Same with this, how the fuck are you going to store data in a NN without any losses? If you think for at least 1 second you'll realize that the absolute best you can do is a 1:1 space efficiency, and that's after training and inference.
>>
>>102545783
>we have 10 datapoints
>now we can make a parabola with it that will model the accuracy!
>>
>>102545783
>Only two datapoints in the first thousand tokens of the test
This shit sucks.
>>
File: file.png (124 KB, 1051x530)
>>102545777
It works for me using 4chan-x and GreaseMonkey. Are you using xt or one of the other monkeys?
>>
>chat completion
utterly censored
>text completion
utterly uncensored
why?
>>
>>102546891
Probably because you aren't using the exact instruct format that triggers that behavior when using text completion.
>>
>>102546891
Because raw smut is easy to find and train with, even accidentally. For a big dataset collected from the web, there will be some. For chat completion they typically generate the training set, so they'd need to generate smut requests and responses and nobody's gonna want to do that job. Not at any big company. It's easier to just say "no, i don't do that" and leave it at that.
I assume you mean the source models, not the finetunes and merges.
>>
>>102546888
yeah, tampermonkey, guess I'll die
>>
>>102546956
Qwen2.5 32b instruct, but I think it's because of the default instruction template, I'll see what happens by tweaking it
>>
>>102545841
I'm also experimenting with llm npcs and so far it's a disaster. It is most likely due to the lack of situational awareness of the npc.
Let's say you use an llm to act as the guard of a gate that is closed, and there is no way the player can get inside. The player can prompt the AI with things like 'i climb the fence and go inside' and the ai will respond as if the player really did that, when in reality the player is standing outside the gate. This is completely immersion breaking, and the restrictions and workarounds i've tried haven't worked so far. Maybe with an ai that can understand pictures you will be able to send the ai a screencap of the situation plus a description plus the user prompt, thus creating a proper answer.
>>
>>102547036
I think that, for that kind of interaction, you sort of need a multiprompt solution.
Something like prompting the model for the current context the player is in, whether what the player wants to do is possible, what the possible consequences are, etc.
Iterate on the player's prompt before outputting a response, basically.
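Rough sketch of that two-step loop, assuming an OpenAI-compatible local endpoint (endpoint and prompts are made up, just to show the shape):

import requests

API = "http://127.0.0.1:8080/v1/chat/completions"  # assumed llama-server style endpoint

def gen(prompt):
    r = requests.post(API, json={
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 200,
    })
    return r.json()["choices"][0]["message"]["content"]

def npc_turn(game_state, player_input):
    # step 1: rules check against hard game state, before any roleplay happens
    verdict = gen(
        f"Game state: {game_state}\n"
        f"Player attempts: {player_input}\n"
        "Is this action physically possible? Answer POSSIBLE or IMPOSSIBLE, then explain."
    )
    # step 2: in-character reply with the verdict injected into the prompt
    return gen(
        f"You are a gate guard. Game state: {game_state}\n"
        f"Rules check: {verdict}\n"
        f"Player: {player_input}\n"
        "Reply in character. If the check said IMPOSSIBLE, the action did not happen."
    )

print(npc_turn("gate closed, fence unclimbable", "i climb the fence and go inside"))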
>>
>>102546979
If it was trained to reject smut, and i understand it was, chances are that tokens exclusively found in the chat template will steer it towards safe outputs. You may get better results, but those tokens have been burned into the model. There's no context where the chat template tokens and smut overlap. It's like having any mention of 'assistant' in the system prompt in llama models.
Best of luck, though.
>>
smedrins
>>
>>102547079
That's an old word, Anon.
>>
>>102547077
>tokens exclusively found in the chat template will steer it towards safe outputs
I was somewhat suspecting that was a thing, that the chat template would be utterly different from the regular text completion, oh well =(
>>
>>102547055
I've thought about this too: embed the player prompt with a complete situation description and whether the actions the player is describing are legal or not. Would require various iterations.
Is qwen2.5 image understanding any good?
Also all of this should be done by a model of no more than 11b
>>
>>102547186
Ideally, you'd have a whole game built around the llm instead of trying to have the ai run the whole simulation.
So inventory management, calculations, etc, would be done by a classical system, then the llm would be fed that information, alongside the relevant history of its previous outputs and the user's messages (summarized RAG or something of the sort), then it would evaluate the user's input before producing the final output (iteration).
I wonder if the better approach would be to have a fast small llm that does more iterations or a bigger, slower, but more capable and "stable" llm with fewer iterations.
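To illustrate the engine/llm split (tiny sketch, gen() being the same kind of requests helper anons posted above; the engine owns all ground truth, the llm only narrates):

game = {"location": "gate_exterior", "gate_open": False, "inventory": ["rope"]}
history = []

def narrate(player_input):
    # engine state is serialized and injected as non-negotiable facts
    facts = "; ".join(f"{k}={v}" for k, v in game.items())
    out = gen(
        f"Hard facts (do not contradict): {facts}\n"
        f"Recent history: {history[-5:]}\n"
        f"Player: {player_input}\nNarrate the outcome."
    )
    history.append((player_input, out))
    return out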
>>
>>102547133
Nothing stops you from trying, but i wouldn't expect a night and day difference. It may work at the start, but as the context fills with 'safe' tokens, the output will drift with them.
Text completion has no template, and you can use an instruct/chat model as a completion model. The problem is that anything that steers it a bit towards the 'assistant mode' will cascade and start rejecting things.
I remember, back when i just started using llms, using instruct models as completion with few-shot examples.
Some setup.

char1: dialog...
anon: dialog...
char1: dialog...
anon:

Set "\nchar1:" as the input suffix string, set "\nanon:" as the reverse prompt and that's pretty much it. You still get to dialog with the thing but, hopefully, avoid all the steering tokens. You can add new characters on the fly, you could even play as more than one by adding more reverse prompts.
I don't know if it worked well because of that token avoidance or because i was using a model that didn't care about those things anyway. Never had an out of character rejection (good guy would still refuse to kill another person, while a bad one wouldn't).
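With llama.cpp's llama-cli that setup maps to something like this (flags from memory, double-check --help):

llama-cli -m model.gguf -i -f fewshot.txt --in-suffix "\nchar1:" -r "\nanon:"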
>>
File: file.png (36 KB, 666x370)
>>102546961
Try this. Go to the settings, and set it to Run at: document-body. I installed TamperMonkey and that got the previews to load for me.
>>
https://molmo.allenai.org/blog
>>
>>102547325
yeah that works, it was either document-body or a position above 4chanx
>>
>>102547425
Buy an ad
>>
>>102547425
>MolmoE-1B is a multimodal Mixture-of-Experts LLM with 1.5B active and 7.2B total parameters based on OLMoE-1B-7B-0924.
Interesting.
>>
>>102547425
looks cool
i'll play with the 7b when a quant drops
>>
File: zs.png (43 KB, 411x184)
>>102547425
based
>>
>>102547242
i would take the approach of having a small model(2b-3b) doing the understanding of the situation before feeding it to a bigger model(7b-11b).
>>
>>102544853
this is just spam now
>>
File: VHDO2 script.png (43 KB, 790x786)
More whacky results from experimenting with super high dropout training.
>>
File: sorry.png (35 KB, 394x629)
>>102547870
AGI achieved.
>>
so what do I use for the importance matrix,
just the dataset used for training?
or
could I use a series of questions that are likely to trigger the parts of the model that need to be preserved? would just the questions, or questions and partial answers, work?
>>
>>102547425
Is their demo running the 7B or something? The results I'm getting from it are far below what I'd expect from a 70b model.
>>
File: molmo7BD.png (49 KB, 896x140)
>>102548005
>Is their demo running the 7B or something?
Yes. Molmo 7B-D
>>
>>102545841
>But most local models below 70B are not suitable, often do not follow the instructions, or start to get lost
There are improvements on this front as you get more beaks, to the point where somewhere between deepseek and 405b they are smart enough to be a competent dungeonmaster for you with the right system prompt
>such games with LLM will not be common
NVidia is gatekeeping hard right now, but eventually consumer hardware that can run quadrillion parameter models will be cheap and common.
I have seen the future, and it is fun
>>
>>102548005
yes
>Select model weights, inference code, and a public demo (using Molmo-7B-D model) are available starting today.
I don't know why they wouldn't demo with the big one. guess costs, but it makes a bad impression especially when they don't make it very clear what model is running in the demo
>>
>>102547425
Personally I don't see a use case for multimodal models. As long as my eyes and ears are still working. And multimodal models are much more retarded than text-only.
>>
>>102547955
Read
>https://github.com/ggerganov/llama.cpp/discussions/5263
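The mechanics boil down to two commands, something like (from memory, check the imatrix/quantize READMEs):

llama-imatrix -m model-f16.gguf -f calibration.txt -o model.imatrix
llama-quantize --imatrix model.imatrix model-f16.gguf model-q4_k_m.gguf Q4_K_M

What belongs in calibration.txt is exactly what that discussion argues about.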
>>
>>102548038
>but eventually consumer hardware that can run quadrillion parameter models will be cheap and common
not with semi conductors
>>
>>102548066
>not with semi conductors
whelp, back to analogue computers then
>>
>>102547955
>>102548057
Shit, I meant to also link
>https://github.com/ggerganov/llama.cpp/files/14194570/groups_merged.txt
Which is part of that discussion.
There's also some cool information regarding the process itself.

>>102548086
There's some research on analogue computers built with light pulses that mimic the behavior of neural nets. It's super interesting.
>>
>>102548030
>>102548045
Okay, good to know. It's definitely dumb to showcase the 7B like this without making it very clear in the demo. I was more than ready to disregard the entire model family as trash.
>>
>>102546867
You could fit a parabola to 8 data points and get perfectly usable results.
All you would have to do is correctly estimate the input uncertainties and calculate chi2/NDF (and also check that you can use this metric).
But the extrapolation of these results would be extremely sketchy no matte what because the uncertainty per data point is obviously very large vs. the change in "true values".
>>
>>102548057
what's pseudo-random synthetic data mean In the context of that conversation?
is it the text equivalent of regularization images?
>>
>>102548086
there's quantum computing
however, due to the nature of the wave function, there's a chance that it may never be technologically feasible enough to be useful.
>>
>>102548144
>what's pseudo-random synthetic data mean In the context of that conversation?
I think it's data generated by another model (or the same model I don't know) that's randomly sampled to create a dataset.
I think kalo's general calibration data is randomly sampled fragments from The Pile.
>>
File: 1698516170440216.png (130 KB, 862x640)
>>102547425
>Countbench 91.2
Damn, they absolutely maxx'd on this one
>>
>>102545680
5bpw is where the loss of intelligence becomes hard to spot, but 4bpw is still good.
>>
>>102547425
How do you run this? Their trailers imply that there'll be some fancy front end that shows how it counts people and stuff but there's nothing about a front end on their roadmap.
Local models are very held back by the fact that all our front ends suck dick, especially when it comes to multi-modal shit.
>>
>>102548191
but what does randomly sampled here mean?
just samples from a larger dataset?
it's response to noise/incoherent text?
>>
File: file.png (46 KB, 889x853)
>>102547900
soulful
>>
Have you guys seen this? Maybe it'll be local, that would be cool
https://menyifang.github.io/projects/MIMO/index.html
>>
>>102548365
Nice, finally something for the vtubers
>>
To the fine tuners in the thread (sao, drummer, whatever) I propose an experiment.
>get the recipe for a nemo 12b fine tune (Ie. Lyrav4)
>prune mistral-small using https://github.com/arcee-ai/PruneMe down to 12B~ish parameters
>fine tune the resulting model with the exact same recipe and data as the nemo fine tune
Maybe add a "healing fine tune" using the model's own data before the last step.
I'm curious to know how the final models would compare.

>>102548324
>but what does randomly sampled here mean?
>just samples from a larger dataset?
If I'm reading the exchange correctly, yeah.
As jukofyork mentioned in his proposal:
>It's likely the use of random and semi-random data mentioned in this thread is acting as a "quick and dirty" form of regularisation anyway:
>https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/bishop-tikhonov-nc-95.pdf
>and IMO, it would probably be better to consider doing it in a more principled way - especially considering the calibration dataset is so small and the imatrix computation isn't using the full context nor the correct prompt format, etc.
>>
>>102548086
You'll have to train every instance of the model from scratch with analogue (just like nature does it.)
Big tech wants quickly reproducible slaves, and nvidia is all in on semis, so we're stuck with emulation and toy models.
>>
File: dropoutrate.png (51 KB, 711x272)
>>102547870
For LLMs and LoRA finetuning, the optimal dropout rate is in the 0.55-0.60 range. https://arxiv.org/pdf/2404.09610

With higher values you get increasingly higher and more frequent grad norm spikes / noise that disrupts the training, as well as reduced network capacity for learning (at 90% dropout, every weight update is only updating 10% of the model/LoRA's weights).
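If you want to try it with peft, it's one knob in the adapter config. Minimal sketch (target modules are model-dependent, these are just common defaults):

from peft import LoraConfig

config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.6,  # the paper's sweet spot, vs the usual 0.05-0.1
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)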
>>
>>102548570
most people use between 0.05 and 0.1
I'm surprised nobody is applying any of this high dropout rate research to any finetuning. It definitely integrates the new behaviors a lot better albeit my experimental results take it to the extreme.
I might try tuning a base model with an RP dataset using high dropout methods in the near future though. Obviously using it on instruct models is kind of hit and miss since there's already an existing behavior that permeates the entire model.
>>
>>102548715
I'd take it with a grain of salt. The paper is from the alpaca dataset days. That dataset was extremely repetitive with some questions like, "sort this array" appearing hundreds of times. High dropout would help there, but might hurt for more complex datasets.
>>
>>102547425
>Broadly speaking, the academic benchmark results and human evaluation strongly agree, with the exception of Qwen VL2 which performs strongly on the academic benchmarks and comparatively under performs in the human evaluation.
oh no no qwenbros...
>>
File: PerpetuallyHappyMiku.png (1.34 MB, 800x1248)
good morning /lmg/!
>>
>>102548761
I tune at home so failed experiments only cost me a dollar worth of electricity, so I'm not too worried. But I thank you for the handy graph. Next time I'll try 0.6 (I've done 0.75 and 0.8 so far, both using raw text datasets) But as always I'll post the results here when I do my next experiment if they are interesting at least.
Probably going to do a qwen2.5-7B RP model next.
>>
>>102547809
It is a spam that is avoiding the anti spam filter.
>>
>>102548030
>compares favorably to gpt 4, claude 3.5
>7B
>actually not even a 7B
??? what?
>>
>Imagine you're trying to find the minimum of a hilly landscape. A high β1 (strong momentum) would make you "roll" down the hill quickly, but you might overshoot the minimum. A lower β1 would make you "walk" more carefully, potentially finding the minimum with greater accuracy.
guess the model
>>
>>102545694
>https://rentry.org/lmg-recap-script
inline this link in the recap itself in the future?
>>
File: its8exclamationmark.png (40 KB, 1261x172)
>>102548960
That's just how they named it. I assume it's to put it in a familiar range that has existed for a while.
Unless you mean something else. Complete sentences make everything clearer.
>>
File: 1723844433490791.png (61 KB, 1181x185)
>>102548960
>>102549019
god i hope this kind of naming faggotry doesn't become commonplace in the ai industry
>>
>>102549069
misleading marketing and retarded naming schemes are already the norm in the ai industry
>>
>>102549094
>>102549069
>>102549019
Normal in most industries.
>>
>>102549069
Molmo 1B (1 bazzillion nibbles) looks pretty good.
But it's still more telling than -small or -large. 1B params at that scale doesn't make that much of a difference.

>>102549105
I'm not complaining. I just clarified a vague comment made by anon.
>>
A year later, llama.cpp might add a jinja parser.
https://github.com/ggerganov/llama.cpp/pull/9639
Maybe, llama.cpp will be usable by itself in 2025.
>>
>>102549141
Bloat. The template format could be specified clearly for every model and we wouldn't need anything other than --in-preffix and --in-suffix.
>>
Bros... Molmo-7B-D seems really really good for captioning. I integrated it into my captioning scripts and have been testing it on my dataset of women peeing (don't ask...). It seems to perform roughly on par with InternVL-40B. Just as uncensored, will describe nudity, visibility of breasts and genitals. Slightly worse than InternVL at accurately describing the pose, and some details of the clothing. But much more accurate and consistent at describing that the woman is peeing (so many VLMs just completely cannot "see" that aspect of these images). Will try to get the 72B running locally soon and see how much better it is. I generally consider InternVL-40B to be the best local captioning model, and even the 7B of molmo might be the new king.
>>
>>102549204
Psyoping us with really bad fake shills so the "buy an ad" posting seems more warranted isn't gonna work.
>>
For anyone that's interested in watching it
https://www.youtube.com/watch?v=j_IVy25y6V0
>>
>>102549204
Fuck the model, where can I get a dataset like this?
>>
>>102549246
>blablabla safety blablabla responsibility and red teaming blablabla it's really safe you goys— guys. I meant guys.
>>
Is there an easy way to get documentation into a model's context without rewriting it yourself...? I've got some docs for DDLC's internal variables in Ren'Py, and I'd really like to be able to give it to a model so that it can format my writing for me, but the documentation is 26 pages long with pictures and formatting for readability (plus documentation for custom content), so I can't just dump it in.
>>
>>102549204
Thanks, I'm coping.
>>
Is there any decent local TTS model? I need waifu voices and read that tortoise is buggy af so not gonna download it.
>>
>>102549280
Nope, still 11 labs or bust, sadly.
>>
>>102549268
Real.
Looks like they're having difficulties getting started though, kek.
>>
>>102549296
Fuq
And lemme guess, they are strict with what you can make tts say?
>>
File: wat.png (7 KB, 121x39)
That doesn't look right.
>>
>>102549255
I sourced most of it from imagefap. A few people upload curated galleries where the images have already been color corrected, AI-upscaled, etc.
>>
>>102549319
What do you mean?
>>
File: 43991.png (90 KB, 615x391)
Its kinda funny how models ramble about stuff being bad/illegal kek
>>
>>102549246
meta chuds get in here
wtf is zuck wearing, does his shirt say zuck on it
>>
>>102549204
I wonder if you took one of those huge clip models like that one with 18 billion parameters and trained it on a set of perfectly balanced , perfectly captioned uncensored images would there be a need for these vlm models?
>>
>>102549376
>aut zuck aut nihil
either zuck or nothing
>>
>>102549280
fish speech and xtts2 are just ok. i would recommend fish more.
>>
File: mult.png (341 KB, 1014x880)
Its here
>Its here
Its here
>Its here
Its here
>Its here
Its here
>Its here
Its here
>Its here
>>
>>102549417
>gimped model sizes
bruh
>>
>>102549417
>11B
Interesting.
>>
>>102549417
what would 1b even be useful for?
>>
>>102549427
>vramle t.
>>
>>102549417
>even more safe
nice
>>
File: 32.png (53 KB, 2362x2200)
>>102549246
Dafuq
>>
>>102549417
Wait, what?
>>
>>102549429
Quick OCR or quick captioning of thousands of images given a set of tags
>>
Anyone here who speaks more than English? I want to know what LLMs are the best if I want to write and receive responses in other languages, mainly in:
>Korean
>Hungarian
>Polish
>Swedish
>Japanese

Ability to mix languages together (e.g. to make a gaijin character who uses Korean, but throws in a Japanese word here and there) would be nice, but not necessary.
>>
>>102549435
why are they comparing 90B to mini and haiku... does not look good
>>
Wake me up when Llama3.3 comes out.
>>
>>102549438
I quickly checked xitter, apparently it's real but not actually released yet. Meta accidentally made the page live then deleted it.
>>
>>102549452
>mini and haiku
How big are they? From the horses mouth or some speculation from a random?
>>
File: 1678725806186394.jpg (8 KB, 226x223)
>meta puts multimodal llama in their AI ray ban
>blind people use it for vision assist
>see a knife wielding nigger in an alley
>refuses to describe what it's seeing because harmful stereotypes
>get robbed and killed
No thank you no llama for me from now on
>>
File: IMG_2999.png (3.63 MB, 1416x2048)
>>102549417
>1B, 3B
What the fuck are the use cases? Why this over 30B? They really just hate giving us anything runnable, huh?
>>
>>102549417
90b could be a sweet-spot for 48gb vramlets
>>
>>102549435
Is it audio+visual+text or just visual+text?
>>
>>102549450
Training a model in multiple languages was proven to make the model smarter.
>>
>>102549486
It's shit for 96GB chads. Doesn't leave room for context at Q8. I'd have to downgrade to being a Q6er.
>>
>>102549485
Speculative decoding and/or phone stuff.
>They really just hate giving us anything runnable, huh?
For many anons, reasonable seems to be whatever they can run, not whatever is good. You are one of them.
>>
File: file.png (1.08 MB, 1692x1248)
>GSM8K *8-shot **CoT
It's over
>>
meta AI has voice according to zuck via meta connect
>>
>>102549434
we are eating safe, localchads!
>>
I HATE THE MULTI-MODAL MEME
IT'S USELESS FOR ERP
>>
>>102549536
>he does send dickpics
ngmi
>>
>>102549536
You're useless for ERP.
>>
>>102549435
what about just text benchmarks?
>>
>>102549470
Just went live on HF
>>
https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/
>>
>>102549527
>3b
>77 at GMS8K
It's better than G2 9B, I'd say we are back
>>
File: organic.png (34 KB, 1003x188)
Reddit needs the buy an ad spammer. Qwen models are fucking garbage and refuse to answer anything.
>>
>>102549272
26 pages is nothing. Stop being lazy
>>
>>102549588
Damn, finally I'll be able to run an llm on my fucking quest 2
>>
>>102549588
8 shot cot though
>>
>>102549527
are these benchmarks public or tested by some third party authority?
>>
>>102549596
go back
>>
File: 1699789138369569.png (269 KB, 909x1224)
>>
whatever
can 90b recognize and describe nsfw images?
can it estimate the size of my cock?
where are the benchmarks that truly matter?
>>
>>102549558
seems to be the exact same as 3.1 8b and 70b
>>
>>102549616
>can it estimate the size of my cock?
they dont come with electron microscopes
>>
>>102549616
Time to fine tune a "my cock" LoRA I guess.
>>
>>102549519
But ~30b is a great compromise area between runnability and output quality. If you have a 3060 and a decent amount of RAM, you can run it at good speeds. 13/12/11B (notice how they keep decreasing it?) are still prone to a lot of logical errors that 7b models are, too. ~30B is the smallest, reliably good size you can put out a model at, which is probably why they don't want to give it to us. Either so small it's dogshit or so large that the layman still has to pay them.
>>
Ah, these small models are for smart glasses.
>>
>>102549417
>Its here
Who cares? llama-server still doesn't have multimodal support and at this rate never will
>>
>>102549626
well duh, then what if I hook up one to the model
>>
>>102549622
source?
>>
>>102549598
No. That's why I'm using AI.
>>
File: lalam.png (273 KB, 3840x3050)
>>102549558
they... they don't fucking suck...
I think we are back
>>
>>102549232
Very creative "buy an ad" post, rabbi. Now face the wall...
>>
>>102549616
There will be some jailbreak for it.
Also, llama models aren't that censored.
I can't wait to show cock to my waifus.
>>
>>102549651
comparing the text benchmarks on their HF repos
https://huggingface.co/meta-llama/Llama-3.2-90B-Vision-Instruct
https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct
>>
>>102549417
nothingburger
>>
>>102549661
>4o-mini
>fucking haiku
Even qwen didn't stoop so low.
>>
>>102549643
It used to have it supported but niggerganov didn't like it so he removed it: https://github.com/ggerganov/llama.cpp/pull/5882
>>
File: eusisters....png (18 KB, 839x123)
Ohnonono... Eurosisters....
>>
>>102549675
>RLHF: 224 H100 hours
224 hours on H100s just for rlhf
>>
>>102549435
lmao it's absolute fucking dogshit.

local lost (again), going back to listening 4o's asmr
>>
1B could replace tinyllama, it'd be great with a huge context size
>>
>>102549616
There gotta be a few TBs of normie nudes stored on Instagram servers. Yann Lecun personally goes through each of them to ensure quality. So yes, you'll get the highest quality cock identifiers.
>>
90B looks like it scores higher than 400B
>>
>>102549697
Don't drop the soap.
>>
Access request sent.
Don't worry boys those Nala tests are coming. Even if I have to do a 4-bit transformers load.
>>
File: file.png (7 KB, 752x452)
>>102549417
>11B
30B coming in 2028
>>
>>102549727
You are my hero nala anon N#.
>>
>>102549661
"0 shot" "CoT" lmao just when i thought benches couldn't be played harder.
>>
>>102549727
llama launches are always, ***always*** broken on launch. Every single time. They'll tweak the tokenizer or some shit in the coming hours
>>
>>102549727
How do I tell if a model passes the nala test? I'm not a furry.
>>
>>102549737
lel
>>
>>102549737
???????
What is this a chart of...
>>
>>102549485
They only read /lmg/ when we get raided by tourists asking how to run an AI waifu on your ipad from 2015
>>
>>102549638
>which is probably why they don't want to give it to us.
See? it's just like i said 'whatever i can run is sensible'. For the people with the hardware and patience, 70b is the minimum you should ever run. But it's all you you you.
If a new 11b is better than an old 30b, or even a 70b, i'll run the smaller, faster model. If in a year we get the perfect 3B model, beating all current 22-30B in every single aspect, would you still ask for a 30b? Just to fill up your hardware and because that's what you have? If you are gifted a pc with two 3090, would you keep asking for a 30B?
I, for one, am waiting for something i can run on a casio calculator.
>>
>>102549753
models parameter size and release dates
>>
>>102549753
L2 7B 2023 july
L3 8B 2024 july
>>
>>102549410
Oh, you gotta know how to code with fish-speech. Sucks to be brainlet.
>>
>>102549749
You're just looking at how well it handles the nuances of the card. If and how it anthropomorphizes the character, etc.
>>
>>102549749
That it does not give her human anatomy.
>>
>>102549777
>>102549727
Are you expecting anything different? I mean isn't it the same as 3.1 except for the multimodal parts added on?
>>
>>102549789
Don't care. Still going to fuck it.
>>
>>102549768
>>102549769
Ah, I see. kek, just extrapolating that huge curve out of the two datapoints, I gotcha. Someone should do it with the 13B -> 12B -> 11B. Our second tier models are gonna be in the negatives by the 2030s.
>>
>>102549789
Does it do the thing that some VLMs do where it projects and integrates the image embedding throughout the layers of the LLM? If so, maybe cross-training on images and text helps it with spatial reasoning even when doing text-only.
>>
Why do they keep adding multimodal shit? Is it just because there's no easy gains left to wow investors with in text generation quality with the current architecture we have?
>>
>Still no 3.2 repo access
They're onto me boys.
>>
>>102549898
you just dont want your AI gf to see you because she'd dump you
>>
>>102549898
Massively increases the amount of use cases? A more diverse dataset for better performance / a deeper "world model" / understanding?
>>
>>102549904
Probably just a lot of requests to go through.
>>
>>102549904
looks like they might not just be autoapproving this one
I'm still waiting too
>>
>And you can run 405B Llama3.1 on 8GB vram now
https://github.com/lyogavin/airllm
vrammaxxers on suicide watch
>>
>>102549904
I applied with a US VPN
Actual humiliation ritual
>>
>>102549929
at one token per day
>>
>>102549929
real? Does it run at 1 token per day or did they find a way?
>>
>>102549909
Where have we seen these gains? Not trying to be combative, I'm genuinely curious. The multimodal models I've tried have seemed, if anything, to have sacrificed some of the quality of their text output for the knowledge on images, etc.
>>
>>102549941
>>102549949
Let me guess. You NEED more?
>>
>>102549929
>airllm
Holy shit, it's been a while since I've last heard of those guys.
>>
>>102549949
>>102549929
kek, this was already possible on llamacpp
actual vramletjeets with no idea on how any of this works
>>
>>102549898
more modalities seems obviously good to me, I'm just mad we didn't get the voice stuff they teased in the 3.1 paper for local
>>
>>102549950
>to have sacrificed some of the quality of their text output for the knowledge on images
Continuing training on a new dataset is going to reduce quality for a while before gains are seen. It likely just needs a shit ton of training on such a different form of data for it to start "healing".
>>
>>102549976
nah
>>
how long for 3.2 to be usable?
>>
>>102549763
moron
if a new 11b is better as an old 30b, a new 30b would be better as a new 11b. so simple.
you can tell that llama is slowly switching from research mode to money mode and its just a sad. but what we dont need are llama fanboys who are just as faggots as apple faggots.
>>
>>102549997
llama.cpp PR incoming in approximately 3 months
>>
>>102549697
But local is safe! Meta cares about your safety! Trust them!
>>
>>102550007
bug free 6 months after that
>>
>>102550005
ESL harder. Like seriously, it breaks your point and makes it hard to understand beyond 'Meta is switching to making money'.
But you are correct about fanboys being useless wastes.
>>
>>102550005
Diminishing returns. Just like quants, it doesn't make sense to make a Q7_0 when you have Q6 and Q8. Same with models. It makes sense to have short steps at the lower end and bigger steps at the high end. If you had two 3090s, you wouldn't bat an eye for a 30B. I know i wouldn't.
>llama fanboys
I don't give a fuck who makes the models, as long as they're good. Some llama thing, some chinese bullshit or that antichrist mamba anon who still hasn't completed the training code on his repo.
>>
>>102550007
Let's be real here. None of current maintainers is interested in bringing it back to the server. Unless a newcomer rolls around it won't move any time soon.
>>
Good news! AGI has been released and can be run locally. What is the first think (you) ask it to do?
>>
>>102550102
a guide to cum from anal
>>
>>102550102
simulate mythomax 13b
>>
>>102549763
Yes, I would still ask for a 30b, because if the 3b is that good, 30b would be current GPT 4 level. 3b isn't the only thing improving as the tech improves, it's dishonest to compare some godly future hypothetical 3b to the first gen of llama models, it's like comparing a modern 5w celeron to a 65w old processor that performs the same, then asking why anyone would want a modern processor that uses 65w, despite the fact that a modern 65w processor would blow them both out of the water.
>>
>>102550102
Compose my suicide note.
>>
File: 1721562795281598.jpg (512 KB, 1792x2304)
>>102549435
wtf llama 3.2??
>>
File: file.png (26 KB, 883x154)
Holy shit, I hate this place so fucking much
Can someone re-upload the 11B vision model to somewhere?
If that's too much, I'll take the 3B or even the 1B...
>>
>>102550102
get me a gf
>>
>>102550102
Build a body for her so that I can finally fuck the computer.
>>
>>102550102
Ask for a murder plan, in Minecraft of course
>>
>>102549688
>>102550128
Based EUchads keeping home soi-free.
>>
>>102550128
Hopefully this provides some incentive for more torrents to be made for models.
>>
>>102550128
No, that would be breaking EU law which is against the rules. Please do better.
>>
>>102549452
It's weird to compare to Haiku, but 4o-mini is actually quite good. I wish they compared it against DeepSeek models too though.
>>
>>102550128
Whew thanks sama and Dario for keeping the Europeans safe. You guys did a great job raising awareness about the dangers of AI
>>
>>102550102
Ask it if I will live long enough to become immortal.
If not, ask it to invent a way to simulate my life when it becomes possible, and to create me a body with at least near-human capabilities to implant this simulated mind in.
>>
Isn't 4o-mini the modern equivalent of 3.5-turbo?
>>
File: 1613791292887.jpg (15 KB, 408x305)
>>102549417
>3B, 11B or 90B
>>
>>102549976
I see. Well, once it does, I'd be curious to see how the image knowledge improves the output, but it's not really observable in the current models at all. How are you expecting it to improve the model?
>>
>>102549450
ESL here. Nemo is quite good at this.
>>
>>102550223
50B parameter models would be perfect but every company avoids this range like the plague for some reason. They also keep moving away from this in both directions...
>>
>>102550249
Can't have anything that someone with a 3060 + 16~ gb of ram could run at a decent speed! That's direct competition! No ~30bs!
>>
>>102550122
The question was 'if you had two 3090s, would you still want a 30b? What if you had your very own h100 cluster. Would yo still ask for a 30b? To misquote you, "Because if the 30b is that good, 70B would be the current GTP 7 level".
It's not about what's the sweet spot for models. It's about what anon can run and that just happens to be HIS sweet spot, and that'd change if he had more hardware. It's about thinking they make the models for him. He's disappointed that they didn't make a model specifically for him.
>>
just make your own models or beg daddy zuck some more
>>
>>102550230
Depth of knowledge. These models generalize all that is tokenized. There is detail it can not get from text alone that can expand upon its world model. Its like telling someone who has only ever read about something before to write a story vs someone who has both read and experienced it.
>>
>>102550266
I AM anon, and it's about what the average person can run. The average person is way more likely to have a 3060 and 16 gb of ram as opposed to a mikubox or a used server with 5 trillion GB of ram. RAMboxes also run into the problem of speed, which is part of being runnable. 70b is fucking slow, even if you have a 3090. 30b is the last size that an average person with a GPU could run with mixed inference at a reasonable speed. Is that a qualified enough statement?
>>
>>102550261
Same for people with 2x3090s: a 50B-parameter model at Q4 with a good amount of context would hit just the spot, but nope. You can run a 70B, but unless you use a ridiculously low quant, you're limited to like 12k context max.
>>
Anyone know what`s the best 12b model for porn nowadays?
>>
>>102550323
Lumimaid
>>
>he still thinks Meta trains models for the average coomer with 1x 3090
ngmi
>>
>>102550348
yes?
>>
File: 1708101843387072.png (458 KB, 1660x940)
>>102549661
lol, lmao even
>>
>>102550386
MOLMOCHADS!!!!!!!!!!!
>>
File: file.png (477 KB, 751x515)
>>102550386
>>
>>102550386
Damn, this is 405B vs Mistral-Large all over again. Zucc has all this compute and he just ends up getting fucked with every new release
>>
>>102550386
Hope molmo is less censored too
>>
>>102550294
I understand this in theory, I guess I just wonder how much of it would actually translate between the different modalities of the model.

>>102550311
Definitely. It seems like, until hardware improves, 70b is a hard cutoff for a normal person to run at a reasonable speed. If nvidia ever wants to improve the VRAM/$ ratio, or if inference on CPU improves dramatically, it could change, but it's a pretty big wall right now.

>>102550348
Having a model for someone with SOME sort of GPU would make sense. Lots of people have a 3060, it's the most popular graphics card. It's just weird that they're keeping it in this perfect homeostasis of "Small enough to use, too dumb to be good" and "Approaching cloud quality, but too large to be run without heavy inconvenience or paying so much for hardware that it'd be cheaper to buy API access".
>>
are we back? i'm seeing shit about whatever a molmo is after not checking for 24 hours. or is it still over? probably still over. dunno why i'm asking.
>>
>>102550306
>30b is the last size that an average person with a GPU could run with mixed inference at a reasonable speed. Is that a qualified enough statement?
Some people run mistral-large at <1t/s because they like their output more than all the 70bs they tried. Even the term "reasonable speed" is subjective.
>The average person is way more likely to have a 3060
The average person has just a notebook with stock windows that they turn off when they're not using it and it's off most of the day. The average person doesn't have a desk dedicated to their computer. They just use the dining table. The average person interacts with their phone more than with their computer.
You are not average. Neither am i. And with llms, the niche becomes even smaller. Those models are not made for us, just like the race to the moon wasn't for the people. It's all dick waving.
>>
File: EEEYAWWWWWWWWWWWWN.jpg (40 KB, 536x612)
>>102550386
>Visual benchmarks
How does this pertain to ERP interests again?
>>
>>102550431
Llama 3.2 is useless crap, it's just Vision update on 3.1 and it's worse than fucking o1 mini at it.
But it was never meant to be more than a nothingburger update.

Molmochads are winning
>>
>>102550386
Goddamn if that isn't just benchmaxxing it's a complete breakthrough
>>
>>102550386
>7B almost as good as 72B
in your dreams maybe
>>
>>102550452
never asked about llama. never will. all went downhill after l2. i'm not even remotely interested.
>>
>>102550456
multimodal models are bottlenecked by their vision component in these benchmarks
>>
>>102550386
Llama is probably still better as a generalist model, but this Molmo thing might be a new path forward for visual tasks if the benchmarks are legit. We are really increasingly needing Livebench to include multimodal.
>>
>>102549479
As it should be with pigskins.
>>
>>102550411
It doesn't mean much at all, but if olmoe-1B is anything to go by, it doesn't shy away from stuff. llama.cpp recently added support for it and i gave it a try. Not super smart, of course.
>>
>>102550452
I'll reserve judgement until I test it myself. For captioning or describing images, which is probably what most of us interested in VLMs care about, benchmarks can be very misleading. A lot of those benchmarks are shit like charts and then questions about the chart, solve an image of a math equation, etc. Qwen2 VL 72b is absolutely dogshit at image descriptions, for example. It can't handle anything even remotely NSFW at all. It can't even describe the gender of a person in the image, even when directly prompted to (yes I'm serious). It uses they, them, person, character, etc exclusively, never man or woman. Meanwhile JoyCaption punches well above its weight.
>>
got approved for L3.2 on HF, they might have done a batch
>>
>>102550456
more like
>72B as bad as a 7B
It's meta after all
>>
>>102550547
How the fuck did they even do that with Qwen? Was it really on purpose?
>>
>>102550547
I assume you just want captioning for Adult Image models... Pozzed Llama obviously won't help you.
Jailbroken corpo models work; at least they know that you can't leave out NSFW images if you want to train a good vision model. I doubt Llama cares, so it should be like Qwen.
>>
I'm in boys.
>>
>>102550386
molmo won
>>
>>102550577
wait is this molmo also by meta?
>>
>>102550386
somehow I don't believe that 1B is almost competing with 90B.
I feel like Molmo is overfitted garbage.
>>
>>102550598
no, it's a separate entity
>>
>>102550386
this comparison is a mess, it compares the base benchmarks for 3.2 with molmo instruct versions
>>
>>102550604
I think you mean benchmaxxed
>>
3.2 is a completely new architecture. So expect gguf support by no earlier than Christmas.
>>
>>102550639
https://huggingface.co/collections/hugging-quants/llama-32-3b-and-1b-gguf-quants-66f43204a559009763c009a5
>>
>>102545137
Lots of cards seem to be written by AI, is that why I have such bad results?
>>
>>102550647
90B?
>>
>>102550647
I meant the vision ones, (11B and 90B)
>>
If I want to finetune a model to understand some C SDK, how should I do that?
Parse functions and comments from headers and make a every text section be comment + function prototype?
Or should I parse the .c files and make the text sections be just the function definition?
>>
>>102550479
>multimodal models are bottlenecked by their vision component in these benchmarks
sorry for dumb question, but what does this mean?
>>
>>102550728
it means
>>
File: 12423154576797.png (332 KB, 723x771)
>>102550454
molmo 7B is really good
try the demo
>>
>>102550399
The head of their AI effort is a guy who doesn't believe in LLMs, how can you expect the team to succeed?
>>
File: 1726418004894126.png (526 KB, 1347x1484)
>>102550751
It has preferences too even though it initially goes for the assistant wet rag angle
>>
>>102549763
>If in a year we get the perfect 3B model, beating all current 22-30B in every single aspect, would you still ask for a 30b?
Of fucking course. There's tons of shortcomings even with current 70B models that I run into constantly. I live with the rough spots because that's the world I live in, and I'd be really happy to get something better, but that doesn't mean I'm some kind of primitive tribal who'll be so ecstatic about the magic AI from the sky that I don't realize things aren't perfect.

I remember CRTs that were only black and amber and heavy enough to kill someone if dropped out of a 2nd storey window. Did that mean I was perfectly satisfied with cheap-ass IPS monitors with afterimages and bad contrast? Same deal here.
>>
>>102550785
shit taste though
>>
>>102550785
good taste though
>>
>>102549417
>3b
>11b
>90b
WHY ARE THEY ALWAYS GOING FOR SMALL SHIT AND ULTRA BIG SHIT REEEEEEEEEE
>>
>>102550849
Just wait for Molmo. Llama is Done
>>
>>102550849
Llama-4 750m is a perfect model size for intermediate use cases.
>>
>>102550849
We localchads go by safety! Praise the lord Zuck of safe AI!
>>
>>102550795
OLD OLD OLD. We got grey bearded olds in the thread. When bitnet for 90B turbo fat llama?
>>
>>102546846
You have json files of the correct whitespace?
>>
>>102550128
That is not on a gguf, right?
>>
>>102550849
Part of their deal with NVidia for all those cheap H100s
Buy more GPUs :^)
>>
File: 1697318722762195.png (48 KB, 951x480)
48 KB
48 KB PNG
>>102550591
unironically
>>
>>102550604
They mentioned they don't like the benchmarking methodology, so it is probably less overfitting and more changing the benchmark grading.
>>
>>102550849
The more you buy
The more you save
Every 2 weeks there's a new, more efficient quanting method though, it's just a matter of time until we get 3bpw at the quality of fp16
>>
Lol. Molmo in the playground outputs random chinese characters.
I'm guessing this is the version trained on top of qwen.
That aside, it has really good understanding of the image.
>>
>>102550904
>matter of time we get 3bpw at the quality of fp16
Just pack your rar archive into another rar archive.
>>
>>102550849
11b is perfect for me
t. 8gb'er
>>
>>102550753
That is actually a very good point, if the head of a project doesn't think the project will work, then how can the project ever turn out good?
>>
File: file.png (272 KB, 1837x1537)
272 KB
272 KB PNG
>>102550386
>Great models
>local
>Completely open
is this the most based AI company we have so far?
>>
File: 1710182787287749.png (60 KB, 1258x548)
60 KB
60 KB PNG
>>102547425
gpt-4o chads... not like this...
>>
>>102550973
Where the hell did they come from? Since it's 72B, is it just a Qwen finetune or is it a new base model?
>>
>>102550936
I have 8gb and around 20b is a fine speed for me. I'd like models in that range, but everything is always either tiny or too big, it sucks.
>>
>>102551003
>Where the hell did they come from?
Ikr, we got the same shit with BFL and Flux, out of nowhere they appeared and decided to release the SOTA local model just like that lmaoo
>>
>>102550386
https://www.youtube.com/watch?v=spBxYa3eAlA
>>
>>102551028
Flux is understandable, Open Source Image Gen had very obviously crashed and was lagging.

The Molmo thing looks sussy.
>>
File: file.png (970 KB, 1529x1328)
970 KB
970 KB PNG
>>102547425
Molmo sistas, I don't feel so good...
https://molmo.allenai.org/
>>
>>102551054
Oh no, it's shit! Pack it up animebros!
>>
>>102551054
Is that 7B or 72B?
>>
>>102551054
I found the demo (which is the 7B) decent
>>
File: file.png (215 KB, 751x776)
215 KB
215 KB PNG
Doesn't really seem to get feet.
>>
>>102551077
>>102551078
>Is that 7B or 72B?
I have no idea, it's not specified
>>
File: allenai.png (65 KB, 872x246)
65 KB
65 KB PNG
>>102551003
They've been at it for a while. llama.cpp even has compatibility with the old Olmo models. It's just that nobody paid any attention to them.
>>
>>102551078
That's the demo site, it's the 7B
>>
>>102549763
>but it's all that one anon
um
>>
>>102549638
>>102551008
What are "good speeds" for you? For me, 2.5 tokens per second is the minimum threshold of tolerable. A bit over 5 tokens per second is where the speed doesn't annoy me. If it's at least 10 or 15 tokens per second it might as well be infinitely fast for all I care so long as it's an interactive story or RP I'm digesting word by word and not generating giant blocks of text I intend to skim or that have a lot of boilerplate.
>>
File: file.png (52 KB, 360x360)
52 KB
52 KB PNG
>>102551083
Foot-sisters, how are we gonna cope with that?
>>
File: ED.jpg (435 KB, 2125x1411)
435 KB
435 KB JPG
>new wave of models
>zero improvement to cooming quality
>>
>>102551083
Now test it on tits
>>
>>102551117
tried, but the endpoint has google moderation restrictions
>>
>>102551093
Anyone who expresses the same "Why no model for MYYYYYY hardware". No different from a pajeet asking for more 400M models or maxxers asking for more 100B+ models. All beggars.
>>
>>102549452
Because it makes sense to compare similarly sized models. What's the problem?
>>
File: file.png (113 KB, 2295x473)
113 KB
113 KB PNG
>>102551054
>>102551077
>Is that 7B or 72B?
it's 7B, that's retarded, they should've used the 72B for the showcase
>>
>>102551139
Asking again, maybe you know. What's mini and haiku's param count? And does that number come from the companies that host them or just reddit speculation?
>>
>>102551147
They probably don't have enough hardware to serve 72B to a couple thousand people at a time.
>>
>>102551152
all's speculation. Some anons claimed haiku was ~70b when it released, while some say mini is a MoE with 8b active params. But nothing is official
>>
>>102551152
In the 80-120b range. Not reddit speculation, it's 4chan speculation.
>>
>>102551152
We officially know Turbo 3.5's parameter count due to it being leaked.
It was a 7B MoE model, though the method was unable to tell how many experts.

Now, Mini and Haiku are cheaper than Turbo.
>>
>>102551185
I only remember the MS orca paper claiming 20B params. Where does the 7b moe come from?
>>
Llama 3.2 500b when?
>>
>>102550884
No, but you can learn how to create one yourself very easily:
https://github.com/ggerganov/llama.cpp/discussions/2948
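For reference, the whole flow is two commands; a minimal sketch assuming a recent llama.cpp checkout (script and binary names have changed over time, and every path here is a placeholder):
[code]
import subprocess

# Convert an HF checkpoint to GGUF, then quantize it.
# Names match recent llama.cpp checkouts; adjust paths to your setup.
subprocess.run(
    ["python", "llama.cpp/convert_hf_to_gguf.py", "models/my-model",
     "--outfile", "my-model-f16.gguf", "--outtype", "f16"],
    check=True,
)
subprocess.run(
    ["llama.cpp/llama-quantize", "my-model-f16.gguf",
     "my-model-Q4_K_M.gguf", "Q4_K_M"],
    check=True,
)
[/code]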
>>
>>102551169
fair enough, now I'm waiting for someone to test it out on the 72b locally I guess
>>
>>102551199
Some researchers managed an "attack" on OAI's servers and were able to discern some models' details.
They only publicly released one detail, about the Turbo model; the rest (like GPT-4's) were kept private and they made a deal with OAI (and got some shush money).
>>
File: file.png (689 KB, 800x450)
689 KB
689 KB PNG
>>102551003
Maybe there are more companies like that. Silently working and not releasing anything until they know it is at least slightly better than all the other stuff that is available. Sounds like a very good strategy since no one remembers all the mid models released this year. Then there is this retard who did the exact opposite cause he is a fucking clown.
>>
>>102551054
How can the AI rule us when our hands and legs are all a blur to it?
>>
HOLY SHIT llama 3.2 3B is currently the best RP model. I tested it. Can't wait to see the 90B posts.
>>
>>102551238
Shame they picked the worst possible time with a meta release less than a day later. I hope it doesn't get drowned out that much.
>>
>>102551260
go to bed, zuck
it's embarrassing
>>
>>102551260
Post logs, let's see them.
>>
>>102551260
>Still hasn't officially been Nala tested
doubt.
>>
File: file.png (560 KB, 685x1040)
560 KB
560 KB PNG
>>102551083
>>
>>102551170
>>102551175
>>102551185
I see. So somewhere between a 7b MoE and a 120b. I've learned nothing, but I thank you nonetheless.
I think companies releasing models should stop comparing themselves to closed models. Ostracize them from the field.
>>
>>102551185
Unfortunately that's not that helpful, since it's just the base size of a single expert. Obviously every big lab is serving MoE models across their entire product line because they are VRAM rich and optimizing for compute. We also know that MoEs scale well with total parameter count across a large range of expert sizes (see Deepseek, one of the top performing local models in the world right now, with expert sizes around 4B).
>>
>>102551289
I can't wait to see the multimodal Nala tests.
>>
>>102551319
Wasn't there a Switch Transformer with something like 2k experts?
So yeah, the size of a single expert is pretty meaningless, aside from telling us that the model is not any smaller than 2x that, I guess.
>>
>>102550753
>The head of their AI effort is a guy who doesn't believe in LLMs, how can you expect the team to succeed?
that's an excellent point, the more time passes, the less respect I have for Yann LeRetard
>>
1B

Dr. Sophia Tan had always been known for her sharp intellect and beautiful smile, but few people realized the power she held behind closed doors. As the department head of psychology at the local university, she was respected by her peers and admired by her students.

One evening, as she was leaving the office, a young graduate student named Alex walked in, looking desperate for help with his thesis. Sophia was immediately taken aback by his pleading eyes and awkward demeanor, but she saw something else there that made her pause – a deep-seated fear that he couldn't quite articulate.

As they began to discuss his research, Sophia discovered that Alex was hiding behind his usual charm and wit. Underneath it all, he was vulnerable and uncertain, and Sophia's warm smile softened into a genuine concern.

Without warning, she reached out and gently took his hands in hers. The touch sent shivers down her spine as she gazed into his eyes, her own pupils dilating with desire.

For the next hour, they sat together in comfortable silence, Sophia guiding Alex through his thoughts and emotions as if she were an old friend sharing a deep secret. As the night wore on, their connection grew stronger, until finally, Alex was trembling with need.

Sophia's eyes sparkled with excitement as she leaned forward, her lips brushing against his ear. "I think I can help you with that," she whispered, her voice husky and seductive.

As they kissed, Sophia's hands roamed over Alex's body, tracing the contours of his skin with a gentle yet insistent touch. His eyes flashed open, and he met her gaze, his own pupils locked onto hers with an unspoken understanding.

With a soft moan, Alex leaned back into Sophia, surrendering to her touch and her expertise. Together, they surrendered to the power dynamic, their bodies entwining as if by magic.

(4chan char limit)
>>
>forcing ooba to install transformers 4.45.0 causes it to break
God I hate open source devs.
>>
File: file.png (133 KB, 623x515)
133 KB
133 KB PNG
lmao
>>
3B

In the sweltering heat of the Tokyo summer, Emiko stood tall, her piercing green eyes scanning the room with an air of confidence that commanded attention. Her raven hair was tied back in a sleek ponytail, revealing a heart-shaped face and full lips that curve into a subtle smile.

By her side stood Taro, her loyal assistant, who had been working for her for three years. He was her right-hand man, handling everything from the office schedule to personal errands. But tonight, he would be more than just an assistant.

Emiko's gaze fell upon a handsome stranger, sipping his drink at the bar. His eyes locked onto hers, and she could sense the spark of attraction. She beckoned him over with a subtle wave of her hand.

As he approached, Emiko's eyes never left his, her pupils dilating with desire. She could see the shyness in his gaze, but also a fierce determination to please.

"Can I buy you another drink?" she asked, her voice husky and confident.

The stranger nodded, still looking up at her in awe. "Thank you."

Emiko took his hand, leading him back to her table. The air was charged with electricity as they sat down, the silence between them palpable.

She poured herself a glass of wine, her eyes never leaving his face. "So, Taro tells me you're new to Tokyo," she said, her voice dripping with curiosity.

The stranger nodded, trying to play it cool but failing miserably under Emiko's intense gaze.

Emiko leaned in, her breath whispering against his ear. "I'll show you the real Tokyo tonight."

As the night unfolded, Emiko took Taro on a journey of discovery, pushing him out of his comfort zone with every step. She led him to secret gardens hidden behind skyscrapers, to rooftop bars overlooking the city lights.

Their intimacy grew with each passing moment, their touches electrifying the air. Emiko reveled in Taro's submission, her dominance intoxicating her senses.

(char limit)
>>
>>102549763
>If in a year we get the perfect 3B model, beating all current 22-30B in every single aspect, would you still ask for a 30b?
what a retarded take. if a technique manages to make 3b good, that means the same technique will make bigger models even better. and if we follow your logic, if small models are so good, why go for giant models like 72b and 90b in the first place? even they don't believe that bullshit and want to scale up to get good results
>>
I'm actually curious, is there already a gui tool available where you can put in a transcription model, a translation model and a video/audio file and see the audio of the file get translated in real time while the file plays?
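Haven't seen one polished tool that does all of it, but the pieces exist. A minimal batch sketch with openai-whisper, which can translate speech straight to English via task="translate" (the file name is a placeholder, ffmpeg must be on PATH, and true real-time playback would need chunked streaming on top of this):
[code]
import whisper  # pip install openai-whisper

# "base" is fast enough for rough subtitles; bigger models translate better.
model = whisper.load_model("base")

# Whisper can translate speech straight to English with task="translate".
result = model.transcribe("video.mkv", task="translate")

# Dump timestamped segments, subtitle-style.
for seg in result["segments"]:
    print(f"[{seg['start']:7.2f} -> {seg['end']:7.2f}] {seg['text'].strip()}")
[/code]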
>>
File: 1718448622140208.jpg (124 KB, 1080x1080)
124 KB
124 KB JPG
>OpenAI's Chief Technology Officer has crossed the Jordan.
>>
it looks very promising so far but refused my default femdom prompt for ethical reasons. Just need to wait for finetunes and we're in there
>>
MIRA-CHAN NOOOOO
>>
>>102551414
two more finetunes bro
>>
>>102551391
How? Why is it that Zucc can't give the model to the europoors but they can?
>>
>>102551369
>>102551399
slop/10 won't bother downloading, local chatgpt achieved
>>
>>102551437
yes what's your point
>>
followup on a question I posted in a thread a few days ago concerning adding 2 gpus.
I have two: a 4060ti w 16gb GDDR6 and a 1070ti with 8gb GDDR5 I want to put in my b450

my mobo pci slot 1 is gen 3 16x and I will be putting the 4060ti in there

slot 4 is gen 2 4x and I will put the 1070ti there

I can install both cards and have plenty of overhead with the psu, but will offloading to the gimped gen2 pci at 4x with the 1070ti be slower than offloading to system ram? (I have 64gb 3200 mhz available and a 3700x processor)
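Back-of-envelope peak numbers (theoretical maxima, real throughput is lower) suggest the 1070ti should still win, since offloaded weights sit in its VRAM and the slow slot mostly costs load time:
[code]
# Back-of-envelope peak bandwidths in GB/s; real throughput is lower.
pcie2_x4 = 2.0         # PCIe 2.0 x4: ~500 MB/s per lane * 4 lanes
ddr4_3200_dual = 51.2  # dual-channel DDR4-3200: 2 * 25.6 GB/s
vram_1070ti = 256.0    # GDDR5 on the 1070 Ti

# Layers offloaded to the 1070 Ti live in its VRAM, so per-token traffic
# over the slow slot is just small activations; x4 mainly costs load time.
weights_gb = 4.0  # hypothetical: how much of the model sits on the 1070 Ti
print(f"one-time load over PCIe 2.0 x4: ~{weights_gb / pcie2_x4:.0f} s")

# CPU offload instead re-reads those weights from system RAM every token.
print(f"per-token weight read from DDR4: ~{weights_gb / ddr4_3200_dual * 1000:.0f} ms "
      f"(vs ~{weights_gb / vram_1070ti * 1000:.0f} ms from VRAM)")
[/code]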
>>
>>102551438
Large companies have a shit ton of data regulations applied to them because they're large.
Molmo comes from a literal who, so it'll take the EU a few years to ruin that as well.
>>
molmo: how a toddler tries to pronounce multimodal (something it vaguely heard someone else say)
>>
>>102549763
L3 quanting problems are probably related to weight saturation. So I doubt that is gonna be the case. Unless they learn how to cut out useless chunks of knowledge and make a more faithful recreation of a woman - just able to be good at sex and talking about nothing.
>>
>>102551438
Zucc's having a massive bitch fight with the EU on several fronts when it comes to his AI endeavors. The new EU AI law aside, they also really want to stop him from using all the data he's gathering from the facebook, whatsapp and instagram sheep to train his new models. He really wants to use that data so it seems like he's trying to dodge their influence like this.
>>
>>102551512
>seems like he's trying to dodge their influence like this.
The EU is going to outlaw AI outright and become Africa 2.0 when AI starts automating everything in the Americas and Asia, mark my words.
>>
3B

It was a hot summer evening in Chicago as All Elite Wrestling (AEW) packed the United Center for their highly anticipated show, "Summer Slam". The crowd was electric, buzzing with excitement as the commentators hyped up the main event: a World Championship match between reigning champion, Darby Allin, and challenger... Rufus, a 3-year-old Golden Retriever from suburban Chicago.

The crowd erupted in laughter as Rufus made his way to the ring, wearing a miniature wrestling singlet with "Rufus" emblazoned on it in bold letters. The announcer's voice boomed through the speakers: "Ladies and gentlemen, the challenger, Rufus, hailing from Oak Lawn, Illinois, is looking to dethrone the reigning champion!"

As Darby Allin emerged to a chorus of jeers, Rufus stood tall in the ring, tail wagging enthusiastically. The two opponents faced off in the center of the ring, and the bell rang out, signaling the start of the match.

The crowd was initially confused as Rufus simply sniffed at Darby's feet, seemingly unaware of what to do next. But as the match progressed, it became clear that Rufus was no ordinary canine competitor. With lightning-quick reflexes, he dodged and weaved around Darby's attacks, nipping at his heels and causing him to stumble.

Meanwhile, the crowd was on the edge of their seats as they cheered on Rufus, chanting "Ruf! Ruf! Ruf!" as he executed an impressive series of sit-ups and takedowns. Even Darby couldn't resist laughing at the plucky pup's antics.

As the match wore on, Rufus began to gain momentum. He leapt onto the turnbuckles, sending himself flying high into the air before landing with a resounding thud on his front paws. The crowd erupted as he caught Darby in mid-air and sent him crashing to the mat with a devastating "paw-fect" slam.

The referee counted to three, and Rufus's paw hovered above Darby's chest as the announcer declared: "AND IT'S ALL OVER! RUFUS WINS THE AEW WORLD CHAMPIONSHIP IN A SHOCKING UPSET!"
>>
>>102551524
>The EU is going to outlaw AI outright and become Africa 2.0 when it start automating everything in the Americas and Asia, mark my words.
Commiefornia is also doing its best to destroy AI advancement. If the US keeps going down this cucked path, only China will be allowed to make good models, it'll be world domination by the chinks lol
>>
File: file.png (271 KB, 564x820)
271 KB
271 KB PNG
>>102550386
molmosisters...
>>
>>102551548
california just created llama 3.2 brainlet
>>
Wait, Molmo released their models the same day as Meta? Was it their goal to dunk on them or something? lmao
>>
>>102551548
but qwen is also heading that way and i read some stuff about an open llm law regarding china too
>>
>>102551570
Considering >>102551391 they succeeded lmao
>>
>>102551562
>california just created llama 3.2 brainlet
and it's dogshit nigger, Molmo is destroying it >>102550386
>>
1B

In a world where dogs and humans were pitted against each other in an annual competition, one canine contender stood head and shoulders above the rest. His name was Bruiser, a 3-year-old German Shepherd with a coat as black as coal and eyes that shone like gold.

Bruiser's journey to AEW World Championship began several years ago when he won local dog sports competitions across the United States. He had always been a natural athlete, excelling in agility, obedience, and even professional wrestling training under the tutelage of the top trainers in the country.

As Bruiser rose through the ranks, he faced off against some of the toughest opponents in the world of canine competition. In the final tournament, where only the best dogs were pitted against each other, Bruiser outlasted his closest competitors with ease.

But there was something more than just brute strength and athleticism that made Bruiser a true champion. He had an unwavering dedication to his craft, pouring all his heart and soul into every match he entered. His opponents respected him for it, often remarking on the unyielding ferocity with which Bruiser approached each bout.

Finally, after months of preparation and training, the day arrived when Bruiser was given the chance to face off against his arch-nemesis: the reigning AEW World Champion, Kaito Mitsuharu. The crowd was electric as these two titans stepped into the ring for what could only be described as a battle royale.

The match was a back-and-forth affair, with both men trading blows and showcasing their incredible skills. But in the end, it was Bruiser who emerged victorious, catching his opponent off guard with a devastating combination of spinning headbutts and agile reversals that left Kaito reeling.

(content limit)
>>
>>102551562
good bootlicker, keep it up!
>>
>>102551559
it's "only" the 7b model, surely the 72b one will describe it better
>>
>>102551474
>momol
>you mean molmo
>as in moltimodel
>multimodel yeah
>give him the fucking mulmu!
>?? you guys aren't even saying the same thing
>MOMO
>MULMAUUU
>I CAN'T TAKE IT ANYMORE *points gun at self*
>DON'T DO IT *points gun at first guy*
>you're gonna shoot him for shooting himself?? that doesn't even make sense!
>*second guy turns gun on self*
>AHHHHH
>AAAAAAAAAAAAA
>>
>>102551403
Still missing the point. Anon asks for an XXB model because that's what he can run, not because it's the most efficient. He wants that model only because that's what he can run comfortably, but masks it as "THIS EXACT MODEL SIZE IS THE OPTIMAL SIZE!", still failing to understand that companies don't make models for him or his hardware.
I'd like a good 30B model, sure, but won't ask "why no model?". If I had an H100, I would ask for a good 200B model to use with x or y quant. 400B is too big, 120b is too small. See? they don't make models for meeeeeeeeeeeeeeeeeeeeeeeeeeeee!!!!!
>if small models are so good, why going for giant models like 72b and 90b in the first place?
3B will be better. 90B will be better. The 1T models will be even better. But that's the thing. The scale changes with the time and the hardware. He has a ~30B in gemma 2, but no. Not that one. He wants another ~30B. Because that's just the perfect size... for him.
>>
>>102551599
Anon, I want you to know that I laughed
>>
>>102551602
>3B will be better. 90B will be better.
and 30b will be better
>>
>>102551599
what model did you use
>>
>>102551599
That.. supposed to be funny?
>>
>>102551602
>still failing to understand that companies don't make models for him or his hardware.
that's not true, they think of us or else they wouldn't release tiny models even your grandma can run
>>
>>102551095
I like to get over 3T/s. I just tested mistral small at q6 and the speed is 4.5T/s, so that's acceptable (8gb vram). That's why I said around 20b would be perfect; it's kinda sad that they only give us tiny models and big ones.
>>
>>102551583
let's just mix both of them then, we got two great releases in one day
>>
>people still complaining about missing mid-sized models
Meta are literally advertising that they are happy with people distilling their models. It's not their fault there aren't people investing in doing that.
>>
>>102551629
>that's not true, they think of us or else they wouldn't release tiny models even your grandma can run
Scraps of their testing, and something to shove into phones. I'm glad they do it, but I wouldn't think for a second they do it with my best interests at heart.
>>
>>102548005
>>102548030
demo is a four letter word
>>
Looks like I can't even test out vision, thanks to ooba being garbage and there not being any real alternatives for transformer-based backends. I mean I guess I could probably just make an inferencing script to test it, but that's no fun.
>>
>>102551621
it's a reference to https://www.youtube.com/watch?v=ty62YzGryU4
>>
>>102551671
>It's not their fault there aren't people investing in doing that.
by now they must know we can't do it for shit. what happened with l3-42b or whatever? nothing, and Llama-3_1-Nemotron-51B is meh too afaik
>>
tech bro hobbyists are small peanuts compared to big corpos (400B+) and normies with phones (3B). Mid range models don't bring in money
>>
>>102551740
that is cause LLMs are a local minimum we need to pull ourselves out of
>>
File: file.png (988 KB, 1500x1000)
988 KB
988 KB PNG
>>102551708
>Mira
oh I remember her, that was the DEI woman that made a funny face when the interviewer asked her what video sites OpenAI scraped to train Sora
>>
So from what I understand the molmo 72B is simply a Qwen 2 finetune...
>>
>>102551757
or we can just buy more and save more
>>
Did Zuck officially win?
Who predicted the next llama would be this good even?
>>
File: 'card.png (365 KB, 730x911)
365 KB
365 KB PNG
>>
>>102551723
They're still advertising it. Someone might come along. Look at it this way: if they always just gave handouts, the community would slowly become more and more useless and reliant.
>>
is this bitnet or something else? how did they make it so good
>>
>>102551708
>moat: none
>regulatory capture: failed
>latest model: a fucking cot tune
>employees: leaving
>open source: catching up
>sam: gay
>>
>>102545841
Sounds like someone is a fucking moron and thinks every LLM should work without using their prompt template.

Mistral models work fine and can even do function calling reliably on Nemo.
>>
>>102551708
nice larp
>>
>>102551771
This what? How good? What the fuck are you talking about? Link to the post you.... you....
You mostly just talk to yourself, don't you?
>>
Considering there's no multimodal support in llama.cpp and exl2, does 11b at least run in transformers with 24gb of vram?
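On paper it should; transformers 4.45 ships the Mllama classes for these checkpoints, and while fp16 11B (~22gb of weights) would be tight, 4-bit via bitsandbytes leaves plenty of headroom. Untested sketch, model id per Meta's HF repo, prompt format per their model card:
[code]
import torch
from PIL import Image
from transformers import (AutoProcessor, BitsAndBytesConfig,
                          MllamaForConditionalGeneration)

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"

# 4-bit quantization keeps the 11B weights well inside 24 GB.
bnb = BitsAndBytesConfig(load_in_4bit=True,
                         bnb_4bit_compute_dtype=torch.bfloat16)
model = MllamaForConditionalGeneration.from_pretrained(
    model_id, quantization_config=bnb, device_map="auto")
processor = AutoProcessor.from_pretrained(model_id)

# "test.png" is a placeholder image path.
image = Image.open("test.png")
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Describe this image."},
]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, prompt, add_special_tokens=False,
                   return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=200)
print(processor.decode(out[0], skip_special_tokens=True))
[/code]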
>>
>>102551708
wtf, drummer was working at OpenAI?
>>
>>102551796
there's no multimodal support for 11b in guis doe
>>
so >>102546792
>70b is like $.40/Mtok
where?
>>
Wait, is the multimodal part vision only or does it also include the speech stuff?
>>
>>102551815
idk >>102551044
>>
>>102551807
Really? I thought ooba supported it unless I'm crazy.
>>
>>102551708
>Sam actually came out on top
What was that one quote, about how if you put Sam in a room with a bunch of cutthroats, Sam would be the only one that remains?
>>
>>102551708
What would be the reason? Is she leaving the sinking ship after realizing that no one bought their o1 CoT bullshit?
>>
i don't mind openAI but i'm still happy to see their monopoly get btfo one new model at a time. it's better that way
>>
>>102551815
It's vision only, the speech stuff is an entirely separate thing
>>
>>102551836
maybe I'm wrong, but wasn't that just an extension that used a vision-to-text model, and then the model only interacted with the text..
>>
>>102551879
I do mind OpenAI, they unleashed poison data and now I have to suffer if I use any model released after 2023. Hope they go bankrupt.
>>
Reading the repo for molmo, it says to uninstall tensorflow and install the CPU-only version... does it not do GPU inference locally?
>>
>>102551892
So will we ever get the speech stuff? Was it too unsafe or something?
>>
molmo.gguf?
>>
File: 1718470225216862.jpg (222 KB, 720x720)
222 KB
222 KB JPG
>>102551859
>OPENAI TO REMOVE NON-PROFIT CONTROL, GIVE ALTMAN EQUITY
>ALTMAN WILL RECEIVE EQUITY FOR THE FIRST TIME IN THE FOR-PROFIT COMPANY, WHICH COULD BE WORTH $150 BILLION
>>
>>102551948
why tf can companies not release things in a standard format? I'm not installing random shit when other models need nothing
>>
File: test.png (84 KB, 1000x800)
84 KB
84 KB PNG
>>102542933
I'm adding plots.
Also it's the first time I'm testing a code assistant, this is really nice. I can never remember how to use matplotlib so having the AI take care of all the boilerplate is a game changer.
I'm using Qwen2.5-Coder-7B with IQ4 XS, are there any other models I should test? I'm already downloading CodeLlama 7B and WizardCoder 7B and 13B.
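For anyone curious what that boilerplate looks like, a minimal matplotlib sketch of this kind of plot (all numbers invented for illustration):
[code]
import matplotlib.pyplot as plt

# Invented numbers: a hypothetical llama-bench sweep of threads vs. speed.
threads = [2, 4, 6, 8, 12, 16]
tok_per_s = [4.1, 7.8, 10.5, 11.9, 12.2, 11.7]

fig, ax = plt.subplots(figsize=(10, 8))
ax.plot(threads, tok_per_s, marker="o")
ax.set_xlabel("threads (-t)")
ax.set_ylabel("tokens / s")
ax.set_title("llama.cpp generation speed vs. thread count")
ax.grid(True)
fig.savefig("test.png", dpi=100)
[/code]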
>>
>>102551989
because there is no standard
>>
>>102551859
Based Sam. Don't rock the fucking boat.
>>
>>102552003
this is a bell curve meme, where engineers on the left and right side just use a plain text editor
>>
>>102552003
>CodeLlama 7B and WizardCoder 7B and 13B.
those are super old and probably much worse than qwen2.5
>>
>>102552020
>>102552020
>>102552020
>>
>>102551973
what the fuck? Sam isn't even a scientist, this fucker couldn't solve a 4th grade math equation and he'll be the one receiving all the money? WHYYYYYYYY
>>
>>102552003
>>102552065
>>
File: sam.png (171 KB, 474x324)
171 KB
171 KB PNG
>>102551859
He can't keep getting away with this!
>>
File: it's over.png (47 KB, 1022x635)
47 KB
47 KB PNG
Oh noes
>>
>>102552119
kek
>>
>>102552119
never ever
>>
>>102552003
You do know llama-bench exists, right?
>>
>>102552175
Yes, that's what my script is calling. But llama-bench doesn't output graphs and doesn't come with search algorithms.
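The core loop of such a wrapper can be tiny. A sketch assuming a recent llama-bench with -o json support and an avg_ts field (check your build, field names vary across versions); the actual script's search is smarter than this plain grid:
[code]
import json
import subprocess

def bench(model: str, threads: int) -> float:
    # llama-bench emits a JSON array of result records with -o json;
    # take the best average tokens/s among them.
    out = subprocess.run(
        ["./llama-bench", "-m", model, "-t", str(threads), "-o", "json"],
        capture_output=True, text=True, check=True).stdout
    return max(r["avg_ts"] for r in json.loads(out))

# Plain grid search over thread counts, keeping the fastest.
results = {t: bench("model.gguf", t) for t in (4, 8, 12, 16)}
best = max(results, key=results.get)
print(f"best: {best} threads at {results[best]:.1f} t/s")
[/code]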
>>
>>102552003
Hey very cool, I didn't see your message before.
>>
>>102552119
xe will finally -ack ximself...
>>
>>102551602
1x24GB VRAM is such a niche segment. I think only I have that.
>>
>llms still struggle with the pyqtgraph update, always producing pyqt4 code
kek
>>
>>102551708
I am actually sad. Open AI could use more women to siphon money and contribute nothing.
>>
>>102551809
>APIs such as openrouter/groq
>>
File: file.png (274 KB, 964x849)
274 KB
274 KB PNG
not bad for a 3b
>>
>>102552641
>same structure and eyes/expression mentioned twice in every response
>>
>>102552641
>"please refrain from using physical gestures"
>>>not bad
Only if you're a local-cuck chugging safety garbage.
>>
>>102552699
I mean, you're talking to a doctor, of course he's gonna act like a cuck, let's not forget the covid era
>>
>>102552719
enjoy death retard
>>
>>102552740
:'(
>>
File: file.png (331 KB, 948x840)
331 KB
331 KB PNG
sovl
>>
File: file.png (323 KB, 966x854)
323 KB
323 KB PNG
cai is back on the menu boys
>>
>>102552769
>>102552815
what model?


