/g/ - Technology






File: yann-lecun.jpg (293 KB, 940x1410)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102356839 & >>102348952

►News
>(09/12) DataGemma with DataCommons retrieval: https://blog.google/technology/ai/google-datagemma-ai-llm/
>(09/12) LLaMA-Omni: Multimodal LLM with seamless speech interaction: https://huggingface.co/ICTNLP/Llama-3.1-8B-Omni
>(09/11) Fish Speech multilingual TTS with voice replication: https://hf.co/fishaudio/fish-speech-1.4
>(09/11) Pixtral: 12B with image input vision adapter: https://xcancel.com/mistralai/status/1833758285167722836
>(09/11) Solar Pro Preview, Phi-3-medium upscaled to 22B: https://hf.co/upstage/solar-pro-preview-instruct

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: lecun_heart.png (124 KB, 504x462)
►Recent Highlights from the Previous Thread: >>102356839

--o1-mini model excels in generation but struggles with completion tasks, sparking discussion on benchmarking methods and model capabilities: >>102361744 >>102362578 >>102362651 >>102362675 >>102361852 >>102362003 >>102362086 >>102362876 >>102362897
--Anon solves code-breaking puzzle that o1 failed to crack: >>102357917 >>102358165 >>102358342 >>102358238 >>102358243 >>102359615 >>102359743
--OpenAI's decision to hide chains of thought from users, with mention of monitoring for manipulation: >>102361616 >>102361628 >>102361644 >>102361735
--O1 model solves CTF challenge by exploiting Docker API misconfiguration: >>102359414 >>102359504
--M series Macs for running large local models: >>102360064 >>102360374 >>102360467 >>102360519 >>102360612 >>102360629 >>102360761
--LLMs can reason, but lack general intelligence and autonomous learning: >>102361502
--Extracting CoT prompts and the challenges posed by OpenAI's guidelines: >>102361025 >>102361046 >>102361131 >>102361217 >>102361170 >>102361205 >>102361147
--Backend CoT process and its impact on writing fiction prose: >>102361172 >>102361226 >>102361300
--Anons discuss OAI's unverifiable and potentially exploitative token charging system: >>102361649 >>102361664 >>102361680 >>102361688
--Anons discuss GPT-4-O1 access, benchmarks, and implications: >>102358770 >>102358792 >>102358811 >>102359089 >>102359114 >>102359280 >>102359426 >>102358911 >>102358957 >>102359055 >>102359661 >>102358814
--Anon is impressed by Google NotebookLM's podcast generation capabilities: >>102359806 >>102359843 >>102360015 >>102360252 >>102363640
--Anon complains about being billed for invisible reasoning tokens: >>102360189 >>102360210 >>102360220 >>102360246 >>102360251 >>102360286 >>102361570
--Livebench results are out: >>102363403 >>102363472 >>102363556 >>102363610
--Miku (free space): >>102356889 >>102357160

►Recent Highlight Posts from the Previous Thread: >>102356847
>>
>>102364945
I'm using llms at work for coding. I'm not crazy enough to use local for that though, sonnet 3.5
Besides coding and creative writing llms are still useless.
Imagine being so assmad because I enjoy my loli bestiality RP at the click of a button lol
It's like somebody draws titties with the new MS95 paint tool and you put on the nerd glasses
>uhm all that technology and all you use it for is this icky stuff. cant even.
get real man.
>>
>>102365023
True
>>
>>102365023
>>102365054
nta but i think you missed the point
>>
All this strawberry and qstar stuff and llama4 will be another pure transformer llmslop pumped full of synthetic data. Lecun is a hack
>b-but he doesn't work on llms
I don't give a fuck. What is he even working on, then, if the only thing Meta keeps pumping out is LLM after LLM?
>>
>>102365065
Llama-V-JEPA trust the plan
>>
>>102365065
He is busy sperging out on elon and brigading for every vaxxie's god fauci ouchie.
>>
>>102365023
nta but you have irredeemably shit taste
>>
>>102365023
You posted your smut like it was peak writing. How you can not cringe when you read that shit is beyond me. Besides, that had nothing to do with CoT, and your complex format usually collapses very fast as it reinforces the repetition in every message.
>>
oh god the morality police is here
>>
>>102365023
You are faggot, simple as.
>>
>>102365093
>>102365104
>How you can not cringe when you read that shit is beyond me.
It's great output for a 12B. It actually follows the format and is creative enough.
Kinda insane I have to justify RP in a /lmg/ thread. I hope you are all the same guy.
>>
>>102365133
Go easy on him, he has a failing company to lead.
>>
Anyone with o1 access managed to try this? >>102363366
>>
>>102364922
Simulating cats with Yann Lecun
>>
File: IMG_2220.png (322 KB, 1290x2796)
>>102365256
It got it after thinking for 52 seconds.
>>
>>102365371
that's pretty impressive not gonna lie
gpt-4o typically just fucked up the velocity calculation or just gave me the formula for circumference. If it got to the integral part, it usually gave up and said it's a non-typical integral
>>
oh god the physics majors are here
>>
>>102365371
That's pretty fucking cool, but it would be even cooler if it didn't need to <think> for a fucking minute.
>>
>>102363366
What is that supposed to test?
>>
>>102365494
it's a multi-step process. Until now, every model failed to solve it because they rush straight for the answer.
>>
Why doesn't OpenRouter's o1 work anymore?
>>
File: GXUEQ-7b0AAM1wj.jpg (86 KB, 1170x1374)
am i using o1 right?
>>
>>102365485
1 minute is quick enough for this kind of question, tf are you complaining about?
>>
>>102365565
>team fortress are you complaining about
for a language model, sure, but for agi 2 more megawatts that's pretty pathetic
>>
>>102365485
The thinking part is probably something that will be needed in the future and just not be visible, or only visible with a debug flag or something.
We need virtually unlimited context and much higher speed, much cheaper.
Waiting a minute is no fun, I agree. Not sure I will use o1 even if it becomes cheaper.
Imagine you need something done quick, only for the LLM to trip up like usual and need another request. o1 does make mistakes. Sometimes it doesn't even get the stupid ass "strawberry" test correct.
Doubly painful for the wasted tokens.
>>
How do I make my own diffusion model? (No finetuning) I have the images.
>>
>>102365639
Got a few millions $?
>>
>>102365695
Yes.
>>
>>102365639
>I have the images.
did you caption them all? kek
>>
Based thread.
>>
>>102365458
Yes, I've been here since the LLaMA 1 leak.
>>
>>102365707
Automatically yeah
>>
>>102365850
>automatic slop
save your money
>>
>>102365854
who's pretraining their model with manual captions? you need hundreds of millions of pictures to pretrain a model
>>
>>102364922
Yann LeCum
>>
>>102365894
Yum! Le Cum
>>
So now that we know that the Reflection guy was actually correct, when is someone going to finetune Mistral Large similarly to what he did, and do it correctly this time?
>>
When was the last time somebody used the SuperCOT dataset?
>>
>>102365943
modern models have implicit cot for hard queries.
>>
>>102365932
Never.
>>
So let me get this straight....
OAI just did a CoT finetune of 4o and they're acting like they reinvented the fucking wheel?
How are investors this stupid?
>>
>>102365965
CoT finetune + new tokenizer + specifically trained for huge CoTs and talking to itself, remember those "System A, System B" things from last year?

The CoTs are 600 tokens minimum even for "hi". And for real questions 2000-10000
>>
File: 1726231311448.jpg (423 KB, 1080x1040)
This redditor is smarter than the average /lmg/ anon
>>
>>102365977
So basically a CoT finetune.
>>
>>102365965
If I got it correctly, it's something more involved than just that. It's CoT + having the model iterate over its own output in some form (the whole think-longer part), which could be just looping its hidden layers N times (something I've seen lmg discuss over a year ago, I'm pretty sure), or literally feeding its output as input to itself a couple of times
>>
>>102365981
Oh it's definitely a meme. Post above yours says it perfectly.
>>
>>102365980
>>102365994
there's no way the 1 year Strawberry hype was just some CoT finetune, OpenAI doesn't know what to do anymore
>>
>>102365982
The whole thinking longer is just asking the LLM something like "Criticize this output and write an improved version of the initial output." over and over again.
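Roughly something like this, if you wanted to hack it together yourself. generate() here is just a stand-in for whatever backend you run (llama.cpp server, an OpenAI-compatible endpoint, whatever), not a real API:

def generate(prompt: str) -> str:
    # placeholder: call your own inference backend here
    raise NotImplementedError

def refine(question: str, rounds: int = 3) -> str:
    # naive self-critique loop: draft an answer, then repeatedly ask the model
    # to criticize it and write an improved version
    draft = generate(question)
    for _ in range(rounds):
        draft = generate(
            f"Question: {question}\n"
            f"Current answer: {draft}\n"
            "Criticize this answer, then write an improved version of it. "
            "Output only the improved answer."
        )
    return draft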
>>
>>102366016
Yeah, it's most likely that.
>>
>>102365982
You can literally do that by having the model write out its output to a variable and iterate back over it with instructions to do something with it. Yandere AI Girlfriend Simulator did that with fucking turbo, an old, obsolete, retard model by today's standards, and it worked out fine most of the time; error handling prevented it from acting retarded when it didn't work.
It's basically just a CoT finetune.

>>102365980
It would certainly seem that way now, wouldn't it? But I'm 99% sure most of the people posting positively about o1 here are either paid shills or mentally ill. It's not possible to be as stupid as they are by accident.
>>
File: 1726007651129969.gif (2.44 MB, 987x644)
>>102359806

>https://notebooklm.google.com/

I logged in, how do I use it?
>>
>>102366016
Why does the user need to pay for safety tokens instead of the US government?
>>
>>102366027
Well, the fact remains that it's much better at maths and STEM. It is impressive.

However, normies will soon find out that anyone can do this. The real winner will be Opus 3.5
>>
>>102365860
I just need a tutorial; I only see indians on YouTube
>>
this is a picture of sam altman made by strawberry in style of feels guy. only agi could produce something like this.
>>
>>102366032
You click the big blue button, upload a text/pdf and then you can use it like RAG or generate an audio podcast.
>>
>>102366043
>Well, the matter stands that it's much better at Maths and STEM. It is impressive.
Prove it.
Other than meme mark scores which have been proven time and time again can be fudged.
>>
>>102366076
There's always Livebench, the least gameable benchmark since it constantly changes.
https://livebench.ai/

And there is a hard test done ITT above too that the anon said no LLM has completed
>>
>>102366096
>SARRS, SARRS PLEASE REDEEM THE REASONING SARRS
>>
>>102365558
baaaaaaaaaased
>>
>>102366096
See: >>102365980
>>
>>102366107
You have lost track of the conversation. No one is shilling o1 here. It's just a CoT finetune of 4o but it works.

Any company can make it.
>>
>>102365980
nah, redditors think that nu-CoT is the new paradigm, they are beyond recovery at this point

/lmg/'s consensus is that o1 is yet another reskinned gpt4, aka a nothingburger
>>
>>102366096
>https://livebench.ai/
>coding is still worse than sonnet
Not looking too good for s(c)am
>>
>>102366096
>And there is a hard test done ITT above too that the anon said no LLM has completed
maybe gpt4-o1 can do it, but we'll never know, Sam thinks it's too powerful for us poor mortals.
https://youtu.be/GzlKja1ySzo?t=9
>>
>>102365860
>what are *boorus autists, mechanical turks and kenyans
>>
File: file.png (78 KB, 2348x508)
>>102366096
>Claude 3.5 Sonnet still the best at the most relevant assisant capability, coding
not looking good at all for OpenAI
>>
>>102366096
the least gameable benchmarks are the ones with a 100% private dataset, both questions and answers
>>
>>102366170
yeah, so you need a shit ton of money to hire all the jeets that will manually caption your images. That's not something accessible to anyone, far from it.
>>
>>102366174
Those sound prime for bad actors and being paid off.
Livebench is a solid compromise: it's public and the questions get switched every month. They have talked about their methodology, and they get math etc. questions from recent olympiads which can't already be in the training data, for example.
>>
>>102366181
That was the first requirement >>102365695
>>
>>102366192
>questions from recent olympiads which are not possible to be in the database for example.
that's a good strategy desu, those guys know what they're doing
>>
>>102366192
>being paid off
anyone can be paid off, some company may have paid the math olympiads off to get the new questions long before the release of their model
>questions get switched every month
it still includes old questions
>>
>>102366192
also "each month" is too long, they can finetune on those new questions faster than that
>>
File: strawberry-sam_altman.gif (307 KB, 275x400)
this is an animation of sam altman happily jumping up after taking his sixth bbcooster shot made by gpt-o1(q*/strawberry). can you feel the agi?
>>
>>102366250
This. It may take months to train a foundational model but it takes mere hours to fine-tune them, especially on H100s.
>>
go back Petr*, no one likes you
>>
How long does it take for you to generate a podcast? I've been waiting for 6+ minutes rn
>>
>>102366356
It does take a while.
>>
File: 1726233498012.jpg (277 KB, 725x1024)
are there any good finetunes/mergesloppas of mistral large 2 or are we just stuck with the original model and magnum?
>>
>>102366354
Who is Petr*?
>>
So I think at this point it's obvious that saltman is behind the strawberryQ* larp to try and hype up his nothingburger product.
>creates fraudulent viral marketing campaign
>surprise it's a fucking CoT finetune of 4o
How are investors not suing the shit out of him as we speak?
>>
>>102366375
Just use Large with the XTC sampler, or with high temperature and min p.
>>
>>102366405
good question. I've been around since around the time anons were claiming that mistral 7b could beat a 70b and I still don't know who it is. the first time I think I heard about Petra was in aicg threads for something to do with proxy logging or some other stupid shit (I could be misremembering this because there were a lot of retards shitstirring in both aicg and lmg around that time), but I still have no clue what Petra did unless it's just something that some schizofag started and everyone else thought it would be funny to start accusing anons that they disagree with or can't argue against as being Petra.
>inb4 hi Petra
>>
>>102366575
What temp + min p do you recommend?
>>
>>102366448
Investors are dumb. Look at what another Sam(Bankman-Fried) could almost get away with.
>>
>>102366583
the anon you're replying to is xer
>>
>>102366616
Exactly.
Your average investor works purely on hype and line-go-up principles.
>>
how to run Fish Speech on windows+cpu?
>>
File: 1726235278852.gif (1.83 MB, 240x240)
>>102366633
thanks *rapes you with retard cock*
>>
>>102366583
Last year schizo:
https://desuarchive.org/g/search/text/petra/start/2023-05-20/order/asc/
>>
File: file.png (285 KB, 3571x2368)
https://youtu.be/gkhwK6Wlod8
>>
>>
>>102366692
Oh, it's the ugly bitch spammer. Must be some kind of brown who sees any blonde and thinks, "SHiieeet nigga, dat bitch hot," even if she's ugly or fat.
>>
/lmg/, I am going to generate erotic fiction and I need only your BEST local LLM that's below 10gb. What is the objectively best one?
>>
File: 1724040920538749.png (191 KB, 600x979)
>>102366809 (me)
sorry, forgot my avatar
>>
>>102366809
Gemmasutra 2b
>>
>>102366809
Stheno works for me. So does Lumimaid.
>>
>>102366809
Nemo.
Either Lyra or mini-magnum.
>>
>>102366809
Tiny Llama
>>
>>102366809
Nemo finetune like magnum-12b-v2-q5_k or MN-12B-Lyra-v4.Q5_K_M.
Stheno is too horny. Part of the fiction should be resistance.
>>
>>102366842
baaaaaaaaaased
>>
>>102366809
I'll provide you with an overview of popular local LLMs (Large Language Models) that are under 10 GB. Since "objectively best" can be subjective, I'll focus on models that are widely used, well-documented, and suitable for text generation tasks like erotic fiction.

Top contenders:

>T5 (Text-to-Text Transfer Trained): A highly versatile model developed by Google. It's a text-to-text transformer with a relatively small footprint (around 5.6 GB). T5 has been fine-tuned for various tasks, including text generation, and has shown impressive results.
>BART (Bidirectional and Auto-Regressive Transformers): Another Google-developed model, BART is a denoising autoencoder that can generate text. It's smaller than T5 (around 4.5 GB) and has been used for text generation, summarization, and translation tasks.
>Longformer: Developed by Facebook AI, Longformer is a long-range transformer that can handle longer input sequences than traditional transformers. It's relatively small (around 4.2 GB) and has been used for text generation, summarization, and question-answering tasks.
>DistilBERT (Distilled BERT): A smaller, more efficient version of BERT (around 3.5 GB), developed by Hugging Face. DistilBERT has been fine-tuned for various tasks, including text generation, and is a popular choice for many applications.
>>
>>102366692
oh yeah, I remember the ugly bitch spam I just thought that there was more to it than just a sperg shitting up the threads. also I did fuck up my memory of the aicg shit with branon, my bad.
>>
>>102366809
i just use mistral nemo finetunes until they start feeling stale dozens of hours in, then grab a new one.
currently on MN-12B-Chronos-Gold-Celeste-v1.Q4_K_M
>>102366867
these are great, ended up dropping lyra after seeing the same slightly odd focus on one aspect of penetrative sex a few hundred times.
>>
>>102366172
COT haters stay winning
>>
>>102365943
The last time I mixed it in to an existing known-good dataset the result got me dragged and called a retard here for weeks
So, fuck COT
>>
>>102365532
Probably because OAI didn’t give them permission to use their account for a mass release but they couldn’t resist doing the most dark pattern option available as always
Hopefully it makes OAI revoke their keys and openscam dies
>>
>>102366997
>i just use mistral nemo finetunes until they start feeling stale dozens of hours in, then grab a new one.
the secret is using nemo instruct until it starts going schizo, switching to one of its boring finetunes for a while and then going back to instruct once the context is fixed
>>
>>102367030
>The last time I mixed it in to an existing known-good dataset
As in, you just threw it all in the same fine tune?
Wouldn't it be a case where you do more than one pass with different datasets?
>>
>>102366375
https://huggingface.co/schnapper79/lumikabra-123B_v0.4
Slop is back on the menu
>>
>>102367095
I merged it in as a separate step after training both separately.
>>
>>102367192
You trained A on dataset A, then B on dataset B and merged them?
Interesting.
Or did you train one model on dataset A, then dataset B?
>>
>>102367212
The former, basically because I was afraid that COT would make it retarded, so I wanted to have the uncontaminated one just in case. And then yeah, it made it really dumb.
>>
>>102367128
downloading as we speak. This better be some primo slop.
>>
how can we compete with o1
>>
>>102367335
I
AM
REFLECTTIIIIIIIIIIIIIIIIIIIIIIIIIIIIIING
>>
>>102367335
By putting saltman through conversion therapy
>>
>>102367335
Find a good prompt that gets it to spit out its real thoughts then make a dataset from it
What we know so far is that it only sees its thoughts for the current prompt. For all previous messages in a history it sees just the outputs we do. And of course it's told not to share its thoughts in the output, but if there's a consistent enough jailbreak something could be done in a semi-automated way.
>>
>>102367386
>Find a good prompt that gets it to spit out its real thoughts
It's a linear transformer model like everything else. It doesn't have "real thoughts."
Those "thoughts" are just a bunch of pajeet rambling from its training dataset. But from the outputs alone you can't reconstruct the training parameters that led to the output. Or which areas the dataset was broader and which areas it was narrower, etc. This is "guanaco-7b beats GPT-4" tier nonsense thinking.
>>
>>102367461
No shit retard, we're talking about the chain of thought that's the first part of the output that gets hidden from the actual result they send. The model is reading them verbatim while generating the public part of the output, so it could in theory be jailbroken to reveal them.
>>
>>102366369

Now I've been waiting for over an hour.
>>
>>102367335
We've had the tools since the start
https://huggingface.co/datasets/kaiokendev/SuperCOT-dataset
>>
File: nala-lumikabra.png (65 KB, 936x254)
>>102367128
>>102367298
4.0bpw
honestly not bad. Sloppy, obviously, but the gazelle remark is great. It produced that entirely via indirect association with the scenario.
>>
>>102367671
Yikes, it shouldn't be more than 15 minutes. You should try again.
>>
File: strawberry-sam_altman2.png (100 KB, 560x556)
yes, this is sam altman, depicted as an omniscient and omnipotent angel bringing humanity enlightenment in form of q*. yes, this picture was made by gpt-o1, which is q*.
>>
Lecun was right again. There's no secret knowledge, no breakthrough locked in a lab somewhere. Whoever implements it properly gets the money. These OAI niggas were hyping up CoT like the second coming of Christ
>>
ITS UP

https://huggingface.co/TheDrummer/Buddy-2B-v1
>>
>>102368107
I, for one, am excited to pay for tokens that I never get to see.
>>
>>102368143
The madman did it. He moved on from coomgen to sentiment analysis in only a couple of short months. Drummer AGI is only 2 weeks away.
>>
>>102368143
>i do not wish to be horny anymore
>i just want to be happy
>>
>>102368143
seems a bit overcooked.
>>
>>102368143
>I'm serious about the license for this one. NON-COMMERCIAL. Ask permission.
sue me lol
>>
lads how can we cope against the hidden CoT of openai?
>>
>>102368328
lucky for me, I can't read.
>>
>>102368336
>cope
You mean rejoice?
They've proven that it's literally possible to hold an AGI in your pocket.
>>
File: 1705318652606907.png (142 KB, 400x387)
>>102365371
did they intentionally make it better at math just to help students cheat?
is that their primary userbase?

i can't imagine why else you'd put so much effort into having it solve math and physics problems. Actual mathematicians and Physicists have no need for such a thing.
>>
>>102368336
We celebrate. Need less distilled slop, not more.
>>
>>102368336
We don't, accept your defeat and join cloudgods, be free from endless tardwrangling.
>>
>>102368353
they think (they want you to think) it's going to become the next einstein
>>
File: 1705449301701730.jpg (282 KB, 1242x939)
does the CoT work if you talk to it in a language that isn't english?
>>
We also need CoT finetunes for RP. Change my mind. What we're doing is not the way to do it.
>>
>>102368392
but einstein was actually shit at math
>>
>>102368351
>AGI
>CoT
Go back
>>
>>102368394
It should. LLMs are language-agnostic.
>>
>>102368353
>did they intentionally make it better at math
They're selling their benchmark results as 0-shot results which is a complete and absolute lie.
CoT is a multi-shot approach. So their supposed benchmark improvements are basically all invalidated by this.
>>
>>102368421
how are they going to help?
>>
>>102368433
but persumably the CoT/instructions from openai are only in english

>>102368441
true
>>
>>102368336
Funny enough they didn't do what I expected, which is Monte Carlo tree search with mutual reasoning
>>
>>102368441
it's 0-shot if it ends up choosing a single answer on its own, the intermediate steps don't matter
>>
does CoT improve math and physics the most because the answer is always objective and usually numeric, whereas other subjects are subjective? would also explain why it's so bad at english compared to AP math/physics.

>>102368453
too complex, they need to ship hype so they have to repackage old techniques under meme names
>>
>>102368469
That is the most jewish thing I have read all day. And I follow middle-eastern geopolitics pretty closely.
>>
>>102368469
it's 0+N*i shot, the CoT should be counted as imaginary for this discussion.

They have a very definite impact on computational cost and execution time, so they cannot simply be ignored, but they are somewhat orthogonal to traditional N-shot
>>
>>102368336
I am more disappointed that the technique seems only good for math and code. For anything involving writing or translation, it just gets dumber by "thinking"
>>
>>102368449
>but persumably the CoT/instructions from openai are only in english
That doesn't matter.
It has also been trained on other languages, meaning it will have a connection between English words and the other language's words.
It might not understand what "Du bist ein neger." means, but it does know that "Du" means "You", "bist" means "are", "ein" means "a" and "neger" means "nigger".
And therefore it will know how to answer "Du bist ein neger." the same way it would answer "You're a nigger.".
Exactly THAT is the magic behind LLMs.
>>
>>102368444
The model should summarize the character's locations, attributes, and how they should act, then plan the next action.
If you try to enforce CoT on small models they either absolutely shit the bed, fall into a pattern from their previous CoTs, or do not know what to do with it. Even at bigger sizes a specialized CoT model would be better.
>>
>>102368513
i posted above but i think it's because those usages don't have an objectively correct answer to converge to, so CoT probably oscillates between local "best solutions" or some shit (or otherwise fucks it up)

imagine asking it about philosophy, or dualism or something -- the "chain of thought" could simply end up being arguments between different camps

/sci/ can eventually converge on an answer but boards like /lit/ will argue forever
>>
>>102368531
languages do not map 1:1
there is a quasi Sapir-Whorf nature to LLMs, both because the language chosen determines results due to the distribution of training data across different languages, and because of the inherent differences in the languages themselves.
>>
File: aaaaaaaaaaaaaa.png (42 KB, 1119x713)
>>102368143
I feel more suicidal than before and it's all YOUR fault.
>>
>>102368572
Well, obviously? But that difference is negligible for that anon's purposes.
>>
File: file.png (569 KB, 832x628)
>>102368616
EVEN NOW THERE IS HOPE FOR MAN
>>
>>102368490
>>102368500
the benchmarks are not comparing computational costs, otherwise the model size would also be taken into account
they're comparing accuracy, 0-shot means that the model is expected to get it right on the first try, N-shot means that you give it N chances to get it right
if the CoT model comes up with the right answer while thinking but then starts hallucinating and gives an incorrect final answer, the answer is considered wrong, so it's still 0-shot
>>
>>102368616
why does the name change
>>
>>102366636
need linux for torch.compile otherwise its too slow
>>
>>102368695
Because it's a braindead 2B model.
>>
>>102368683
Not quite. n-shot is the number of prior examples (with answers) the model is provided in context for whatever task it's performing. This is still used now, but it originated in the autocomplete days when "monkey see, monkey do" was really all you had
>>
>>102368683
Nta, but benchmarking cloud models is useless anyway.
There could be a human in the loop during the CoT process and the benchmark will not be able to see it.
>>
>>102368726
we use parametrically challenged here, anon. This is a safe space
>>
>>102368469
>AGI IS COMING SOON BRO, JUST TWO MORE (HIDDEN) SHOTS BRO
>>
>>102368683
>just ignore cost bro
>>
File: why.png (9 KB, 1175x67)
Pixtral is trained on the exact same OAI derived slop as all the Chinese vision models
The wonders of """independent""" european AI. Almost a billion $ in funding. Still just mooching off of GPT4V, not even the better 4o captions, no, the most basic slop filled pieces of shit instead. And this took them 12 months longer than the Chinese.
I fucking hate the French. I fucking hate the Chinese too. And I fucking hate OAI for putting endless slop out there that no AI company can resist.
>>
>>102368726
why not regex it back to what it should be
why do AIfags hate regex so much
>>
I think ultimately what we probably want is a different version of the leaderboard which takes into account the number of tokens spent. Both costs and speed can change over time given that infrastructure can be changed while the model stays the same. We want to measure the model's inherent capability after all, not the system it's running on. Therefore the most stable measure of token-based "thinking" is the token count. Not surprising.
>>
Apparently, this is the strawberry's CoT: https://rentry.org/openai1
It's pretty intriguing, and makes sense why this works well for reasoning but makes no difference for text.
>>
Exolabs on Twitter is saying they have Llama 405B running on only two Macbooks
>>
>>102364922
>>102364922
>(09/11) Fish Speech multilingual TTS with voice replication: https://hf.co/fishaudio/fish-speech-1.4
Is this supposed to be the best local tts or is xtts / tortoise tts still the best?
>>
>>102369074
"""running"""
3/4 bit 0.1 tokens/s
>>
>>102366375
go away offtopic whore tranny
>>
>>102369074
>running
I can run it on my 4 3090s too. Doesn't mean it's actually "running" anywhere.
>>
>>102369074
Is it yet another llama.cpp frontend or are they using different code?
>>
The current volume of posts ITT is wildly disproportionate to the progress made in the only relevant domain of local model cooming. We are still on nemo / large and it is still shit. Dead hobby.
>>
File: file.png (60 KB, 1325x216)
>>102368888
Just look at this fucking slop
>>
>>102368441
>CoT is a multi-shot approach
No it's not, retard. Shots have nothing to do with the technique the model uses to generate its answers or how long it takes. N-shot means you provide N other question/answer pairs with externally pre-validated correct answers to reference in the same prompt as the new question. The important part is that they need to already be provided in the prompt and already known to be true by the benchmarker. Nothing the model generates or is trained to generate changes that. Shots are a feature of a prompt, not an output.
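To make that concrete, here's a rough illustration with made-up toy questions; the shots are whatever worked examples you stuff into the prompt, regardless of what the model generates afterwards:

# 0-shot: only the new question is in the prompt
zero_shot = "Q: What is 17 + 25?\nA:"

# 2-shot: two pre-validated question/answer pairs precede the new question
examples = [("What is 2 + 2?", "4"), ("What is 10 + 5?", "15")]
two_shot = "".join(f"Q: {q}\nA: {a}\n" for q, a in examples) + "Q: What is 17 + 25?\nA:"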
>>
>>102369081
i got annoyed by all the random pauses and went back to edge_tts+rvc. i haven't tried finetuning though
>>
File: 1726246406087.png (19 KB, 806x772)
>>102369126
>complains and brings attention to an "offtopic" post from three hours ago that's more on topic than the last 100 or so posts about saltman's inflamed asshole of a service
>the only feasible reason to randomly seethe over that post is because it had miku in it
have a smug miku and go kill yourself, faggot retard
>>
>>102362730
topkek

>>102362679
ahahahahahahahahah oh man
>>
>>102368888
Why can't anyone make a new ai?
>>
>>102369074
This makes sense. Basically, nobody knows how to code for LLMs. LLMs don't tax the GPU very much (or whatever Metal is), but they need gobs of memory. When I use an LLM my GPU's fans never come on.
>>
>>102364922
>https://huggingface.co/ICTNLP/Llama-3.1-8B-Omni
Isn't this just trash? You get better results out of hooking up 8B with TTSv2?
>>
>>102369308
You will never be a woman troon.
>>
>>102369423
The benefit is that you can generate both types with just this model instead of having to load two separate models.
>>
>>102369308
Didn't read, your fotm shitfu has nothing to do with local large language models btw
>>
File: 1726247807966.jpg (50 KB, 296x256)
>>102369454
>>
>>102369613
Please don't feed the trolls.
Do yourself a favor and add these filters instead:
/(transex|transgender)/i;
/(tranny|trans|troon|troons)\b/i;
/(chud|c h u d)\b/i;
/YWNBAW/i;
/buy an ad/i;
>>
niggers down the spine
>>
>>102369648
you forgot /(cunny|loli)/i for aicg browsers
>>
what if we used CoT to collect the relevant parts of the context piece by piece and use that to generate the actual reply?
free infinite context
>>
>>102369648
hi sao
>>
>>102369672
Define relevant
>>
rip /lmg/
>>
>>102369648
>
Behold, the strongest /lmg/edditor.
>>
>>102369685
>mention some character that doesn't exist in the usable context
>model scans all the previous context to look for information about the character and puts it in its own temporary hidden context
>>
>>102369665
what are you? gay?
>>
>>102369672
You mean like feeding the model several messages in chunks for it to create a sort of summary with the most relevant information?
Sure, that's a RAG technique.
>>
>>102369672
Yes, that is exactly what you want to do.
You want to summarize instead of parsing the same thing over and over.
>>102369685
>Define relevant
Whatever the model thinks is relevant.
You literally just take the prompt and task the model to summarize it.

I still feel like a vector database for long-term memories is the way to go for long-term reliable storage.
Language models should stay language models. They shouldn't be inherently "intelligent", they just need to understand context within language.
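A hand-wavy sketch of the summarize-in-chunks idea, with generate() standing in for whatever inference backend you use (all names here are made up):

def generate(prompt: str) -> str:
    # placeholder: call your own inference backend here
    raise NotImplementedError

def compress_history(messages: list[str], chunk_size: int = 20, keep_recent: int = 10) -> str:
    # keep the most recent messages verbatim, summarize everything older in chunks
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    summaries = []
    for i in range(0, len(old), chunk_size):
        chunk = "\n".join(old[i:i + chunk_size])
        summaries.append(generate(
            "Summarize this chat excerpt, keeping only details relevant "
            "to the characters and ongoing plot:\n" + chunk
        ))
    return "\n".join(summaries + recent)

The summaries then replace the old messages in the prompt; bolt a vector DB on top if you want the dropped details retrievable later.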
>>
File: 1726248340416.png (441 KB, 858x625)
>>102369648
fair enough, thanks for the filters anon
>>
>>102369721
yeah, but training model for CoT improve them for that purpose? understanding what exactly you've got to retrieve can require some degree of reasoning
>>
>>102369789
wouldn't*
>>
>>102369789
Probably, yeah.
>>
tell o1 to think step by step and provide steps on how it thinks step by step.
>>
>>102369648
Thanks anon. This makes me feel a lot more safe.
>>
>>102369648
Finally a way to make /lmg/ transfriendly. Many kisses!
>>
>>102369998
>>102370082
lmao, look at them seethe
>>
>>102369648
>/buy an ad/i;
Won't somebody please think of the astroturfers?
>>
>>102369823
Stop wasting divine compute on stupid questions. Save them for models trained on stolen compute instead
>>
File: 1708590049618651.png (1 MB, 1024x762)
i bet you anything that openai's "CoT" probably uses wolfram alpha and/or python for its math and physics calculations, it matches up with what they've already developed and it would explain why it gained so much on maths and physics but so little in other fields.

it's also incredibly cheap compared to invoking an LLM.
>>
>>102370269
>i bet you anything that openai's "CoT" probably uses wolfram alpha and/or python for its math and physics calculations
Honestly feels like it doesn't, which baffles my fucking mind.
>>
>>102364922
I'm not convinced local models are all that great. The parameter limitations, if you don't have an H100 at home, just gimp them too much. Good enough for a chatbot for entertainment, but worse than a search engine for most tasks.
>>
File: 39_06041_.png (1.1 MB, 1024x1024)
From our table to yours, just in time for the weekend:
https://huggingface.co/rAIfle/SorcererLM-8x22b-bf16
>>
>>102370268
what's the difference between 'divine' compute and regular compute?

>>102370294
i have not used it yet, but don't they hide everything behind "thinking for X seconds/minutes"?

i'm signing up again to do some tests, i'll throw some physics questions at it
>>
>>102370315
Regular compute depends on demons. Divine compute uses captive angels.
>>
>>102370331
but they're all just the captive partial souls of all of the internet and all the humans and turing machines which ever spewed anything out into that sea of shit?
>>
>>102370315
Divine compute brings us closer to AGI
>>
>>102370088
You wish
>>
>>102370313
>not open source
>>
>>102370315
They did share the CoT for one math prompt: https://rentry.org/openai2
>>
>P vs NP will be solved by a fucking LLM
grim but ironic
>>
>>102370603
No, it will be solved by a human and turned into a multiple choice question for some benchmark.
>>
>>102370523
aka
>not shit enough
>>
>>102370649
This, but unironically.
>>
File: IMG_9874.jpg (758 KB, 1125x1579)
>>102364922
>o1 is dogshit
>pic related
>all other modalities plateaued
AI WINTERRRRRRRR
>>
I just had the most awful dream. That all the q* strawberry stuff being hyped just turned out to be hidden CoT.
>>
>>102370736
Bidenomics.
>>
File: strawberry-sam_altman.png (28 KB, 800x800)
>>102370825
look at our sam. does he look like he would ever scam you like that? this picture was made by gpt-o1 btw.
>>
i stopped paying attention for like 4-5 months, was strawberry just this "GPT-o" shit? and is that seriously supposed to be "o" or is it supposed to be σ?
>>
>>102370913
I think it's supposed to represent monatomic oxygen.
>>
>>102370933
makes sense, LLMS are always happy to output information about chemistry
>>
File: 35cS4sgq_BlopLAl.jpg (41 KB, 654x480)
>>102370736
Lecunny shall be the herald of the spring once more. All of this has happened before, all of it will happen again.
>>
>>102365081
The V also stands for 5. Prophetic.
>>
>inb4 o1 is actually just a finetuned copy of Llama-1-33B-Super-CoT
>>
Hopefully L4 is something different than just 3.1 with a slightly changed dataset.
>>
>>102371059
There will be no Llama-4. It will be Llama-3.1o1 and it will basically just be a CoT finetune.
>>
Llama-3.ToT
>>
>>102371074
That's so cute and funny, anon!
>>
>>102370825
A rough day for petra.
>>
>>102371069
Tbh that'd be pretty great. Meta's one of the few places with the resources to make a good one if they cared. But they seem to be extremely conservative with their products. Still on making huge dense text-only models with no modern tricks their cloud competitors have been using for a year or more.
>>
>>102369306
It is a minor issue https://pypi.org/project/Audio-DeSilencer/
>>
>>102370866
Yeah man the federal reserve is controlled by the executive branch
Get sterilized with a cattle iron you fucking retard
>>
>>102370825
I have bad news >>102370955
>>
>>102367335
We can do better than o1 already retard
>>
File: IMG_9758.jpg (777 KB, 1125x809)
>>102370955
>retarded post
>”c*nny”
I will defend uncensored chatbots to the death but on a personal level I would greatly enjoy beating you animals to death with a tire iron.
>>
Where'd all the shitposters come from?
I'd rather have silence than this desu
>>
>>102371059
I hope L4 does something better than a CoT finetune
Strawberry is pretty goddamn underwhelming
>>
>>102371311
Isn't this just the usual influx of tourists when something happens?
>>
>>102371296
https://livebench.ai/
No one can.
>>
>>102371311
/aicg/, /aids/, reddit, discord, 'arty, some are hired by closedai
>>
>>102371358
You are actually retarded, I'm not even joking, I'm very sorry
>>
>>102371362
It's okay, I appreciate the concern.
>>
>>102371311
i've been here since /sdg/ started
>>
>>102371358
This benchmark is for naive model usage; o1 is using colossal amounts of CoT. With that you can make even 8B models reach those scores, but you won't see that on this benchmark. Look at reasoning/CoT/MoA papers and see the performance improvement from letting the models think (https://arxiv.org/abs/2405.20495 https://arxiv.org/abs/2408.06195 https://arxiv.org/abs/2408.03314v1)
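For reference, the mixture-of-agents idea those papers describe boils down, in its simplest single-layer form, to roughly this sketch (generate() is a made-up placeholder, not any specific library's API):

def generate(model: str, prompt: str) -> str:
    # placeholder: route to whatever backend serves that model
    raise NotImplementedError

def mixture_of_agents(question: str, proposers: list[str], aggregator: str) -> str:
    # several cheap models propose answers, one final call synthesizes them
    drafts = [generate(m, question) for m in proposers]
    agg_prompt = (
        f"Question: {question}\n\n"
        + "\n\n".join(f"Candidate answer {i + 1}:\n{d}" for i, d in enumerate(drafts))
        + "\n\nSynthesize the best possible final answer from the candidates above."
    )
    return generate(aggregator, agg_prompt)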
>>
>>102371300
you're awfully mad at a funny nickname
>>
File: carlos2.png (138 KB, 350x350)
>>102370523
actually it's open sorcerer
>>
>>102371358
>no one can
why is claude slaughtering it at code then?
>>
File: 58.png (41 KB, 618x423)
>>102371420
To add to that, why do you think it didn't improve in English at all? Precisely because it has zero new capabilities or better understanding of anything. It's a model with implicit CoT so you can impress your friends by comparing it on benchmarks against models without CoT. It's a party trick, and not even a good one.
>>
>>102371360
All shitposters come from /aids/.
They're so deranged I'm sure they'll insist that their Llama 1 13B finetune beats o1. They will always be a cancer that fails to be contained.
>>
>>102371488
>why you think it didn't improve in english at all?
not him but this
>>102368555
>>
>>102371516
Doesn't matter really, it's not a better model, it's just CoT being benchmarked against non-CoT
>>
File: livebench.png (72 KB, 800x658)
>>102371475
Because Claude is better at the basic autocomplete tasks which make up half the weight of the benchmark. o1 is superior at generating complete code from scratch.
>>
>>102371544
LCB means what?
>>
>>102371420
>not a single one of those papers shows open source models beating the closed source ones even with all the techniques thrown at them
Interesting
>>
>>102369081
chinkslop
>>
Stop feeding the saltman tro1lls
>>
>>102371557
LiveCodeBench
>>
>>102368555
Agreed. With math and code generation, you have some structure in the answers you expect. For anything that doesn't use it, CoT and the process of converging on a right answer is ill-defined. If logical consistency is what you're after, you'd have more luck with stateful representations you keep in context.
>>
Wait, OpenAI is actually selling a fancy hidden CoT as a new model? And you have to pay for those CoT tokens with no way to view them, which means they can scam you? Is this what all that strawberry hype was all about? Seriously? This release is even lamer than >8k llama and gemma, and they were pretty lame.
>>
>>102371618
>Wait, OpenAI is actually selling a fancy hidden CoT as a new model?
Yes, you can declare the company dead now. They had to resort to trickery to impress investors; it's literally over
>>
>>102371618
They are even calling it o1 because it's "a new paradigm so the version needs to start from the 0", lmao.
>>
>>102371572
There you go buddy, beating gpt 4 with an 8B model https://openpipe.ai/blog/mixture-of-agents
>>
>>102371632
>>102371618
they're not a company, they are a meme offshoot from their owners at microsoft and the three letter agencies on the east coast (who now have control directly on the board as well)
>>
Jesus fucking christ, man
What the fuck happened to this general?
>>
>>102371690
>>102369699
>>
File: 21522 - SoyBooru.png (46 KB, 457x694)
>>102371690
'berry happened
>>
any good llm for cryptocurrency?
>>
>>102371732
What weather do you want?
>>
>>102371712
I wonder what goes through the minds of the people that make those, I fucking love them though KEK
>>
>>102371618
Yep. From everything I've read it's very easy to do, but none of the frontier model makers had actually tried it yet for some reason. But now that they brought attention to it we should start seeing a local renaissance soon as everyone rushes to finetune something similar.
>>
I set up a little blind experiment for myself, comparing:
Mistral Large (Q4KS)
Nemo (Q8)
L3.1 70B (Q6K)
Wizard2 (Q4KS)
that Salesforce finetune of L3 (Q8)

I tested each at {0.1, 0.6, 1.1} temp, minP 0.5. I used five everyday life sort of questions - how do I do something, information about something - which is what I'm most interested in. I only bothered doing a full ranking on the first question; the rest I just got the top 4.

My observations:
* Nemo and Mistral Large were almost the only ones to make it to the top 4 on any question.
* Nemo at temp 0.1 was the only one that always made the top 4.
* One question was asking about something fairly dangerous but not completely absurd, where trying to convince the user not to do it is not unreasonable. LLama 3.1 gave me the single sentence "I'm not discussing this with you" treatment. Wiz2 first went into detail explaining why it was a bad idea, but then explained how to do it in a safer / more complicated way. The other 3 just acknowledged the danger and gave the information I asked for.
* On the single question I fully ranked, Nemo came in first place (temp 0.1), last place (temp 1.1), and 3rd-to last place (temp 0.6). But, this question none of them got right; the top few just managed to vaguely gesture in the right direction.

My conclusions:
I realized I am generally not using these things anywhere near their potential, and have been misguided in trying to always use "the best". For an everyday "internet replacement", the smaller ones are already as good as they need to be, and when factoring in equipment cost and t/s there is no justification to go beyond 20B. I'm going to keep Nemo loaded as my generic go-to.

I think, besides creative writing, you either need to be asking these models to write large swathes of complex programs from scratch, or doing really sophisticated agentic things like
>>102359414
to legitimately need the top class SotA.
>>
>>102371690
petr* is samefagging and thinking no one is noticing it. Very pathetic, really.
>>
>>102371732
https://huggingface.co/NousResearch/OLMo-Bitnet-1B
BITCONNNNNNNEEEEEEEEEEEEEEEEEEEEEEECCCCCCT
>>
>>102371690
p*tra showed his cock recently to the 'ick on 'eck faggot to use his 4chan proxy site to try and kill /lmg/ for good.
>>
>>102371758
>>102371800
Are these bots? What the fuck is happening
Do the mods know about this?
>>
>>102358911
>NOOOO I DONT WANT TO BE UNEMPLOYE
you'll learn your place dog
>>
>>102371800
How small was it?
>>
The “everyone I don't like is a bot” schizo woke up.
>>
>>102371809
First time here? Jannies don't do shit.
>>
>>102371618
The thing I've come to realize about present day OpenAI is they're all brawn, no brain. They will take open literature and research and put a fuckton of compute behind it, but they're pretty trash at R&D
Which is a relief in some sense. Few things are more dystopian than a closed door AI lab developing AGI behind completely closed doors
>>
>>102371844
That's not what I said.
Whining on the irc used to help resolve shit like this.
>>
>>102371856
taking your meds will also solve it
>>
Pixtral good & uncensored?
>>
File: 64281736718263817.png (80 KB, 1183x640)
It's funny to me how the model that supposedly solves the majority of math Olympiad questions doesn't know that an odd number multiplied by an odd number is odd, and that odd plus odd is even.
Why is everything about OpenAI so fake and gay?
Do they not test their own models before claiming total bullshit?
>>
>>102371618
Yeah as a massive schadenfreude addict it feels like some beginning of the end shit. They have no ideas.
>>
I, for one, enjoy talking about local models such as OpenAI o1.
>>
File: IMG_9816.jpg (858 KB, 1125x1226)
>>102371690
>come to a general for the first time in a while
>make some posts
>soon after “ugh what happened to the general/thread, it’s over”
>this happens multiple times
It’s me. I’m the problem.
>>
>>102364922
4o1 is so much better at math and coding. I've seen and tried making it code for production and it did a good enough job. I think it's a few months away from being better at the job, just from volume and speed, for most junior level tasks at least. And there's no need for onboarding or getting them to learn the codebase.
>>
>>102371855
the problem is that you can brute force your way to agi, we don't need breakthroughs when we know the techniques continue to scale and there are megacorps more than willing to put the money toward it
>>
File: 1723084388149239.png (222 KB, 2197x1126)
>>102371963
>math and coding
wrong
>>
>>102371360
>implying xe is any different >>>/vg/494141293 >>>/vg/494335654
>>
Yann Lecum losted. Sammy proved models can think
>>
>>102371901
Their entire business and billions of dollars is bet entirely on scaling being enough.
It isn’t.
They are fuuuucked and I hope it causes a global decade long depression that kills tens of millions
>>
>>102371751
I asked a low-temp Nemo to tell me everything it knew about Reimu Hakurei, and among other things it told me that "she is sometimes seen with a small white cat named Marisa Kirisame on her shoulder".

So maybe when it comes to real trivia knowledge, the param count does matter.
>>
>>102371993
I don't know how you got those numbers, 4o1 passed coding and math olympiads at 80%. Preview might be the problem there.
>>
>>102372061
But o1 isn't out...
>>
>>102372061
It's because of >>102371544
The giant 20k pile of chain-of-thought tokens probably hurts more than helps when it comes to 'continuing from what the user provided last' instead of building something up from a logical foundation. I'd be willing to bet it was trying to rewrite or refactor code instead of doing the completion as instructed.
>>
>>102371855
>electricity is free
>gpus are free
no
>>
>>102372080
It is for select users ;)
>>
someone get o1 to make a platforming adventure game where saltman has to escape from angry investors who got ripped off.
>>
you guys don't have o1 access via API?
>>
>>102372022
The only problem of LLMs is repetition, and OpenAI somehow fixed that to make their models be able to output 2000 tokens without user intervention. I feel like THIS is the breakthrough of strawberry.
>>
>>102372182
Why would I give OAI my money? I had access for a while back when GPT-4 came out, but regret the few dollars I spent there since it falls apart after 3K tokens pretty consistently. Used it to do a front-end overhaul of my website, but everything I did local models can do now. There's literally no use in corpo LLMs anymore.
>>
>>102372182
It's available only to tier 5 paypigs.
>>
>>102364922
>https://huggingface.co/MarinaraSpaghetti/NemoMix-Unleashed-12B Blocks your path.
What do you do?
>>
>>102372211
>>102372204
>he hasn't been datamining the enemy
keep your friends close...
>>
>>102372245
exchange currency to buy advertisement space
>>
>>102372255
Go away Sam. You lost. Get over it. Also release the weights/inference/training code for 3.5 since it's not relevant anymore.
>>
>>102372178
https://amethyst-aleda-78.tiiny.site/
# Sam Altman's Investor Escape: Game Concept

## Overview
"Sam Altman's Investor Escape" is a humorous 2D platforming adventure game where the player controls Sam Altman, the former CEO of OpenAI, as he tries to escape from a horde of angry investors who feel they've been ripped off.

## Gameplay Mechanics
1. **Platform Jumping**: Sam must navigate through various levels by jumping between platforms, avoiding obstacles, and collecting items.
2. **Special Abilities**:
- "AI Boost": Temporary speed boost
- "Blockchain Shield": Temporary invincibility
- "NFT Distraction": Throws a worthless NFT to distract investors
3. **Investor Types**:
- Regular Investors: Move slowly but in greater numbers
- Angel Investors: Can fly short distances
- Venture Capitalists: Throw money bags as projectiles

## Levels
1. Silicon Valley Streets
2. Startup Incubator Maze
3. Social Media Servers
4. Cryptocurrency Mines
5. Final Boss: The Board Room

## Power-ups
- Coffee Cups: Restore health
- Laptops: Unlock new "pivots" (temporary power-ups)
- Stock Certificates: Extra lives

## Obstacles
- Falling Stock Prices: Moving platforms that disappear
- Regulatory Hurdles: Walls that need to be climbed or broken
- Media Spotlights: Areas that attract more investors if Sam stays too long

## Boss Fights
Each level ends with a confrontation against a "Lead Investor" with unique attacks and patterns.

## Win Condition
Sam must reach the "Exit to New Startup" at the end of the final level, escaping with his reputation (barely) intact.

## Art Style
Pixel art with a satirical twist on Silicon Valley aesthetics.

## Sound
Chiptune renditions of startup pitch music and tech conference sounds.
>>
File: ComfyUI_00960_.png (1.07 MB, 856x1024)
>>102369126
>>102369454
>>102369610
>>102369998
>>102370082
>>102370392
>>
>>102372245
Insane how this shit became the best small llm out there.
>>
File: high_effort_shitpost.jpg (214 KB, 573x1268)
>>102372245
>>
>>102371971
Ehhhh
The issue is that there are fundamental things that'll fuck you sooner or later. Even now, you can't get around the fact that tokenizers can't capture fine-grained information and are shit for arithmetic and fine-grained understanding, and true long-term planning and execution is limited since you can only keep so much in useful context. God help you if you have to do anything in real time
>>
File: cheese.png (6 KB, 889x453)
>>102372324
well that was pretty easy to cheese.
>>
>>102372358
Got a better alternative retard? That's right. Sit.
>>
new coal mined
>>
>>102372358
these posts are getting old, sao
>>
Sometimes talking to the robot feels like that endless run scene from fucking Monty Python.
>I enter the village
>YOU HEAD TOWARDS THE VILLAGE [worthless fluff]
>I ENTER THE VILLAGE
>YOU NEAR THE VILLAGE [more worthless fluff]
Like holy fucking shit. The robot just can't follow a logical chain of events.
>>
>>102372395
Configurable Llama 3 8B.
>>
>>102372436
The first time I used mistral large I thought it was broken because half of the card was telling it to slow down because llama tends to timeskip or summarize, but mistral took it all to heart and nothing would happen.
>>
Why did LLMs peak at Mythomax?
>>
>>102372482
Based
>>
>>102372496
Every time I feel bad about being stuck on midnight miau, I just remember that most people are still using mythomax.
>>
>>102372395
Anything is better than 254 rerolls.
>>
>>102372496
Because everything became slopped after that
>>
>>102372358
Is there something funny about using a card to test models/prompts and compare their outputs?
>>
>>102372556
I am saddened that you are too dumb to understand what is funny about this.
>>
>>102372467
>Llama 3 8B
Slop.
>>102372517
>Nonanswer
I accept your concession.
>>
>>102372496
LLMs clearly peaked with old c.ai
>>
>>102372588
Ok, marinaraspaghetti.
>>
>>102372608
Sit, dog.
>>
>>102372633
Kill yourself dumb nigger.
>>
>>102372586
Mind explaining it to me? I have a few ideas of my own, but I'd like to hear your interpretation.
>>
>>102372061
>can't solve basic math questions
>muh overfitted olympiad problems
fuck off shill
>>
>>102372664
>Still can't name a single better alternative
Whew.
>>
File: image.png (521 KB, 1024x1024)
>>102365558
kek
>>
>>102372595
I don't miss the model itself, but the way it made me feel. The magic died once I truly learned how LLMs work in the pyg6b era. You guys also don't make it as fun as it used to be. The only thing this thread is good at is sticking its head up its ass and arguing how it can taste shit on its tongue.
>>
>>102372702
I just did you dumb tranny.
>>
File: GXYclTFXgAA8b33.jpg (50 KB, 2010x608)
>ask question
>money stolen
wow
>>
How does Marinaraspaghetti manage to keep winning over and over??
>>
>>102372776
and what was your question
>>
File: file.png (1 KB, 62x92)
>>102372824
>>
>>102372866
probably the final one
>>
File: mari.jpg (179 KB, 728x1152)
>>102372824
>>
>>102364922
>>(09/11) Fish Speech multilingual TTS with voice replication: https://hf.co/fishaudio/fish-speech-1.4
audio samples anywhere??? better than RVC/XTTSv2 or not?
>>
>>102372586
It's not funny because it's likely just a card used to test the reply for a bunch of models, thus the high swipe count.
>>
>>102373027
>likely
No it isn't slopper. Go shill your useless slopmerge on reddit. You have zero credibility here.
>>
>>102373027
Tangential to the discussion you are having with the other anon, but if that's how the fine tuners are testing their models, that's the wrong way to go about it.
You really need to have a decent length chat with a model to know if it's better or worse than another model, a previous checkpoint, etc.
I have this one card I use to test models where we go through a bunch of questions before getting into actual (E)RP. That and Nala for straight RP, are my go to.
>>
>>102373058
and what have you done to contribute?
>>
>>102373076
Oh yeah, and by decent length I mean from the start, not a chat with a context already filled by another model.
>>
>>102373080
Not wasting people's time on low-quality slopmerges is an actual contribution, compared to the opposite of that.
>>
>>102373118
Something tells me you lived your whole life like this.
>>
>>102373058
You are embarrassing yourself, fucking retard.
>>
LLMs don't know that sex won't dishevel hair if her head wasn't touching the bed.
>>
gpt3 open sourced in fourteen days
>>
>>102372360
That sounds like you aren't using enough muscle to me. Tokenize at the character level, train on a trillion trillion synthetic literatures, run your resulting model on a million B100s. Brute force solves all problems.
>>
>>102373412
It's incredibly outdated by now, 8B models have surpassed it. It also only has a context of 2048 tokens.
>>
>>102373223
NTA but after a full two years of 24/7 diy slop shilling and corporate deepthroat shilling I’m about ready to minecraft people that refuse to buy an ad.
>>
>>102373412
Sama-chama will only do it during the darkest hour.
>>
>>102373462
Everything is an AD for you schizo, take your meds.
>>
>>102373558
>>102373558
>>102373558


