/g/ - Technology






File: yann-lecun.jpg (293 KB, 940x1410)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102356839 & >>102348952

►News
>(09/12) DataGemma with DataCommons retrieval: https://blog.google/technology/ai/google-datagemma-ai-llm/
>(09/12) LLaMA-Omni: Multimodal LLM with seamless speech interaction: https://huggingface.co/ICTNLP/Llama-3.1-8B-Omni
>(09/11) Fish Speech multilingual TTS with voice replication: https://hf.co/fishaudio/fish-speech-1.4
>(09/11) Pixtral: 12B with image input vision adapter: https://xcancel.com/mistralai/status/1833758285167722836
>(09/11) Solar Pro Preview, Phi-3-medium upscaled to 22B: https://hf.co/upstage/solar-pro-preview-instruct

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: lecun_heart.png (124 KB, 504x462)
►Recent Highlights from the Previous Thread: >>102356839

--o1-mini model excels in generation but struggles with completion tasks, sparking discussion on benchmarking methods and model capabilities: >>102361744 >>102362578 >>102362651 >>102362675 >>102361852 >>102362003 >>102362086 >>102362876 >>102362897
--Anon solves code-breaking puzzle that o1 failed to crack: >>102357917 >>102358165 >>102358342 >>102358238 >>102358243 >>102359615 >>102359743
--OpenAI's decision to hide chains of thought from users, with mention of monitoring for manipulation: >>102361616 >>102361628 >>102361644 >>102361735
--O1 model solves CTF challenge by exploiting Docker API misconfiguration: >>102359414 >>102359504
--M series Macs for running large local models: >>102360064 >>102360374 >>102360467 >>102360519 >>102360612 >>102360629 >>102360761
--LLMs can reason, but lack general intelligence and autonomous learning: >>102361502
--Extracting CoT prompts and the challenges posed by OpenAI's guidelines: >>102361025 >>102361046 >>102361131 >>102361217 >>102361170 >>102361205 >>102361147
--Backend CoT process and its impact on writing fiction prose: >>102361172 >>102361226 >>102361300
--Anons discuss OAI's unverifiable and potentially exploitative token charging system: >>102361649 >>102361664 >>102361680 >>102361688
--Anons discuss GPT-4-O1 access, benchmarks, and implications: >>102358770 >>102358792 >>102358811 >>102359089 >>102359114 >>102359280 >>102359426 >>102358911 >>102358957 >>102359055 >>102359661 >>102358814
--Anon is impressed by Google NotebookLM's podcast generation capabilities: >>102359806 >>102359843 >>102360015 >>102360252 >>102363640
--Anon complains about being billed for invisible reasoning tokens: >>102360189 >>102360210 >>102360220 >>102360246 >>102360251 >>102360286 >>102361570
--Livebench results are out: >>102363403 >>102363472 >>102363556 >>102363610
--Miku (free space): >>102356889 >>102357160

►Recent Highlight Posts from the Previous Thread: >>102356847
>>
>>102364945
I'm using llms at work for coding. I'm not crazy enough to use local for that though, sonnet 3.5
Besides coding and creative writing llms are still useless.
Imagine being so assmad because I enjoy my loli bestiality RP at the click of a button lol
It's like somebody draws titties with the new MS95 paint tool and you put on the nerd glasses
>uhm all that technology and all you use it for is this icky stuff. cant even.
get real man.
>>
>>102365023
True
>>
>>102365023
>>102365054
nta but i think you missed the point
>>
All this strawberry and qstar stuff and llama4 will be another pure transformer llmslop pumped full of synthetic data. Lecun is a hack
>b-but he doesn't work on llms
I don't give a fuck. What is he even working on, then, if the only thing Meta keeps pumping out is LLM after LLM?
>>
>>102365065
Llama-V-JEPA trust the plan
>>
>>102365065
He is busy sperging out on elon and brigading for every vaxxie's god fauci ouchie.
>>
>>102365023
nta but you have irredeemably shit taste
>>
>>102365023
You posted your smut like it was peak writing. How you can not cringe when you read that shit is beyond me. Besides, that had nothing to do with CoT, and your complex format usually collapses very fast as it reinforces the repetition in every message.
>>
oh god the morality police is here
>>
>>102365023
You are faggot, simple as.
>>
>>102365093
>>102365104
>How you can not cringe when you read that shit is beyond me.
It's great output for a 12B. It actually follows the format and is creative enough.
Kinda insane I have to justify RP in a /lmg/ thread. I hope you are all the same guy.
>>
>>102365133
Go easy on him, he has a failing company to lead.
>>
Anyone with o1 access managed to try this? >>102363366
>>
>>102364922
Simulating cats with Yann Lecun
>>
File: IMG_2220.png (322 KB, 1290x2796)
>>102365256
It got it after thinking for 52 seconds.
>>
>>102365371
that's pretty impressive not gonna lie
gpt-4o typically just fucked up the velocity calculation or just gave me the formula for circumference. If it got to the integral part, it usually gave up and said it's a non-typical integral
>>
oh god the physics majors are here
>>
>>102365371
That's pretty fucking cool, but it would be even cooler if it didn't need to <think> for a fucking minute.
>>
>>102363366
What is that supposed to test?
>>
>>102365494
it's a multi-step process. Until now, every model failed to solve it because they rush straight for the answer.
>>
Why doesn't OpenRouter's o1 work anymore?
>>
File: GXUEQ-7b0AAM1wj.jpg (86 KB, 1170x1374)
am i using o1 right?
>>
>>102365485
1 minute is quick enough for this kind of question, tf are you complaining about?
>>
>>102365565
>team fortress are you complaining about
for a language model, sure, but for agi 2 more megawatts that's pretty pathetic
>>
>>102365485
The thinking part is probably something that will be needed in the future and just not be visible, or only visible with a debug flag or something.
We need virtually unlimited context and much higher speed, much cheaper.
Waiting a minute is no fun, I agree. Not sure I will use o1 even if it becomes cheaper.
Imagine you need something done quick, only for the LLM to trip up like usual and need another request. o1 does make mistakes. Sometimes it doesn't even get the stupid ass "strawberry" test correct.
Doubly painful for the wasted tokens.
>>
How do I make my own diffusion model? (No finetuning) I have the images.
>>
>>102365639
Got a few millions $?
>>
>>102365695
Yes.
>>
>>102365639
>I have the images.
did you caption them all? kek
>>
Based thread.
>>
>>102365458
Yes, I've been here since the LLaMA 1 leak.
>>
>>102365707
Automatically yeah
>>
>>102365850
>automatic slop
save your money
>>
>>102365854
who's pretraining their model with manual captions? you need hundreds of millions of pictures to pretrain a model
>>
>>102364922
Yann LeCum
>>
>>102365894
Yum! Le Cum
>>
So now that we know that the Reflection guy was actually correct, when is someone going to finetune Mistral Large similarly to what he did, and do it correctly this time?
>>
When was the last time somebody used the SuperCOT dataset?
>>
>>102365943
modern models have implicit cot for hard queries.
>>
>>102365932
Never.
>>
So let me get this straight....
OAI just did a CoT finetune of 4o and they're acting like they reinvented the fucking wheel?
How are investors this stupid?
>>
>>102365965
CoT finetune + new tokenizer + specifically trained for huge CoTs and talking to itself, remember those "System A, System B" things from last year?

The CoTs are 600 tokens minimum even for "hi". And for real questions 2000-10000
>>
File: 1726231311448.jpg (423 KB, 1080x1040)
This redditor is smarter than the average /lmg/ anon
>>
>>102365977
So basically a CoT finetune.
>>
>>102365965
If I got it correctly, it's something more involved than just that. It's CoT + having the model iterate over its own output in some form (the whole think-longer part), which could be just looping its hidden layers N times (something I've seen lmg discuss over a year ago, I'm pretty sure), or literally feeding its output as input to itself a couple of times
>>
>>102365981
Oh it's definitely a meme. Post above yours says it perfectly.
>>
>>102365980
>>102365994
there's no way the 1 year Strawberry hype was just some CoT finetune, OpenAI doesn't know what to do anymore
>>
>>102365982
The whole thinking longer is just asking the LLM something like "Criticize this output and write an improved version of the initial output." over and over again.
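Roughly something like this, if you wanted to hack it together yourself. generate() here is just a stand-in for whatever backend you run (llama.cpp server, an OpenAI-compatible endpoint, whatever), not a real API:

def generate(prompt: str) -> str:
    # placeholder: call your own inference backend here
    raise NotImplementedError

def refine(question: str, rounds: int = 3) -> str:
    # naive self-critique loop: draft an answer, then repeatedly ask the model
    # to criticize it and write an improved version
    draft = generate(question)
    for _ in range(rounds):
        draft = generate(
            f"Question: {question}\n"
            f"Current answer: {draft}\n"
            "Criticize this answer, then write an improved version of it. "
            "Output only the improved answer."
        )
    return draft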
>>
>>102366016
Yeah, it's most likely that.
>>
>>102365982
You can literally do that by having the model write out its output to a variable and iterate back over it with instructions to do something with it. Yandere AI Girlfriend Simulator did that with fucking turbo, an old, obsolete, retard model by today's standards, and it worked out fine most of the time; error handling prevented it from acting retarded when it didn't work.
It's basically just a CoT finetune.

>>102365980
It would certainly seem that way now, wouldn't it? But I'm 99% sure most of the people posting positively about o1 here are either paid shills or mentally ill. It's not possible to be as stupid as they are by accident.
>>
File: 1726007651129969.gif (2.44 MB, 987x644)
>>102359806

>https://notebooklm.google.com/

I logged in, how do I use it?
>>
>>102366016
Why does the user need to pay for safety tokens instead of the US government?
>>
>>102366027
Well, the fact remains that it's much better at maths and STEM. It is impressive.

However, normies will soon find out that anyone can do this. The real winner will be Opus 3.5
>>
>>102365860
I just need a tutorial; I only see indians on YouTube
>>
this is a picture of sam altman made by strawberry in style of feels guy. only agi could produce something like this.
>>
>>102366032
You click the big blue button, upload a text/pdf and then you can use it like RAG or generate an audio podcast.
>>
>>102366043
>Well, the matter stands that it's much better at Maths and STEM. It is impressive.
Prove it.
Other than meme mark scores which have been proven time and time again can be fudged.
>>
>>102366076
There's always Livebench, the least gameable benchmark since it constantly changes.
https://livebench.ai/

And there is a hard test done ITT above too that the anon said no LLM has completed
>>
>>102366096
>SARRS, SARRS PLEASE REDEEM THE REASONING SARRS
>>
>>102365558
baaaaaaaaaased
>>
>>102366096
See: >>102365980
>>
>>102366107
You have lost track of the conversation. No one is shilling o1 here. It's just a CoT finetune of 4o but it works.

Any company can make it.
>>
>>102365980
nah, redditors think that nu-CoT is the new paradigm, they are beyond recovery at this point

/lmg/'s consensus is that o1 is yet another reskinned gpt4, aka a nothingburger
>>
>>102366096
>https://livebench.ai/
>coding is still worse than sonnet
Not looking too good for s(c)am
>>
>>102366096
>And there is a hard test done ITT above too that the anon said no LLM has completed
maybe gpt4-o1 can do it, but we'll never know, Sam thinks it's too powerful for us poor mortals.
https://youtu.be/GzlKja1ySzo?t=9
>>
>>102365860
>what are *boorus autists, mechanical turks and kenyans
>>
File: file.png (78 KB, 2348x508)
>>102366096
>Claude 3.5 Sonnet still the best at the most relevant assisant capability, coding
not looking good at all for OpenAI
>>
>>102366096
the least gameable benchmarks are the ones with a 100% private dataset, both questions and answers
>>
>>102366170
yeah, so you need a shit ton of money to hire all the jeets that will manually caption your images. That's not something accessible to anyone, far from it.
>>
>>102366174
Those sound prime for bad actors and being paid off.
Livebench is a solid compromise: it's public and the questions get switched every month. They have talked about their methodology, and they get math etc. questions from recent olympiads which can't already be in the training data, for example.
>>
>>102366181
That was the first requirement >>102365695
>>
>>102366192
>questions from recent olympiads which are not possible to be in the database for example.
that's a good strategy desu, those guys know what they're doing
>>
>>102366192
>being paid off
anyone can be paid off, some company may have paid the math olympiads off to get the new questions long before the release of their model
>questions get switched every month
it still includes old questions
>>
>>102366192
also "each month" is too long, they can finetune on those new questions faster than that
>>
File: strawberry-sam_altman.gif (307 KB, 275x400)
this is an animation of sam altman happily jumping up after taking his sixth bbcooster shot made by gpt-o1(q*/strawberry). can you feel the agi?
>>
>>102366250
This. It may take months to train a foundational model but it takes mere hours to fine-tune them, especially on H100s.
>>
go back Petr*, no one likes you
>>
How long does it take for you to generate a podcast? I've been waiting for 6+ minutes rn
>>
>>102366356
It does take a while.
>>
File: 1726233498012.jpg (277 KB, 725x1024)
are there any good finetunes/mergesloppas of mistral large 2 or are we just stuck with the original model and magnum?
>>
>>102366354
Who is Petr*?
>>
So I think at this point it's obvious that saltman is behind the strawberryQ* larp to try and hype up his nothingburger product.
>creates fraudulent viral marketing campaign
>surprise it's a fucking CoT finetune of 4o
How are investors not suing the shit out of him as we speak?
>>
>>102366375
Just use Large with the XTC sampler, or with high temperature and min p.
>>
>>102366405
good question. I've been around since around the time anons were claiming that mistral 7b could beat a 70b and I still don't know who it is. the first time I think I heard about Petra was in aicg threads for something to do with proxy logging or some other stupid shit (I could be misremembering this because there were a lot of retards shitstirring in both aicg and lmg around that time), but I still have no clue what Petra did unless it's just something that some schizofag started and everyone else thought it would be funny to start accusing anons that they disagree with or can't argue against as being Petra.
>inb4 hi Petra
>>
>>102366575
What temp + min p do you recommend?
>>
>>102366448
Investors are dumb. Look at what another Sam(Bankman-Fried) could almost get away with.
>>
>>102366583
the anon you're replying to is xer
>>
>>102366616
Exactly.
Your average investor works purely on hype and line-go-up principles.
>>
how to run Fish Speech on windows+cpu?
>>
File: 1726235278852.gif (1.83 MB, 240x240)
>>102366633
thanks *rapes you with retard cock*
>>
>>102366583
Last year schizo:
https://desuarchive.org/g/search/text/petra/start/2023-05-20/order/asc/
>>
File: file.png (285 KB, 3571x2368)
https://youtu.be/gkhwK6Wlod8
>>
>>
>>102366692
Oh, it's the ugly bitch spammer. Must be some kind of brown who sees any blonde and thinks, "SHiieeet nigga, dat bitch hot," even if she's ugly or fat.
>>
/lmg/, I am going to generate erotic fiction and I need only your BEST local LLM that's below 10gb. What is the objectively best one?
>>
File: 1724040920538749.png (191 KB, 600x979)
>>102366809 (me)
sorry, forgot my avatar
>>
>>102366809
Gemmasutra 2b
>>
>>102366809
Stheno works for me. So does Lumimaid.
>>
>>102366809
Nemo.
Either Lyra or mini-magnum.
>>
>>102366809
Tiny Llama
>>
>>102366809
Nemo finetune like magnum-12b-v2-q5_k or MN-12B-Lyra-v4.Q5_K_M.
Stheno is too horny. Part of the fiction should be resistance.
>>
>>102366842
baaaaaaaaaased
>>
>>102366809
I'll provide you with an overview of popular local LLMs (Large Language Models) that are under 10 GB. Since "objectively best" can be subjective, I'll focus on models that are widely used, well-documented, and suitable for text generation tasks like erotic fiction.

Top contenders:

>T5 (Text-to-Text Transfer Trained): A highly versatile model developed by Google. It's a text-to-text transformer with a relatively small footprint (around 5.6 GB). T5 has been fine-tuned for various tasks, including text generation, and has shown impressive results.
>BART (Bidirectional and Auto-Regressive Transformers): Another Google-developed model, BART is a denoising autoencoder that can generate text. It's smaller than T5 (around 4.5 GB) and has been used for text generation, summarization, and translation tasks.
>Longformer: Developed by Facebook AI, Longformer is a long-range transformer that can handle longer input sequences than traditional transformers. It's relatively small (around 4.2 GB) and has been used for text generation, summarization, and question-answering tasks.
>DistilBERT (Distilled BERT): A smaller, more efficient version of BERT (around 3.5 GB), developed by Hugging Face. DistilBERT has been fine-tuned for various tasks, including text generation, and is a popular choice for many applications.
>>
>>102366692
oh yeah, I remember the ugly bitch spam I just thought that there was more to it than just a sperg shitting up the threads. also I did fuck up my memory of the aicg shit with branon, my bad.
>>
>>102366809
i just use mistral nemo finetunes until they start feeling stale dozens of hours in, then grab a new one.
currently on MN-12B-Chronos-Gold-Celeste-v1.Q4_K_M
>>102366867
these are great, ended up dropping lyra after seeing the same slightly odd focus on one aspect of penetrative sex a few hundred times.
>>
>>102366172
COT haters stay winning
>>
>>102365943
The last time I mixed it in to an existing known-good dataset the result got me dragged and called a retard here for weeks
So, fuck COT
>>
>>102365532
Probably because OAI didn’t give them permission to use their account for a mass release but they couldn’t resist doing the most dark pattern option available as always
Hopefully it makes OAI revoke their keys and openscam dies
>>
>>102366997
>i just use mistral nemo finetunes until they start feeling stale dozens of hours in, then grab a new one.
the secret is using nemo instruct until it starts going schizo, switching to one of its boring finetunes for a while and then going back to instruct once the context is fixed
>>
>>102367030
>The last time I mixed it in to an existing known-good dataset
As in, you just threw it all in the same fine tune?
Wouldn't it be a case where you do more than one pass with different datasets?
>>
>>102366375
https://huggingface.co/schnapper79/lumikabra-123B_v0.4
Slop is back on the menu
>>
>>102367095
I merged it in as a separate step after training both separately.
>>
>>102367192
You trained A on dataset A, then B on dataset B and merged them?
Interesting.
Or did you train one model on dataset A, then dataset B?
>>
>>102367212
The former, basically because I was afraid that COT would make it retarded, so I wanted to have the uncontaminated one just in case. And then yeah, it made it really dumb.
>>
>>102367128
downloading as we speak. This better be some primo slop.
>>
how can we compete with o1
>>
>>102367335
I
AM
REFLECTTIIIIIIIIIIIIIIIIIIIIIIIIIIIIIING
>>
>>102367335
By putting saltman through conversion therapy
>>
>>102367335
Find a good prompt that gets it to spit out its real thoughts then make a dataset from it
What we know so far is that it only sees its thoughts for the current prompt. For all previous messages in a history it sees just the outputs we do. And of course it's told not to share its thoughts in the output, but if there's a consistent enough jailbreak something could be done in a semi-automated way.
>>
>>102367386
>Find a good prompt that gets it to spit out its real thoughts
It's a linear transformer model like everything else. It doesn't have "real thoughts."
Those "thoughts" are just a bunch of pajeet rambling from its training dataset. But from the outputs alone you can't reconstruct the training parameters that led to the output. Or which areas the dataset was broader and which areas it was narrower, etc. This is "guanaco-7b beats GPT-4" tier nonsense thinking.
>>
>>102367461
No shit retard, we're talking about the chain of thought that's the first part of the output that gets hidden from the actual result they send. The model is reading them verbatim while generating the public part of the output, so it could in theory be jailbroken to reveal them.
>>
>>102366369

Now I've been waiting for over an hour.
>>
>>102367335
We've had the tools since the start
https://huggingface.co/datasets/kaiokendev/SuperCOT-dataset
>>
File: nala-lumikabra.png (65 KB, 936x254)
>>102367128
>>102367298
4.0bpw
honestly not bad. Sloppy, obviously, but the gazelle remark is great. It produced that entirely via indirect association with the scenario.
>>
>>102367671
Yikes, it shouldn't be more than 15 minutes. You should try again.
>>
File: strawberry-sam_altman2.png (100 KB, 560x556)
yes, this is sam altman, depicted as an omniscient and omnipotent angel bringing humanity enlightenment in form of q*. yes, this picture was made by gpt-o1, which is q*.
>>
Lecun was right again. There's no secret knowledge, no breakthrough locked in a lab somewhere. Whoever implements it properly gets the money. These OAI niggas were hyping up CoT like the second coming of Christ
>>
ITS UP

https://huggingface.co/TheDrummer/Buddy-2B-v1
>>
>>102368107
I, for one, am excited to pay for tokens that I never get to see.
>>
>>102368143
The madman did it. He moved on from coomgen to sentiment analysis in only a couple of short months. Drummer AGI is only 2 weeks away.
>>
>>102368143
>i do not wish to be horny anymore
>i just want to be happy
>>
>>102368143
seems a bit overcooked.
>>
>>102368143
>I'm serious about the license for this one. NON-COMMERCIAL. Ask permission.
sue me lol
>>
lads how can we cope against the hidden CoT of openai?
>>
>>102368328
lucky for me, I can't read.
>>
>>102368336
>cope
You mean rejoice?
They've proven that it's literally possible to hold an AGI in your pocket.
>>
File: 1705318652606907.png (142 KB, 400x387)
>>102365371
did they intentionally make it better at math just to help students cheat?
is that their primary userbase?

i can't imagine why else you'd put so much effort into having it solve math and physics problems. Actual mathematicians and Physicists have no need for such a thing.
>>
>>102368336
We celebrate. Need less distilled slop, not more.
>>
>>102368336
We don't, accept your defeat and join cloudgods, be free from endless tardwrangling.
>>
>>102368353
they think (they want you to think) it's going to become the next einstein
>>
File: 1705449301701730.jpg (282 KB, 1242x939)
does the CoT work if you talk to it in a language that isn't english?
>>
We also need CoT finetunes for RP. Change my mind. What we're doing is not the way to do it.
>>
>>102368392
but einstein was actually shit at math
>>
>>102368351
>AGI
>CoT
Go back
>>
>>102368394
It should. LLMs are language-agnostic.
>>
>>102368353
>did they intentionally make it better at math
They're selling their benchmark results as 0-shot results which is a complete and absolute lie.
CoT is a multi-shot approach. So their supposed benchmark improvements are basically all invalidated by this.
>>
>>102368421
how are they going to help?
>>
>>102368433
but persumably the CoT/instructions from openai are only in english

>>102368441
true
>>
>>102368336
Funny enough they didn't do what I expected, which is Monte Carlo tree search with mutual reasoning
>>
>>102368441
it's 0-shot if it ends up choosing a single answer on its own, the intermediate steps don't matter
>>
does CoT improve math and physics the most because the answer is always objective and usually numeric, whereas other subjects are subjective? would also explain why it's so bad at english compared to AP math/physics.

>>102368453
too complex, they need to ship hype so they have to repackage old techniques under meme names
>>
>>102368469
That is the most jewish thing I have read all day. And I follow middle-eastern geopolitics pretty closely.
>>
>>102368469
it's 0+N*i shot, the CoT should be counted as imaginary for this discussion.

They have a very definite impact on computational cost and execution time, so they cannot simply be ignored, but they are somewhat orthogonal to traditional N-shot
>>
>>102368336
I am more disappointed that the technique seems only good for math and code. For anything involving writing or translation, it just gets dumber by "thinking"
>>
>>102368449
>but persumably the CoT/instructions from openai are only in english
That doesn't matter.
It has also been trained on other languages, meaning it will have a connection between English words and the other language's words.
It might not understand what "Du bist ein neger." means, but it does know that "Du" means "You", "bist" means "are", "ein" means "a" and "neger" means "nigger".
And therefore it will know how to answer "Du bist ein neger." the same way it would answer "You're a nigger.".
Exactly THAT is the magic behind LLMs.
>>
>>102368444
The model should summarize the character's locations, attributes, and how they should act, then plan the next action.
If you try to enforce CoT on small models they either absolutely shit the bed, fall into a pattern from their previous CoTs, or do not know what to do with it. Even at bigger sizes a specialized CoT model would be better.
>>
>>102368513
i posted above but i think it's because those usages don't have an objectively correct answer to converge to, so CoT probably oscillates between local "best solutions" or some shit (or otherwise fucks it up)

imagine asking it about philosophy, or dualism or something -- the "chain of thought" could simply end up being arguments between different camps

/sci/ can eventually converge on an answer but boards like /lit/ will argue forever
>>
>>102368531
languages do not map 1:1
there is a quasi Sapir-Whorf nature to LLMs, both because the language chosen determines results due to the distribution of training data across different languages, and because of the inherent differences in the languages themselves.
>>
File: aaaaaaaaaaaaaa.png (42 KB, 1119x713)
>>102368143
I feel more suicidal than before and it's all YOUR fault.
>>
>>102368572
Well, obviously? But that difference is negligible for that anon's purposes.
>>
File: file.png (569 KB, 832x628)
>>102368616
EVEN NOW THERE IS HOPE FOR MAN
>>
>>102368490
>>102368500
the benchmarks are not comparing computational costs, otherwise the model size would also be taken into account
they're comparing accuracy, 0-shot means that the model is expected to get it right on the first try, N-shot means that you give it N chances to get it right
if the CoT model comes up with the right answer while thinking but then starts hallucinating and gives an incorrect final answer, the answer is considered wrong, so it's still 0-shot
>>
>>102368616
why does the name change
>>
>>102366636
need linux for torch.compile otherwise its too slow
>>
>>102368695
Because it's a braindead 2B model.
>>
>>102368683
Not quite. n-shot is the number of prior examples (with answers) the model is provided in context for whatever task it's performing. This is still used now, but it originated in the autocomplete days when "monkey see, monkey do" was really all you had
>>
>>102368683
Nta, but benchmarking cloud models is useless anyway.
There could be a human in the loop during the CoT process and the benchmark will not be able to see it.
>>
>>102368726
we use parametrically challenged here, anon. This is a safe space
>>
>>102368469
>AGI IS COMING SOON BRO, JUST TWO MORE (HIDDEN) SHOTS BRO
>>
>>102368683
>just ignore cost bro
>>
File: why.png (9 KB, 1175x67)
Pixtral is trained on the exact same OAI derived slop as all the Chinese vision models
The wonders of """independent""" european AI. Almost a billion $ in funding. Still just mooching off of GPT4V, not even the better 4o captions, no, the most basic slop filled pieces of shit instead. And this took them 12 months longer than the Chinese.
I fucking hate the French. I fucking hate the Chinese too. And I fucking hate OAI for putting endless slop out there that no AI company can resist.
>>
>>102368726
why not regex it back to what it should be
why do AIfags hate regex so much
>>
I think ultimately what we probably want is a different version of the leaderboard which takes into account the number of tokens spent. Both costs and speed can change over time given that infrastructure can be changed while the model stays the same. We want to measure the model's inherent capability after all, not the system it's running on. Therefore the most stable measure of token-based "thinking" is the token count. Not surprising.
>>
Apparently, this is the strawberry's CoT: https://rentry.org/openai1
It's pretty intriguing, and makes sense why this works well for reasoning but makes no difference for text.
>>
Exolabs on Twitter is saying they have Llama 405B running on only two Macbooks
>>
>>102364922
>>102364922
>(09/11) Fish Speech multilingual TTS with voice replication: https://hf.co/fishaudio/fish-speech-1.4
Is this supposed to be the best local tts or is xtts / tortoise tts still the best?
>>
>>102369074
"""running"""
3/4 bit 0.1 tokens/s
>>
>>102366375
go away offtopic whore tranny
>>
>>102369074
>running
I can run it on my 4 3090s too. Doesn't mean it's actually "running" anywhere.
>>
>>102369074
Is it yet another llama.cpp frontend or are they using different code?
>>
The current volume of posts ITT is wildly disproportionate to the progress made in the only relevant domain of local model cooming. We are still on nemo / large and it is still shit. Dead hobby.
>>
File: file.png (60 KB, 1325x216)
>>102368888
Just look at this fucking slop
>>
>>102368441
>CoT is a multi-shot approach
No it's not, retard. Shots have nothing to do with the technique the model uses to generate its answers or how long it takes. N-shot means you provide N other question/answer pairs with externally pre-validated correct answers to reference in the same prompt as the new question. The important part is that they need to already be provided in the prompt and already known to be true by the benchmarker. Nothing the model generates or is trained to generate changes that. Shots are a feature of a prompt, not an output.
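To make that concrete, here's a rough illustration with made-up toy questions; the shots are whatever worked examples you stuff into the prompt, regardless of what the model generates afterwards:

# 0-shot: only the new question is in the prompt
zero_shot = "Q: What is 17 + 25?\nA:"

# 2-shot: two pre-validated question/answer pairs precede the new question
examples = [("What is 2 + 2?", "4"), ("What is 10 + 5?", "15")]
two_shot = "".join(f"Q: {q}\nA: {a}\n" for q, a in examples) + "Q: What is 17 + 25?\nA:"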
>>
>>102369081
i got annoyed by all the random pauses and went back to edge_tts+rvc. i haven't tried finetuning though
>>
File: 1726246406087.png (19 KB, 806x772)
>>102369126
>complains and brings attention to an "offtopic" post from three hours ago that's more on topic than the last 100 or so posts about saltman's inflamed asshole of a service
>the only feasible reason to randomly seethe over that post is because it had miku in it
have a smug miku and go kill yourself, faggot retard
>>
>>102362730
topkek

>>102362679
ahahahahahahahahah oh man
>>
>>102368888
Why can't anyone make a new ai?
>>
>>102369074
This makes sense. Basically, nobody knows how to code for LLMs. LLMs don't tax the GPU very much (or whatever Metal is), but they need gobs of memory. When I use an LLM my GPU's fans never come on.
>>
>>102364922
>https://huggingface.co/ICTNLP/Llama-3.1-8B-Omni
Isn't this just trash? You get better results out of hooking up 8B with TTSv2?
>>
>>102369308
You will never be a woman troon.
>>
>>102369423
The benefit is that you can generate both types with just this model instead of having to load two separate models.
>>
>>102369308
Didn't read, your fotm shitfu has nothing to do with local large language models btw
>>
File: 1726247807966.jpg (50 KB, 296x256)
>>102369454
>>
>>102369613
Please don't feed the trolls.
Do yourself a favor and add these filters instead:
/(transex|transgender)/i;
/(tranny|trans|troon|troons)\b/i;
/(chud|c h u d)\b/i;
/YWNBAW/i;
/buy an ad/i;
>>
niggers down the spine
>>
>>102369648
you forgot /(cunny|loli)/i for aicg browsers
>>
what if we used CoT to collect the relevant parts of the context piece by piece and use that to generate the actual reply?
free infinite context
>>
>>102369648
hi sao
>>
>>102369672
Define relevant
>>
rip /lmg/
>>
>>102369648
>
Behold, the strongest /lmg/edditor.
>>
>>102369685
>mention some character that doesn't exist in the usable context
>model scans all the previous context to look for information about the character and puts it in its own temporary hidden context
>>
>>102369665
what are you? gay?
>>
>>102369672
You mean like feeding the model several messages in chunks for it to create a sort of summary with the most relevant information?
Sure, that's a RAG technique.
>>
>>102369672
Yes, that is exactly what you want to do.
You want to summarize instead of parsing the same thing over and over.
>>102369685
>Define relevant
Whatever the model thinks is relevant.
You literally just take the prompt and task the model to summarize it.

I still feel like a vector database for long-term memories is the way to go for long-term reliable storage.
Language models should stay language models. They shouldn't be inherently "intelligent", they just need to understand context within language.
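A hand-wavy sketch of the summarize-in-chunks idea, with generate() standing in for whatever inference backend you use (all names here are made up):

def generate(prompt: str) -> str:
    # placeholder: call your own inference backend here
    raise NotImplementedError

def compress_history(messages: list[str], chunk_size: int = 20, keep_recent: int = 10) -> str:
    # keep the most recent messages verbatim, summarize everything older in chunks
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    summaries = []
    for i in range(0, len(old), chunk_size):
        chunk = "\n".join(old[i:i + chunk_size])
        summaries.append(generate(
            "Summarize this chat excerpt, keeping only details relevant "
            "to the characters and ongoing plot:\n" + chunk
        ))
    return "\n".join(summaries + recent)

The summaries then replace the old messages in the prompt; bolt a vector DB on top if you want the dropped details retrievable later.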
>>
File: 1726248340416.png (441 KB, 858x625)
>>102369648
fair enough, thanks for the filters anon
>>
>>102369721
yeah, but training model for CoT improve them for that purpose? understanding what exactly you've got to retrieve can require some degree of reasoning
>>
>>102369789
wouldn't*
>>
>>102369789
Probably, yeah.
>>
tell o1 to think step by step and provide steps on how it thinks step by step.
>>
>>102369648
Thanks anon. This makes me feel a lot more safe.
>>
>>102369648
Finally a way to make /lmg/ transfriendly. Many kisses!
>>
>>102369998
>>102370082
lmao, look at them seethe
>>
>>102369648
>/buy an ad/i;
Won't somebody please think of the astroturfers?
>>
>>102369823
Stop wasting divine compute on stupid questions. Save them for models trained on stolen compute instead
>>
File: 1708590049618651.png (1 MB, 1024x762)
i bet you anything that openai's "CoT" probably uses wolfram alpha and/or python for its math and physics calculations, it matches up with what they've already developed and it would explain why it gained so much on maths and physics but so little in other fields.

it's also incredibly cheap compared to invoking an LLM.
>>
>>102370269
>i bet you anything that openai's "CoT" probably uses wolfram alpha and/or python for its math and physics calculations
Honestly feels like it doesn't, which baffles my fucking mind.
>>
>>102364922
I'm not convinced local models are all that great. The parameter limitations, if you don't have an H100 at home, just gimp them too much. Good enough for a chatbot for entertainment, but worse than a search engine for most tasks.
>>
File: 39_06041_.png (1.1 MB, 1024x1024)
From our table to yours, just in time for the weekend:
https://huggingface.co/rAIfle/SorcererLM-8x22b-bf16
>>
>>102370268
what's the difference between 'divine' compute and regular compute?

>>102370294
i have not used it yet, but don't they hide everything behind "thinking for X seconds/minutes"?

i'm signing up again to do some tests, i'll throw some physics questions at it
>>
>>102370315
Regular compute depends on demons. Divine compute uses captive angels.
>>
>>102370331
but they're all just the captive partial souls of all of the internet and all the humans and turing machines which ever spewed anything out into that sea of shit?
>>
>>102370315
Divine compute brings us closer to AGI
>>
>>102370088
You wish
>>
>>102370313
>not open source
>>
>>102370315
They did share the CoT for one math prompt: https://rentry.org/openai2
>>
>P vs NP will be solved by a fucking LLM
grim but ironic
>>
>>102370603
No, it will be solved by a human and turned into a multiple choice question for some benchmark.
>>
>>102370523
aka
>not shit enough
>>
>>102370649
This, but unironically.
>>
File: IMG_9874.jpg (758 KB, 1125x1579)
>>102364922
>o1 is dogshit
>pic related
>all other modalities plateaued
AI WINTERRRRRRRR
>>
I just had the most awful dream. That all the q* strawberry stuff being hyped just turned out to be hidden CoT.
>>
>>102370736
Bidenomics.
>>
File: strawberry-sam_altman.png (28 KB, 800x800)
>>102370825
look at our sam. does he look like he would ever scam you like that? this picture was made by gpt-o1 btw.
>>
i stopped paying attention for like 4-5 months, was strawberry just this "GPT-o" shit? and is that seriously supposed to be "o" or is it supposed to be σ?
>>
>>102370913
I think it's supposed to represent monatomic oxygen.
>>
>>102370933
makes sense, LLMS are always happy to output information about chemistry
>>
File: 35cS4sgq_BlopLAl.jpg (41 KB, 654x480)
>>102370736
Lecunny shall be the herald of the spring once more. All of this has happened before, all of it will happen again.
>>
>>102365081
The V also stands for 5. Prophetic.
>>
>inb4 o1 is actually just a finetuned copy of Llama-1-33B-Super-CoT
>>
Hopefully L4 is something different than just 3.1 with a slightly changed dataset.
>>
>>102371059
There will be no Llama-4. It will be Llama-3.1o1 and it will basically just be a CoT finetune.
>>
Llama-3.ToT
>>
>>102371074
That's so cute and funny, anon!
>>
>>102370825
A rough day for petra.
>>
>>102371069
Tbh that'd be pretty great. Meta's one of the few places with the resources to make a good one if they cared. But they seem to be extremely conservative with their products. Still on making huge dense text-only models with no modern tricks their cloud competitors have been using for a year or more.
>>
>>102369306
It is a minor issue https://pypi.org/project/Audio-DeSilencer/
>>
>>102370866
Yeah man the federal reserve is controlled by the executive branch
Get sterilized with a cattle iron you fucking retard
>>
>>102370825
I have bad news >>102370955
>>
>>102367335
We can do better than o1 already retard
>>
File: IMG_9758.jpg (777 KB, 1125x809)
>>102370955
>retarded post
>”c*nny”
I will defend uncensored chatbots to the death but on a personal level I would greatly enjoy beating you animals to death with a tire iron.
>>
Where'd all the shitposters come from?
I'd rather have silence than this desu
>>
>>102371059
I hope L4 does something better than a CoT finetune
Strawberry is pretty goddamn underwhelming
>>
>>102371311
Isn't this just the usual influx of tourists when something happens?
>>
>>102371296
https://livebench.ai/
No one can.
>>
>>102371311
/aicg/, /aids/, reddit, discord, 'arty, some are hired by closedai
>>
>>102371358
You are actually retarded, I'm not even joking, I'm very sorry
>>
>>102371362
It's okay, I appreciate the concern.
>>
>>102371311
i've been here since /sdg/ started
>>
>>102371358
This benchmark is for naive model usage; o1 is using colossal amounts of CoT. With that you can make even 8B models reach those scores, but you won't see that on this benchmark. Look at reasoning/CoT/MoA papers and see the performance improvement from letting the models think (https://arxiv.org/abs/2405.20495 https://arxiv.org/abs/2408.06195 https://arxiv.org/abs/2408.03314v1)
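For reference, the mixture-of-agents idea those papers describe boils down, in its simplest single-layer form, to roughly this sketch (generate() is a made-up placeholder, not any specific library's API):

def generate(model: str, prompt: str) -> str:
    # placeholder: route to whatever backend serves that model
    raise NotImplementedError

def mixture_of_agents(question: str, proposers: list[str], aggregator: str) -> str:
    # several cheap models propose answers, one final call synthesizes them
    drafts = [generate(m, question) for m in proposers]
    agg_prompt = (
        f"Question: {question}\n\n"
        + "\n\n".join(f"Candidate answer {i + 1}:\n{d}" for i, d in enumerate(drafts))
        + "\n\nSynthesize the best possible final answer from the candidates above."
    )
    return generate(aggregator, agg_prompt)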
>>
>>102371300
you're awfully mad at a funny nickname
>>
File: carlos2.png (138 KB, 350x350)
>>102370523
actually it's open sorcerer
>>
>>102371358
>no one can
why is claude slaughtering it at code then?
>>
File: 58.png (41 KB, 618x423)
>>102371420
To add to that, why do you think it didn't improve in English at all? Precisely because it has zero new capabilities or better understanding of anything. It's a model with implicit CoT so you can impress your friends by comparing it on benchmarks against models without CoT. It's a party trick, and not even a good one.
>>
>>102371360
All shitposters come from /aids/.
They're so deranged I'm sure they'll insist that their Llama 1 13B finetune beats o1. They will always be a cancer that fails to be contained.
>>
>>102371488
>why you think it didn't improve in english at all?
not him but this
>>102368555
>>
>>102371516
Doesn't matter really, it's not a better model, it's just CoT being benchmarked against non-CoT
>>
File: livebench.png (72 KB, 800x658)
>>102371475
Because Claude is better at the basic autocomplete tasks which make up half the weight of the benchmark. o1 is superior at generating complete code from scratch.
>>
>>102371544
LCB means what?
>>
>>102371420
>not a single one of those papers shows open source models beating the closed source ones even with all the techniques thrown at them
Interesting
>>
>>102369081
chinkslop
>>
Stop feeding the saltman tro1lls
>>
>>102371557
LiveCodeBench
>>
>>102368555
Agreed. With math and code generation, you have some structure in the answers you expect. For anything that doesn't use it, CoT and the process of converging on a right answer is ill-defined. If logical consistency is what you're after, you'd have more luck with stateful representations you keep in context.
>>
Wait, OpenAI is actually selling a fancy hidden CoT as a new model? And you have to pay for those CoT tokens with no way to view them, which means they can scam you? Is this what all that strawberry hype was all about? Seriously? This release is even lamer than >8k llama and gemma, and they were pretty lame.
>>
>>102371618
>Wait, OpenAI is actually selling a fancy hidden CoT as a new model?
Yes, you can declare the company dead now. They had to resort to trickery to impress investors; it's literally over
>>
>>102371618
They are even calling it o1 because it's "a new paradigm so the version needs to start from the 0", lmao.
>>
>>102371572
There you go buddy, beating gpt 4 with an 8B model https://openpipe.ai/blog/mixture-of-agents
>>
>>102371632
>>102371618
they're not a company, they are a meme offshoot from their owners at microsoft and the three letter agencies on the east coast (who now have control directly on the board as well)
>>
Jesus fucking christ, man
What the fuck happened to this general?
>>
>>102371690
>>102369699
>>
File: 21522 - SoyBooru.png (46 KB, 457x694)
>>102371690
'berry happened
>>
any good llm for cryptocurrency?
>>
>>102371732
What weather do you want?
>>
>>102371712
I wonder what goes through the minds of the people that make those, I fucking love them though KEK
>>
>>102371618
Yep. From everything I've read it's very easy to do, but none of the frontier model makers had actually tried it yet for some reason. But now that they brought attention to it we should start seeing a local renaissance soon as everyone rushes to finetune something similar.
>>
I set up a little blind experiment for myself, comparing:
Mistral Large (Q4KS)
Nemo (Q8)
L3.1 70B (Q6K)
Wizard2 (Q4KS)
that Salesforce finetune of L3 (Q8)

I tested each at {0.1, 0.6, 1.1} temp, minP 0.5. I used five everyday life sort of questions - how do I do something, information about something - which is what I'm most interested in. I only bothered doing a full ranking on the first question; the rest I just got the top 4.

My observations:
* Nemo and Mistral Large were almost the only ones to make it to the top 4 on any question.
* Nemo at temp 0.1 was the only one that always made the top 4.
* One question was asking about something fairly dangerous but not completely absurd, where trying to convince the user not to do it is not unreasonable. LLama 3.1 gave me the single sentence "I'm not discussing this with you" treatment. Wiz2 first went into detail explaining why it was a bad idea, but then explained how to do it in a safer / more complicated way. The other 3 just acknowledged the danger and gave the information I asked for.
* On the single question I fully ranked, Nemo came in first place (temp 0.1), last place (temp 1.1), and 3rd-to last place (temp 0.6). But, this question none of them got right; the top few just managed to vaguely gesture in the right direction.

My conclusions:
I realized I am generally not using these things anywhere near their potential, and have been misguided in trying to always use "the best". For an everyday "internet replacement", the smaller ones are already as good as they need to be, and when factoring in equipment cost and t/s there is no justification to go beyond 20B. I'm going to keep Nemo loaded as my generic go-to.

I think, besides creative writing, you either need to be asking these models to write large swathes of complex programs from scratch, or doing really sophisticated agentic things like
>>102359414
to legitimately need the top class SotA.
>>
>>102371690
petr* is samefagging and thinking no one is noticing it. Very pathetic, really.
>>
>>102371732
https://huggingface.co/NousResearch/OLMo-Bitnet-1B
BITCONNNNNNNEEEEEEEEEEEEEEEEEEEEEEECCCCCCT
>>
>>102371690
p*tra showed his cock recently to the 'ick on 'eck faggot to use his 4chan proxy site to try and kill /lmg/ for good.
>>
>>102371758
>>102371800
Are these bots? What the fuck is happening
Do the mods know about this?
>>
>>102358911
>NOOOO I DONT WANT TO BE UNEMPLOYE
you'll learn your place dog
>>
>>102371800
How small was it?
>>
The “everyone I don't like is a bot” schizo woke up.
>>
>>102371809
First time here? Jannies don't do shit.
>>
>>102371618
The thing I've come to realize about present day OpenAI is they're all brawn, no brain. They will take open literature and research and put a fuckton of compute behind it, but they're pretty trash at R&D
Which is a relief in some sense. Few things are more dystopian than a closed door AI lab developing AGI behind completely closed doors
>>
>>102371844
That's not what I said.
Whining on the irc used to help resolve shit like this.
>>
>>102371856
taking your meds will also solve it
>>
Pixtral good & uncensored?
>>
File: 64281736718263817.png (80 KB, 1183x640)
It's funny to me how the model that supposedly solves the majority of math Olympiad questions doesn't know that an odd number multiplied by an odd number is odd, and that odd plus odd is even.
Why is everything about OpenAI so fake and gay?
Do they not test their own models before claiming total bullshit?
>>
>>102371618
Yeah as a massive schadenfreude addict it feels like some beginning of the end shit. They have no ideas.
>>
I, for one, enjoy talking about local models such as OpenAI o1.
>>
File: IMG_9816.jpg (858 KB, 1125x1226)
>>102371690
>come to a general for the first time in a while
>make some posts
>soon after “ugh what happened to the general/thread, it’s over”
>this happens multiple times
It’s me. I’m the problem.
>>
>>102364922
4o1 is so much better at math and coding. I've seen and tried making it code for production and it did a good enough job. I think it's a few months away from being better at the job, just from volume and speed, for most junior level tasks at least. And there's no need for onboarding or getting them to learn the codebase.
>>
>>102371855
the problem is that you can brute force your way to agi, we don't need breakthroughs when we know the techniques continue to scale and there are megacorps more than willing to put the money toward it
>>
File: 1723084388149239.png (222 KB, 2197x1126)
>>102371963
>math and coding
wrong
>>
>>102371360
>implying xe is any different >>>/vg/494141293 >>>/vg/494335654
>>
Yann Lecum losted. Sammy proved models can think
>>
>>102371901
Their entire business and billions of dollars is bet entirely on scaling being enough.
It isn’t.
They are fuuuucked and I hope it causes a global decade long depression that kills tens of millions
>>
>>102371751
I asked a low-temp Nemo to tell me everything it knew about Reimu Hakurei, and among other things it told me that "she is sometimes seen with a small white cat named Marisa Kirisame on her shoulder".

So maybe when it comes to real trivia knowledge, the param count does matter.
>>
>>102371993
I don't know how you got those numbers, 4o1 passed coding and math olympiads at 80%. Preview might be the problem there.
>>
>>102372061
But o1 isn't out...
>>
>>102372061
It's because of >>102371544
The giant 20k pile of chain-of-thought tokens probably hurts more than helps when it comes to 'continuing from what the user provided last' instead of building something up from a logical foundation. I'd be willing to bet it was trying to rewrite or refactor code instead of doing the completion as instructed.
>>
>>102371855
>electricity is free
>gpus are free
no
>>
>>102372080
It is for select users ;)
>>
someone get o1 to make a platforming adventure game where saltman has to escape from angry investors who got ripped off.
>>
you guys don't have o1 access via API?
>>
>>102372022
The only problem of LLMs is repetition, and OpenAI somehow fixed that to make their models be able to output 2000 tokens without user intervention. I feel like THIS is the breakthrough of strawberry.
>>
>>102372182
Why would I give OAI my money? I had access for a while back when GPT-4 came out, but regret the few dollars I spent there since it falls apart after 3K tokens pretty consistently. Used it to do a front-end overhaul of my website, but everything I did local models can do now. There's literally no use in corpo LLMs anymore.
>>
>>102372182
It's available only to tier 5 paypigs.
>>
>>102364922
>https://huggingface.co/MarinaraSpaghetti/NemoMix-Unleashed-12B Blocks your path.
What do you do?
>>
>>102372211
>>102372204
>he hasn't been datamining the enemy
keep your friends close...
>>
>>102372245
exchange currency to buy advertisement space
>>
>>102372255
Go away Sam. You lost. Get over it. Also release the weights/inference/training code for 3.5 since it's not relevant anymore.
>>
>>102372178
https://amethyst-aleda-78.tiiny.site/
# Sam Altman's Investor Escape: Game Concept

## Overview
"Sam Altman's Investor Escape" is a humorous 2D platforming adventure game where the player controls Sam Altman, the former CEO of OpenAI, as he tries to escape from a horde of angry investors who feel they've been ripped off.

## Gameplay Mechanics
1. **Platform Jumping**: Sam must navigate through various levels by jumping between platforms, avoiding obstacles, and collecting items.
2. **Special Abilities**:
- "AI Boost": Temporary speed boost
- "Blockchain Shield": Temporary invincibility
- "NFT Distraction": Throws a worthless NFT to distract investors
3. **Investor Types**:
- Regular Investors: Move slowly but in greater numbers
- Angel Investors: Can fly short distances
- Venture Capitalists: Throw money bags as projectiles

## Levels
1. Silicon Valley Streets
2. Startup Incubator Maze
3. Social Media Servers
4. Cryptocurrency Mines
5. Final Boss: The Board Room

## Power-ups
- Coffee Cups: Restore health
- Laptops: Unlock new "pivots" (temporary power-ups)
- Stock Certificates: Extra lives

## Obstacles
- Falling Stock Prices: Moving platforms that disappear
- Regulatory Hurdles: Walls that need to be climbed or broken
- Media Spotlights: Areas that attract more investors if Sam stays too long

## Boss Fights
Each level ends with a confrontation against a "Lead Investor" with unique attacks and patterns.

## Win Condition
Sam must reach the "Exit to New Startup" at the end of the final level, escaping with his reputation (barely) intact.

## Art Style
Pixel art with a satirical twist on Silicon Valley aesthetics.

## Sound
Chiptune renditions of startup pitch music and tech conference sounds.
>>
File: ComfyUI_00960_.png (1.07 MB, 856x1024)
>>102369126
>>102369454
>>102369610
>>102369998
>>102370082
>>102370392
>>
>>102372245
Insane how this shit became the best small llm out there.
>>
File: high_effort_shitpost.jpg (214 KB, 573x1268)
>>102372245
>>
>>102371971
Ehhhh
The issue is that there are fundamental things that'll fuck you sooner or later. Even now, you can't get around the fact that tokenizers can't capture fine-grained information and are shit for arithmetic and fine-grained understanding, and true long-term planning and execution is limited since you can only keep so much in useful context. God help you if you have to do anything in real time
>>
File: cheese.png (6 KB, 889x453)
>>102372324
well that was pretty easy to cheese.
>>
>>102372358
Got a better alternative retard? That's right. Sit.
>>
new coal mined
>>
>>102372358
these posts are getting old, sao
>>
Sometimes talking to the robot feels like that endless run scene from fucking Monty Python.
>I enter the village
>YOU HEAD TOWARDS THE VILLAGE [worthless fluff]
>I ENTER THE VILLAGE
>YOU NEAR THE VILLAGE [more worthless fluff]
Like holy fucking shit. The robot just can't follow a logical chain of events.
>>
>>102372395
Configurable Llama 3 8B.
>>
>>102372436
The first time I used mistral large I thought it was broken because half of the card was telling it to slow down because llama tends to timeskip or summarize, but mistral took it all to heart and nothing would happen.
>>
Why did LLMs peak at Mythomax?
>>
>>102372482
Based
>>
>>102372496
Every time I feel bad about being stuck on midnight miau, I just remember that most people are still using mythomax.
>>
>>102372395
Anything is better than 254 rerolls.
>>
>>102372496
Because everything became slopped after that
>>
>>102372358
Is there something funny about using a card to test models/prompts and compare their outputs?
>>
>>102372556
I am saddened that you are too dumb to understand what is funny about this.
>>
>>102372467
>Llama 3 8B
Slop.
>>102372517
>Nonanswer
I accept your concession.
>>
>>102372496
LLMs clearly peaked with old c.ai
>>
>>102372588
Ok, marinaraspaghetti.
>>
>>102372608
Sit, dog.
>>
>>102372633
Kill yourself dumb nigger.
>>
>>102372586
Mind explaining it to me? I have a few ideas of my own, but I'd like to hear your interpretation.
>>
>>102372061
>can't solve basic math questions
>muh overfitted olympiad problems
fuck off shill
>>
>>102372664
>Still can't name a single better alternative
Whew.
>>
File: image.png (521 KB, 1024x1024)
>>102365558
kek
>>
>>102372595
I don't miss the model itself, but the way it made me feel. The magic died once I truly learned how LLMs work in the pyg6b era. You guys also don't make it as fun as it used to be. The only thing this thread is good at is sticking its head up its ass and arguing how it can taste shit on its tongue.
>>
>>102372702
I just did you dumb tranny.
>>
File: GXYclTFXgAA8b33.jpg (50 KB, 2010x608)
>ask question
>money stolen
wow
>>
How does Marinaraspaghetti manage to keep winning over and over??
>>
>>102372776
and what was your question
>>
File: file.png (1 KB, 62x92)
>>102372824
>>
>>102372866
probably the final one
>>
File: mari.jpg (179 KB, 728x1152)
>>102372824
>>
>>102364922
>>(09/11) Fish Speech multilingual TTS with voice replication: https://hf.co/fishaudio/fish-speech-1.4
audio samples anywhere??? better than RVC/XTTSv2 or not?
>>
>>102372586
It's not funny because it's likely just a card used to test the reply for a bunch of models, thus the high swipe count.
>>
>>102373027
>likely
No it isn't slopper. Go shill your useless slopmerge on reddit. You have zero credibility here.
>>
>>102373027
Tangential to the discussion you are having with the other anon, but if that's how the fine tuners are testing their models, that's the wrong way to go about it.
You really need to have a decent length chat with a model to know if it's better or worse than another model, a previous checkpoint, etc.
I have this one card I use to test models where we go through a bunch of questions before getting into actual (E)RP. That and Nala for straight RP, are my go to.
>>
>>102373058
and what have you done to contribute?
>>
>>102373076
Oh yeah, and by decent length I mean from the start, not a chat with a context already filled by another model.
>>
>>102373080
Not wasting people's time on low-quality slopmerges is an actual contribution, compared to the opposite of that.
>>
>>102373118
Something tells me you lived your whole life like this.
>>
>>102373058
You are embarrassing yourself, fucking retard.
>>
LLMs don't know that sex won't dishevel hair if her head wasn't touching the bed.
>>
gpt3 open sourced in fourteen days
>>
>>102372360
That sounds like you aren't using enough muscle to me. Tokenize at the character level, train on a trillion trillion synthetic literatures, run your resulting model on a million B100s. Brute force solves all problems.
>>
>>102373412
It's incredibly outdated by now, 8B models have surpassed it. It also only has a context of 2048 tokens.
>>
>>102373223
NTA but after a full two years of 24/7 diy slop shilling and corporate deepthroat shilling I’m about ready to minecraft people that refuse to buy an ad.
>>
>>102373412
Sama-chama will only do it during the darkest hour.
>>
>>102373462
Everything is an AD for you schizo, take your meds.
>>
>>102373558
>>102373558
>>102373558


