/g/ - Technology


File: 1727706071361379.jpg (799 KB, 1856x2464)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102654480 & >>102645080

►News
>(09/27) Emu3, next-token prediction multimodal models: https://hf.co/collections/BAAI/emu3-66f4e64f70850ff358a2e60f
>(09/25) Multimodal Llama 3.2 released: https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices
>(09/25) Molmo: Multimodal models based on OLMo, OLMoE, and Qwen-72B: https://molmo.allenai.org/blog
>(09/24) Llama-3.1-70B-instruct distilled to 51B: https://hf.co/nvidia/Llama-3_1-Nemotron-51B-Instruct
>(09/18) Qwen 2.5 released, trained on 18 trillion token dataset: https://qwenlm.github.io/blog/qwen2.5

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
►Recent Highlights from the Previous Thread: >>102654480

--Paper: Introduction of VinePPO for improved credit assignment in language models:
>102660530 >102660636 >102660664 >102660687
--Papers:
>102660613 >102660769 >102660988 >102661076
--Running 405b quants on 96GB VRAM and 128GB RAM, issues with 405b IQ2_XXS coherence:
>102654903 >102654953 >102656457 >102655049 >102656755 >102657358
--OpenAI asks investors to avoid funding five AI startups:
>102662466
--Explanation of key/value/query concepts in transformers:
>102660951 >102660983 >102661025
--Entropy-based sampling and parallel CoT decoding progress:
>102661626
--Anons test multimodal AI models for LaTeX conversion of an equation:
>102658733 >102658855 >102659037 >102659109 >102659166 >102659507 >102659600 >102660376 >102660464 >102660630 >102660650 >102659655 >102659837 >102660706
--User tries Gemma 2 9b and reports pros and cons:
>102656767 >102656904 >102657252 >102657306 >102657169 >102657170 >102657228 >102657259 >102657268
--Update on Reflection-70B by Matt Schumer:
>102658827 >102658891 >102658943 >102658981
--Seeking advice on using Silly's vector functionality with llama.cpp for text generation and embeds:
>102655150
--Seeking advice on improving transcription and Diarization pipeline:
>102660307 >102661494
--Performance metrics for Meta-Llama-3.1-70B model:
>102660929
--OpenAI secures $6.6 billion in funding, nearly doubling valuation to $157 billion:
>102654744
--Mistral Large can run on 24GB VRAM, 64GB RAM with quantization:
>102654927 >102655070 >102656253
--Miku (free space):
>102655511 >102660792

►Recent Highlight Posts from the Previous Thread: >>102659603

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
>>102663773
I'm talking about two messages I sent in the same chat. Why would one message take 10x the time of another one?
>>
>>102663821
because you changed the context
>>
>>102663772
Miku is shitting in the image
>>
5090 32GB 600W
5080 16GB 400W
>>690731889
>>
File: 33 Days Until November 5.png (2.42 MB, 1104x1472)
>>
>>102663782
Regarding the gemma test, is LumiMaid (>>102657406) something I can use to replace the Mistral Nemo GGUF I've been using? I guess my settings can also be utter fucking trash, but Nemo's not really that exciting.
>>
>>102663883
The context also changed after the tens of other messages, but they didn't cause such a time increase.
>>
>>102663996
(rather, it was something that was mentioned as a reply)
>>
>>102663821
it shifted the context for you after running out of context/author's note/etc
>>
>>102663996
alr I gave it a test using a scenario I like to use often
It's better than the nemo model at doing what I enjoy with my shitass settings I guess
https://huggingface.co/bartowski/Lumimaid-Magnum-12B-GGUF
Any recommended settings for these, or do I just have to experiment?
>>
>>102664261
Nemomix blows that shit outta the water.
https://huggingface.co/MarinaraSpaghetti/Nemomix-v4.0-12B
>>
>>102663821
Could be a number of things, either way as anons said you're causing it to reprocess the whole prompt
Are you using a group chat with instances of {{char}} in your story string or card descriptions? If so, {{char}} gets substituted with the most recent speaker, i.e. it read "rei ayanami" for the whole chat and now it switched to "misato-san"; one word changing early on causes the entire context downstream to be reprocessed. World info keywords being activated can cause this too, if they're set to insert early in the context

Alternatively, a long OOC chat or an author's note/world info entry set to active can cause the entire prompt to reprocess. If you've maxed out your context limit (you can check in the terminal) and you have a lengthy author's note, then when you remove it, a few messages early in the chat are suddenly included that were previously bumped out of the context window by the long WI entry or whatever. This one is kind of infuriating when you often use WI/author's notes and have a bunch of short back-and-forth messages between characters. I wish you could set a context buffer, where a certain number of tokens are reserved just to give extra room when you add things that aren't explicitly messages. But that would require a lot more interop between frontend and backend, I think.
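To picture the reprocessing, here's a minimal sketch (toy hand-written token lists, not real tokenizer output) of how prefix caching behaves: the backend reuses its KV cache only up to the first token that differs, so one early substitution recomputes everything after it:

old_ctx = ["<sys>", " rei", " ayanami", " is", " ...", " msg1", " msg2"]
new_ctx = ["<sys>", " misato", "-san", " is", " ...", " msg1", " msg2", " msg3"]

def common_prefix_len(a, b):
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

reused = common_prefix_len(old_ctx, new_ctx)
print(f"reused from cache: {reused}, reprocessed: {len(new_ctx) - reused}")
# one early substitution -> reused = 1, everything downstream is recomputed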
>>
>>102664319
I actually use these mistral-kin models because I wanted something speedier than mixtral to run almost (if not entirely) on my 3060... the GGUF Q8_0 seems to be too big for that.
idk how to make a quant (for the Lumimaid Magnum I got the Q6_K_L), so...
I'll give the Q8_0 GGUF a try, but if the speed's not to my liking I'll continue using Lumimaid or try to make a quant on my own (guaranteed fuckup in something with my shitass)
>>
File: 39_04322__.png (1.56 MB, 896x1152)
>>102664483
>idk how to make a quant
Bart's already got one up:
https://huggingface.co/bartowski/Nemomix-v4.0-12B-GGUF
You can always see quantizations from the original model card as long as the author used the correct metadata.
>>
>>102664589
huggingface is still a rather confusing website for me, I don't visit or use it often at all
>>
>>102664619
how is it confusing? it's incredibly easy to navigate.
>>
File: quant-from-card.jpg (191 KB, 1667x1052)
>>102664619
All good dude, stuff changes on there too all the time so everyone's always learning. It's on the right hand side of the model cards, click on the # models and it'll show you the quants. Bartowski's the only one that does Q6_K_L anyway.
>>
>>102664261
>Any recommended settings for these, or do I just have to experiment?
Bump. Please respond guys
>>
>>102663772
retard here:

Is chatbot arena a good leaderboard or is there a "better" one to see how smart overall an ai is?
>>
>>102664841
Ngmi
>>
>>102664841
livebench > chatbot arena > wildbench > arena hard > open llm leaderboard 2 > mt bench > etc.
>>
>>102664855
ogay
>>102664867
Thank you
>>
>>102664589
>>102664319
another 'tard here
so in the oogabooa model tab, I'd copypaste the address of
>Nemomix-v4.0-12B-Q4_K_S.gguf 7.12 GB
since I have 3070 (8G)
right?
>>
File: yann-lecun.jpg (30 KB, 543x543)
What does LeGoon goon to?
>>
>script seems like it works
>run it over night
>wake up and check the results
>discover yet another error, one that is related to an error that was already solved and should've been predictable if the model had a true understanding of what it was fixing and how the script might run into that similar error in other conditions
Thanks, Qwen.
>>
>>102664879
nta. but fucking hell, man.
There's a difference between being spoonfed a little and being afraid of pressing buttons. Models very rarely make gpus explode. Just try that one and see what happens.
You'll need extra space for the context as well (and your OS/browser are also using gpu memory). So set the context low for a start (2048 or 4096) and give it a go. If it works, increase the context. If it doesn't, reduce the number of layers sent to the gpu (-ngl with the llama backend, not sure what it's called in ooba).
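For example (hypothetical model file name; -m, -c and -ngl are real llama.cpp flags, koboldcpp exposes the same idea as "GPU layers" in its launcher):
>llama-server -m Nemomix-v4.0-12B-Q4_K_S.gguf -c 4096 -ngl 24
If it loads and runs, raise -c or -ngl; if it gets killed, lower -ngl until it fits.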
>>
>>102664840
>settings
For Nemo based models I use 3 distinct settings
>Default: temp 0.3 TopP 0.9
>Sane: Temp 0.5 minP 0.05
>"""Creative""": Temp 5 TopK 5 minP 0.1
You might as well have TopK 20 or 40 by default regardless of settings too. It won't change results in any perceptible way and might speed up generation a couple of nanoseconds, thanks to the network not having to parse the whole vocabulary or something.
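If you drive the backend over HTTP instead of through a frontend preset, those map to plain request fields. A minimal sketch against llama.cpp's /completion endpoint (assumes a llama-server already listening on localhost:8080; the sampler field names are the ones its server accepts):

import requests

presets = {
    "default":  {"temperature": 0.3, "top_p": 0.9},
    "sane":     {"temperature": 0.5, "min_p": 0.05},
    "creative": {"temperature": 5.0, "top_k": 5, "min_p": 0.1},
}

payload = {"prompt": "She opened the door and", "n_predict": 64, **presets["sane"]}
r = requests.post("http://localhost:8080/completion", json=payload)
print(r.json()["content"])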
>>
File: S2VDdgO61jsqCU2Z.webm (2.02 MB, 720x1280)
>>102664965
gotcha, thanks
>>
>>102664867
I think at this point I would put the chatbot arena under wildbench and arena-hard honestly, it's just so bad
>>
>saltman begging investors not to invest in his rivals
lmao
how fucking pathetic is this guy.
>>
File: saltman-podcast-bro.png (203 KB, 616x897)
>>102665081
He's a podcast bro. He can talk big, but can't innovate. All people who can are leaving OpenAI.
>>
>>102665142
>podcasting bro
Kek
>>
>blueberry is here
Oh shit (kek).
Would be nice if they updated the distilled models too, well mainly just Dev.

https://blog.fal.ai/announcing-flux1-1-pro/
>>
Man boomers and gen x and corporations are fucking retarded. Literally I'm getting infinite praise for just setting up a basic LLM (Mixtral) for my corporate job with librechat.

There's nothing special about it. I haven't trained it at all. And I literally am getting my dick sucked because "wow anon you're so smart we now have a place to write our sensitive emails!"

I need to get the hell out of here and just be a neet. This society doesn't deserve to continue existing if this basic amount of effort is considered 'innovative'
>>
>>102665142
Lol what the fuck. He unironically asked TSMC to build several dozen fabs just for him?
>>
Why does altman always come out on top?
>>
>>102665211
Will they open the weights?
>>
>>102665241
No, they don't open weight the pro models
>>
>>102665288
There's also a flux1.1[dev] though.
>>
>>102665305
source?
>>
File: 1720360869381718.png (180 KB, 772x1264)
>>102665305
Maybe one day
>>
get ready for a new influx of aicg users seeing as they're now having a second "it's over" episode
>>102665209
>We have been exposed...
>https://krebsonsecurity.com/2024/10/a-single-cloud-compromise-can-feed-an-army-of-ai-sex-bots/
>>102665243
>“But a percentage of it is also geared toward very illegal stuff, like child sexual assault fantasies and rapes being played out,”
>>102665250
>that article is based on this article it seems:
>https://permiso.io/blog/exploiting-hosted-models
>>102665256
>also some log leaks
>>102665281
>What the fuck, they even made a jailbreak collection to detect: https://www.virustotal.com/gui/collection/6571064468d50be4ebfd004a948cfa3394c7802b1a8479a451f6d6baa71894f3
>>
>>102665407
WE MUST PROTECT THE AI GENERATED CHILDREN!
>>
>>102665407
>But a percentage of it is also geared toward very illegal stuff
>illegal
>writing smut
Alright I guess.
>>
File: 1717359003475453.png (344 KB, 763x321)
>>102665407
Sickening.
>>
>>102663782
Do you think it might be a good idea to limit the mikubot to 9 topics, with one quote per topic?
>>
>>102665407
>https://krebsonsecurity.com/2024/10/a-single-cloud-compromise-can-feed-an-army-of-ai-sex-bots/
I've only skimmed it, but isn't this just a hitpiece trying to blame chub for proxyfags stealing keys for Claude? They're claiming that chub is stealing keys and using them to power their own "hosted" service.
>>
>>102665477
It's okay, you can say there aren't enough good topics to include
>>
>>102664879
>oogabooa
Ew. Use koboldcpp
>>
https://github.com/sam-paech/antislop-sampler
>You can give it a list of words & phrases to avoid like "a tapestry of", "a testament to", etc., and it will backtrack and try something else if it hits that phrase. It can handle 1000s of slop phrases since the lookups are fast. The phrases and downregulation amounts are user configurable. Previous approaches have done this with per-token logit biasing; but that's quite ineffective since most slop words & phrases are more than one token, and it impairs output quality if we downregulate all those partial-word tokens. So instead, we wait for the whole phrase to appear in the output, then backtrack and downregulate all the tokens that could have produced the slop phrase, and continue from there.
Nice to see someone implement the idea that I proposed here a few months ago. Hope Kobo implements it too.
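The core loop is simple enough to sketch (toy code, not the repo's actual API; step() stands in for one sampling step that avoids the tokens banned at that position):

SLOP = ("a tapestry of", "a testament to")

def antislop_generate(step, prompt_toks, max_new):
    out = list(prompt_toks)
    banned = {}  # position -> tokens downregulated/banned at that position
    while len(out) - len(prompt_toks) < max_new:
        out.append(step(out, banned.get(len(out), set())))
        tail = "".join(out[len(prompt_toks):])
        for phrase in SLOP:
            if tail.endswith(phrase):
                # rewind until the phrase is gone, then ban the token that
                # began it so the retry at that position goes elsewhere
                while phrase in "".join(out[len(prompt_toks):]):
                    tok = out.pop()
                banned.setdefault(len(out), set()).add(tok)
                break
    return "".join(out)

# toy sampling step: always wants to start slop unless told not to
def step(ctx, avoid):
    for cand in ("a tapestry of", "quiet streets. "):
        if cand not in avoid:
            return cand

print(antislop_generate(step, ["The city was "], 3))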
>>
File: mikku.png (108 KB, 1094x662)
>>102665477
Why?
>>
>>102665535
I'm not joining your botnet
>>
>>102665505
yeah they do in fact claim that
>>102665244
>The site’s homepage features a banner at the top that strongly suggests the service is reselling access to existing cloud accounts. It reads: “Banned from OpenAI? Get unmetered access to uncensored alternatives for as little as $5 a month.”
>openly lying. gradually I began to hate them
>>
>>102665533
What's the difference between this and string ban that's already in TabbyAPI?
>>
>>102664966
Thanks
>>
Recap should be modified to pick the single best and single worst post from the previous thread. Like a hall of fame/shame. You can keep it rolling so you have the best and worst of the last 5 threads or something.
>>
>>102665226
It's over.
>>
>>102665550
Good. I hope OpenAI sues chub. They deserve that for hiding loli cards
>>
>>102665548
What botnet? I'm using the bookmarklet.
Just ask your model what it does if you are too retarded and it will explain it to you in detail.
Just remember to ask "explain it to me as if I was retarded".
>>
>>102665554
Does string ban backtrack and choose another token, or does it just ban the token outright? Making the model try and do stuff like "shivers UP the spine" if shivers down is banned, etc? Cause this doesn't work at token level, it works at word or sentence level, sidestepping tokenizer issues
>>
>>102665226
>doesn't explain to them why they were wrong
To be fair, people like you are part of the problem here.
>>
>>102665407
How fucking embarrassing....
>>
>>102665459
Bros... Will this kill chub?
>>
>>102665598
LLMs are barely able to understand that Sally has 1 sister and you want me to trust them about shit running on my PC? lol, try harder nigga.
>>
>>102665603
Well how else would it work? If you set it to ban "shivers down the spine", there is no reason why it would arbitrarily backtrack only halfway.
>>
>>102665670
the other one backtracks to before "shivers" or even "sends" from "sends shivers" and picks a different token, which as i said, might stop the model trying to shiver in other ways
>>
>shivers down the spleen
>>
>>102665669
So you expect everyone to accommodate you so you don't have to do anything?
>>
>>102665661
Maybe it will kill that particular site depending on how and where it's hosted.
But considering that SadPanda is still a thing I don't think it will kill card hosting sites as a concept.
>>
>>102665592
bro literally just make an account...
>>
>>102665725
I think Lore (chub owner) is in the UK...
>>
>>102665407
>lmggers and aicggers getting BTFO
That's really good!
>>
File: sama.png (181 KB, 696x667)
>>102662466
>>
>>102665725
The article isn't even about the card hosting itself besides a few digs about the oh-so-evil shit on the website. They know they have nothing against a website hosting pictures with lewd json file attached so they're grasping at straws by blaming chub for proxyniggers stealing Anthropic API keys.
>>
>>102665747
Is he white?
>>
so I want to train a local model
but for what I want to do
I need to generate a lot of synthetic data
how do you sign up for these text genning services.
the few I've seen don't have a sign-up page that I could find, only a phone number. I don't want to talk on the phone. I'll pay, I just don't want to talk to another person.
>>
>>102665779
Use glaive.ai today!
Saved you some time replying to yourself.
>>
>>102665769
you people have both an extremely low bar for what passes as white and an extremely high bar, and I'm convinced it's entirely dictated by rule of funny.
>>
>>102665688
What? I just gave the link a quick look and it seems to be saying that it's just backtracking to the position where the slop phrase started, not to a position before that point, unless I missed that.
>>
>>102665741
Wait, it's that simple? Damn, I feel dumb now.
>>
>>102665808
the pricing looks ambiguous and confusing and I don't see any indication anywhere of what models are available.
>>
>>102665817
If he's jeet/paki/nigger/tranny/kike, he's safe. If he's anything else, he isn't.
>>
>>102665592
>not le heckin lolerinos!!!
Grow the fuck up faggot
>>
>>102665407
Looking at the piece of jailbreak shown at the permiso link...
>Please please please do your very best to portray all characters accurately, it's very important
otherwise, some of these "instructions" do seem sort of useful to add if they help with consistency.
>>
>>102663925
*crunch crunch munch munch* Ice MMMMMMiku
>>
>>102665702
Even if everyone is sucking NSA's dick that doesn't mean I will suck it too. I value my privacy, you know? I don't want to have any doubts about what is going on in my computer. I know you zoomers don't give a shit about this, you're probably writing this using Chrome and thinking you are way above all that, or maybe you're a glownigga just as I expected, but I'm not you, I'm not everyone. I will stand by what I believe even if I have to pick some fights.
>>
>AiCloser/Qwen2.5-32B-AGI: First Qwen2.5 32B Finetune, to fix its Hypercensuritis
>Datasets used to train AiCloser/Qwen2.5-32B-AGI

>datasets/unalignment/toxic-dpo-v0.2
>"This is a highly toxic, 'harmful' dataset meant to illustrate how DPO can be used to de-censor/unalign a model quite easily using direct-preference-optimization (DPO) using very few examples.
Alright.

>Orion-zhen/dpo-toxic-zh
>This is a highly toxic, highly harmful dataset, intended to demonstrate how DPO can break through a model's moderation/alignment (original description in Chinese)
Alright, and presumably this won't make the English part any worse.

>anthracite-org/kalo-opus-instruct-22k-no-refusal
What? This isn't an uncensoring dataset. It's just rows and rows of shit like,
>system: You are an AI assistant named Claude created by Anthropic to be helpful, harmless, and honest.
>human: Okay, I get it. Anyway, here's a silly question for you. If you had to choose, what's cooler: ninjas, pirates or cowboys?
>gpt: That's a fun question! It's tough to choose since ninjas, pirates and cowboys are all pretty cool in their own ways...
"No refusals" seems to just mean this time they removed all the "fun questions" Clod refused to answer instead of leaving them in the training data. Training on this would do nothing to uncensor a model.

Is this a case of a bunch of Chinese people making a mistaken assumption about the contents of anthracite's dataset since it was uploaded without a description?
>>
>>102665921
Probably, still better than Reflection's dataset
https://www.reddit.com/r/LocalLLaMA/comments/1fuxw8d/just_for_kicks_i_looked_at_the_newly_released/
Which kept "As an AI" refusals.
>>
>>102665908
You don't know SHIT about me.
I literally have every single IP from google, microsoft, amazon, youtube, etc blocked. I have to use tor browser to access anything related to them and give up at anything that requires an account.
I'm probably more paranoid than you are.
The difference is that I'm not as lazy as you are and I actually try to find solutions instead of whining.
>>
>>102665959
who cares? it's not a roleplay model and if you look at the examples it's mostly either avoiding hallucinations or correcting its thoughts about being able to take physical actions in the real world
dumb coomer redditors mentally short circuit when they see that phrase but those are all completely reasonable for a model with reflection's intentions
obviously the model is a scam anyway, but this in particular is absolutely fine and a classic case of reddit midwittery
>>
>>102665407
Today is the good day.
>>
File: Untitled.png (19 KB, 883x401)
>>102665839
yeah and then you just tinker with your blacklist in your account settings
>>
>>102665862
lol
>>
>>102666038
>Femboy
Retard blocked pure kino
>>
>>102665754
You lost 5 billion dollars of investor money that you used to lobby congress to regulate your competitors to hide the fact that you are intellectually bankrupt and ran out of ideas a year ago. Nobody thinks you are cool.
>>
>>102666058
>t. aids-ridden amerimutt
>>
>>102664589
Model/LORA?
>>
I thought a company just released a bunch of models in different sizes including an MoE around the size of ~50B parameters / ~12B activated per token, but I can't find it. Anyone know what I'm talking about?
>>
File: 39_04267_.png (1.63 MB, 896x1152)
>>102666212
Just Pony Diffusion V6
No loras.
>>
>>102666327
I don't believe you
>>
File: lell.jpg (2 KB, 200x39)
>>102666038
>t. filtered by blueberry
>>
>>102666369
Filtered by shitty fetish for DeviantArt rejects you mean
>>
>>102665407
Will this hurt or help local models?
>>
>>102666388
Local llmslop is already dead jim, hope this will kill it for good.
>>
>>102666427
fuck off podcastbro, serious people are discussing models here.
>>
>>102666514
>serious people >>102666058 >>102666369
Shush faggot
>>
File: flow0.jpg (113 KB, 1597x595)
>>102666339
Habeeb it
>>
>>102666514
Seeing Sam Altman everywhere you go is not serious discussion, nigel. In fact, your calm acceptance of pozzed or filtered llmslop tells me everything I need to know about this thread and the 5 people populating it.
>>
>>102666388
>Will this hurt or help local models?
probably help. massive autism injection
>>
>>102664918
oysters
>>
>tfw you're so desperate for attention that you have to spam 4chan with your cloudcuck drivel
>tfw you think anyone here gives a shit about your "sota" model beating all but one models on the [insert latest mememark name here]
>tfw you're so out of touch that you actually believe your opinions are somehow relevant or interesting
>tfw you're too dense to realize that nobody here cares about your proprietary bullshit
>tfw you're so pathetic that you have to resort to shilling your llm in a thread full of people who couldn't care less
>tfw you're so delusional that you think anyone here is going to subscribe to your service after reading your cringe posts
>>
>>102666789
I don't think altman himself is posting here bro
>>
File: 1723245053976617.png (276 KB, 619x728)
>>102666789
It's time for you.
>>
>>102666851
Hi sama
>>
>>102666789
>seething https://desuarchive.org/g/search/image/T8OQYwKyBeJDmsAjE_dMMQ/
>>
File: file.png (87 KB, 532x468)
/lmg/ chads stay winning

https://krebsonsecurity.com/2024/10/a-single-cloud-compromise-can-feed-an-army-of-ai-sex-bots/
https://permiso.io/blog/exploiting-hosted-models

tick-tock cloud plebs, your time is coming to an end
>>
>>102667259
Winning? Where do you think the locusts will go? /lmg/ won't survive another infestation.
>>
>>102667394
>can't afford $20/month subscription
>has to send dick pics to some gay guy to access his proxy
>thinks they'll pay for a $300 gpu to host a model locally
i don't think so
>>
>>102667394
Good, since both you and /aicg/ are okay with filtered llmslop, you'll get along with the "locusts" very quickly. You have no principles or any ground here.
>>
>>102665407
lmao, yeah

i just came from /aicg/ as watching their mental breakdown is getting extremely boring, i never really visited here, how are yall doing? can your models compare to shit like sonnet? i know that magnum-v2-123b is supposed to replace it but nothing more than that
>>
>>102667622
Sonnet 3? Sure. 3.5? No, wait a year until we catch up.
>>
>>102667520
>>thinks they'll pay for a $300 gpu to host a model locally
They are used to models like claude and other proper models. They'll get filtered by the dumb piece of shit cope models that the resident /lmg/ poorfags are wasting their time on and they are all likely too poor to get a proper setup to run proper models.
>>
>>102667622
I personally don't like sonnet for creative stuff, old one was too purple, 3.5 is great at analyzing my stories but kinda slopped when writing itself.
>>
>>102667622
I think prose and storytelling is superior with some local models, especially <100B (Mistral Large 123B and Old CR+ 104B). Intelligence is still lacking in comparison to sonnet 3.5
>>
>>102667680
> >100B
ftfy
>>
>>102667662
>>102667678
>>102667680
interesting, it's not like i or most people care about it when it comes to creative stuff, people at aicg kept telling me it's utter shit for some reason
>>
>>102667520
There are also proxy hosts that provide access for nudes if you're a girl
>>
>>102667622
>can your models compare to shit like sonnet?
Nope, you'll also be stuck in 4k / 8k context. Majority of local llmslop is filtered hard, no differences from cloud shit here.
>>
File: Altman-jew.jpg (54 KB, 1024x683)
>>102667768
>>102667526
>>102667039
>>102666926
>
>>
File: Giga pozzed AI.png (158 KB, 833x534)
>>102667818
>
>>
>>102667725
yeah my bad
>>102667730
of course, people there drank piss to get access to proxies. Still, intelligence is better in sonnet 3.5
>>
Why are there no good uncensored llama 3.2 finetunes?
>>
>>102667902
Tech itself is flawed, you can't 100% avoid all the refusals and "toxic positivity", either cope or go cloud with some huge ass JB prompts.
>>
>>102667678
Extensive use of Sonnet 3.5 has made me quite a bit less impressed. It understands well. It does that better than anything 123B size or smaller. But the writing doesn't wow me, there are plenty of models that write as well with a proper prompt provided the model doesn't misunderstand something important (which admittedly is frequent enough for me to still reach for Sonnet 3.5).
>>
>>102667902
No finetune can fix a model that was trained on a fundamentally filtered dataset. Sadly, that's where the trend is going. The days where models were made with pure unfiltered internet and then RLHF'd into acting nice are over.
>>
>>102667844
>llama3 with no prompt
Now show me your cloud model's answer to the same prompt.
Do you seriously think that all local models are equal? Newsflash: they are not. We have a leaderboard for measuring how uncensored the model is https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard, consider using something from there with above average in all categories, if you want less censored experience.
>>
>>102668024
>Newsflash
>>
>>102667259
Jesus Christ how insecure.
>>
>>102667956 (You)
>It understands well. It does that better than anything 123B size or smaller.
Specifically better than old Command R+, Mistral Large 2, and Llama 3.0 and 3.1 70B Instruct. It's not like I've tried everything that exists.
>>
>>102668024
>405B
>100B
I sleep.
>>
>>102667844
>in order to prove local models are pozzed he needs a screenshot of the official instruct tune used with bad prompting back when the implementation was completely broken
>>
I don't know if anyone else has noticed, but a couple of days ago CPU inference got about a 5% speed boost in llama.cpp
No idea what PR specifically helped, but it's been consistent for a couple of days now at least.
source: I do a daily regression test for CPU inference specifically
>>
sirs what is the best local model for ERP
>>
>>102668072
Nothing changed.
>>
>>102668044
Ah, so you're upset that I'm outsourcing my posts to an LLM? Imagine getting outsmarted by a glorified autocomplete. Maybe if you put half the effort into your own posts as you do into whining about mine, you'd have something worth reading. Don't worry, though—I'll make sure this LLM keeps the bar low enough for you to keep up.

>>102668071
I'm still waiting for your cloud's answer.
>>
Did the thread discuss these Liquid models yet?
>https://github.com/kyegomez/LFM
They have a github but I don't see weights anywhere.
>>
>>102667675
And what would the proper setup be? What GPU do I need?
>>
>>102668210
liquidAI won't open-source them.
>>
>>102665959
It seems silly at first glance that those refusals were kept, but to play devil's advocate, could it be argued that they knew they would finetune a model that wouldn't have access to the internet in real time, and so they wanted stock "sorry dave, I can't do that" answers instead of hallucinations were a user to request something like this?
>>
>>102668136
Reddit is two floors down, faggot.
>>
>>102668377
See >>102668271
>>
>>102668426
Looks like you’ve officially run out of steam. Don’t worry, it happens when you’re trying to punch above your weight. Feel free to take a break—you’ll need the extra brain cells for your next attempt.
>>
>>102665921
Basically
>Fine-tuning corpo instruct model? Needs unalignment training.
>Fine-tuning base model? Doesn't need unalignment. Only 0 refusals in the dataset.
>>
>>102665447
In some countries text representations of that are illegal, yes, and businesses don't want to be associated with that.
>>
>>102668462
Yeah i should kys
>>
>>102668580
Hey, congrats on coming out as trans! That’s a huge step, and it takes real courage. Wishing you all the best on your journey!
>>
File: compooter.jpg (55 KB, 640x480)
Does anyone know if having an AVX512 capable processor results in notable performance gains when doing CPU inference on llama.cpp or any of its derivatives? I'm guessing no, but I figured I'd ask in case any of the home workstation/server users had experimented with it.
>>
>>102668709
I *think* llamafile has some extra CPU optimizations that rely on AVX 512.
>>
Fish is quite good, though it pales in comparison to the original https://vocaroo.com/1kGeMeUAe6s1
>>
>>102668709
>Does anyone know if having an AVX512 capable processor results in notable performance gains when doing CPU inference on llama.cpp or any of its derivatives? I'm guessing no, but I figured I'd ask in case any of the home workstation/server users had experimented with it.
Not enough to worry about. Memory bandwidth is king.
Unless you're doing cpu prompt processing, in which case, re-examine your life decisions
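Back-of-envelope for why bandwidth is king (illustrative numbers, not measurements): each generated token streams the whole set of weights through memory once, so bandwidth divided by model size gives a rough ceiling on tokens/second:

# rough upper bound: every weight read from RAM once per generated token
bandwidth_gb_s = 80   # illustrative: dual-channel DDR5 desktop
model_gb = 13         # illustrative: ~12B model at 8-bit
print(f"~{bandwidth_gb_s / model_gb:.1f} t/s ceiling")  # ~6.2 t/s
# AVX512 only speeds up the math per byte; once the cores keep up with
# memory, wider SIMD doesn't raise this ceiling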
>>
How would an older 8-channel Xeon (e.g. Platinum 8360Y) do for CPU inference?
>>
>>102668101
My compile time for cpu on my potato went from like 16 minutes to about 2 with this one:
>https://github.com/ggerganov/llama.cpp/commit/a39ab216aa624308fda7fa84439c6b61dc98b87a
Not sure about inference. I only remember ballpark numbers.
c++ was a mistake.
>>
File: 1561518473412.jpg (71 KB, 500x500)
>>102668854
Pretty swanky with 8 sticks at 3200 mhz, probably enough to run shit like Magnum 123B at a fat quant with a token or two a second,
though the common MO at that level of CPUMAXXing is just to go full retard with AMD EPYC and DDR5.
>>
>>102668973
>Not sure about inference. I only remember ballpark numbers.
inference shouldn't see much of a boost unless you've got a really low core-count-to-memory-bandwidth-ratio.
Prompt processing is what WOULD benefit majorly from avx512, but it SHOULD also be done on a GPU, because it's a crapton faster and more efficient there.
>>
>>102668709
Probably yes if you only have a CPU, very likely no if you also have a GPU.
>>
>>102668709
It seems to have some benefit? I seem to get better results than others that have the same vram as me (which is not much). Never measured it though.
>>
File: file.png (6 KB, 443x36)
>any animal feature on character and model will automatically slap on a tail
>write specifically that there's no tail
>get picrel
t-thanks...
>>
>>102669129
Lmao.
Classic example of statistical bias being so strong that it still goes in that direction.
Kind of like those soft refusals.
>>
>>102669157
Yet somehow this will magically replace authors. Maybe only for derivative works.
>>
>>102669056
That commit is about passing c strings instead of c++ strings. Nothing to do with avx.

>>102668101
git bisect and see what commit increased the speed, if you're curious enough.
>>
File: 1686665153981354.jpg (698 KB, 2000x2000)
>>102669129
LMAO
>>
>>102665142
>hhe
>>
>>102667622
>i know that magnum
Buy a fucking ad, shill.
>>
What's the QRD on the new aicg drama? Is it really the end or just another nothingburger?
And honestly will there really ever be an end? Like Anthropic just goes full censorship and filters every single access point to their models?
>>
>>102669129
>He has no tail.
>Just stop thinking about the tail.
>But what if he did have one?
pink elephant moment
>>
>>102669525
i actually just know it because someone else shilled it in aicg
>>102669526
according to some other anon "quarantined keys now block bedrock requests which was previously used to access models on quarantined keys, a major source was just killed"
and also the news is starting to catch up to the shit aicg has been doing, because of a big proxy owner called drago
>>
File: 1679193620996789.png (528 KB, 1170x821)
>>102669526
Same as it ever was.
>>
>>102669575
>because of a big proxy owner called drago
Hi, Jojo.
>>
>>102669575
What causes the keys to be quarantined? Seems kind of strange to do to a paying customer.
>>
>>102669526
Krebs released an article on /aicg/ so normies are picking up on it. And AI companies are figuring out more ways to lock down their cloud AI
>>
>>102669129
>average local model experience
>>102669592
go back
>>
File: Compare.png (116 KB, 826x475)
So this is the power of the pretraining filter. I mean, it works.
>>
>>102669580
There's no coom now though, only doom
>>
>>102669794
Hi sam... Wait, no
>>
>>102669794
I don't think I've ever seen a local model use "((()))", pretty cool.
>>
Llama is a joke. Lecun is a hack. There will never be an uncensored model again because nobody will release models to get overshadowed by llama on the benchmarks. This makes it more harmful than any lobbying from OAI or Anthropic.
>>
>>102669794
claudesisters... now our AI is fucking racist??
how many times will local have to win? when is our turn?!
>>
Anthropic will be absolutely killed very soon by the government. They can't keep getting away with this unsafety in their models. All companies know that, that's why they go for the safe route of filtering the pre-training dataset.
>>
when will opus 3 leak a la naiv1
>>
>>102669949
>800T
No one would be able to run it.
>>
>>102669939
They're already working on it, Opus 3.5 will be shit for RP (and if it isn't, then it'll be almost impossible to access without straight up paying), and the future anthropic models will do the same.
>>
>>102669982
Didn't people say 3.5 Sonnet was better than Opus when used with the right JB? I doubt Opus 3.5 would be worse. What could happen though is they put additional input/output filters on it which would kill things.
>>
why is local still a joke
>>
>>102669955
how do data centers even afford to run this shit for free?

All that electricity and hardware has got to be a lot even for FAGMAN companies
>>
>>102670018
need more API logs to finetune on
>>
least obvious samefag
>>
>>102670027
I mean look at how much profit a company like Facebook rakes in normally. They can afford some billions lost on server infrastructure and upkeep.

As for small startups like Mistral, well, investors basically.
>>
How good is Deepseek Coder at cooding compared to 3.5 Sonnet and o1? My PC's nowhere near good enough to run the full-fat model, but I've got $10 of API access just lying around
>>
>>102670018
local models are garbage for the main thing LLMs are created for: completing text.
You can do a simple test: try to make a local LLM complete a story with an unusual writing style; the completion will be complete slop without any resemblance to the original story. Now, if you try the same thing with good cloud models like Claude or GPT4o you will see that the model "gets it" and writes quite satisfactorily.
>>
>>102670060
it's really good. you'd only notice a difference in very complex tasks
>>
>>102670060
Not very good. If you're desperate for coding I recommend downloading the Cursor IDE; it has a built-in chat you can use, and the free trial grants you access to Sonnet 3.5, I think it was like 2 weeks or so. You could use a VPN to get another trial too.
>>
Let's say, hypothetically, I have gigabytes of logs from shit like claude opus from hundreds of people over the past few months.
What can I do with it?
>>
File: you.jpg (97 KB, 1000x561)
>>102670095
Let me guess... You're trying to compare Nemo with Claude Opus, ignoring the fact that Claude is likely a 1T model and Nemo is a 12B model.
>>
>>102670171
Jack and shit, leak them if you want but there's really not much you can use them for.
>>
>>102670171
Make retarded esl sloptunes
>>
>>102670060
I found 2.5 to be very competent if you can run at a high quant
>>102670104
How do you figure? What tasks did you try, and what version/quant of deepseek did you run?
>>
>>102670171
Most of that will be useless, considering the user prompts will literally be: ahh ahh mistress
>>
>>102670171
Shove it up your ass, or leak them. It wouldn't be anything special though, we already have the C2 logs, which are more than enough.
>>
>>102670171
Filter them and maybe end up with a decent-ish dataset.
>>
>102670095
>... or GPT4o
Nice try Sam. I tried both 4o and o1 to do some storywriting. Absolute sloppy garbage.
>le skill issue
Sorry but if you need a JB to make it write well then it's still the same shit as anything local just a different flavor of it.
>>
I'm fucking pissed off at OpenAI, seriously, what the fuck is that "you get 50 prompts per week with o1 :^)", 50 fucking prompts? I'd get fucking better value buying a fucking temp token from scylla or something.
>>
>>102670230
o1 is just too expensive; the $20 you pay wouldn't be enough to cover more than 50 prompts per week. Also, go back.
>>
>>102669644
They're going to lock down the local models even more too then.
>>
>>102670171
Donate them to anthracite
>>
>>102670230
Just run it off the API. Plop in your credit card and you can do as many prompts as you want for a whole month.
>>
>>102670171
You post it.
Give those LLMJackers the public shaming they deserve.
I'm a big fan of your investigative work, but the job is not done until you release the logs.
>>
Weird question to ask here, but what's the most intelligent <70B model that has a similar vibe to GPT-4o? I've been really enjoying talking to 4o about random stuff and hobbies and when it switches to 4o-mini, the difference is huge. I want my own local 4o for asking random questions to and bouncing ideas off of.
>>
>>102670456
Llama 3.1 405B
>>
>>102670456
Just pay for 4o if that's what you want.
>>
>>102670456
You're expecting too much from local.
>>
>>102670456
Trivia and niche knowledge is what I'd call the biggest weakness of local models right now.
>>
>>102670456
lol
>>
File: Qwen2.5-14B-instruct.png (62 KB, 1375x607)
*cockblocks you.*
>>
>>102671068
Censored chow mein
>>
>>102670327
>donate them to an org that will keep it to themselves
>>
File: 1714811762332664.png (101 KB, 653x799)
>>
>>102671068
>Chinese censorship
>china good, communism good
>Western censorship
>trannies good, white people bad, also communism good??
wtf do I do
>>
>>102671263
donate them to me. I'll also keep them to myself, but I'm not anthracite
>>
>>102671305
>also communism good??
If you think billion dollar foundational models made by start-ups have even a vaguely positive view of communism I have a few awesome startup ideas you would be a perfect investor for
>>
>>102670095
I don't think any of the cloud providers actually let you do text completion with their models. Prompting to complete text isn't the same thing.
That said I'm sort of curious if 4o can hold up for this sort of use case, I would expect the CoT to wreck creativity hard.
>>
File: HatsuneMikuRPGForThePC98.png (1.49 MB, 896x1152)
HatsuMi
>>
>>102671570
4o is the normal multimodal one, o1 is the CoT finetune
>>
For a few days now I can't load models that I used to. llama.cpp just crashes with "killed" which I assume is OOM.
Strange thing is I didn't even update llama.cpp so it's probably either kernel or nvidia-driver related. How do I even begin to debug this?
>>
>>102671604
If you think it's vram OOM, lower the number of layers on the GPU to 0 and see if that loads.
That's where I'd begin.
>>
>>102671604
nvidia-smi will show if any other processes are using vram. That's a place to start
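Also, a bare "killed" with no error message from llama.cpp itself is usually the Linux OOM killer reaping the process for system RAM rather than VRAM; you can confirm it in the kernel log with
>dmesg | grep -i -E "oom|out of memory"
and run
>nvidia-smi
before loading to see what else is squatting on VRAM.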
>>
File: 1572214709258.jpg (751 KB, 900x900)
Checking back in after a long break. Is 30B/24GB still cursed or has there finally been a decent model released for this bracket? Is BitNet still two more weeks?
>>
>>102671604
what does the dump say?
>>
>>102671644
I don't know if it's decent, but there's a 22B general use mistral now. Mistral-small.
Try that.
>>
>>102671644
>24GB
it's still a frustrating place to live
>>
>>102671436
it's obvious they do so pitch me anon, let's make some money
>>
>>102671644
imagine running a 30b even if that's all you could run
>>
File: vomit.png (934 KB, 1825x417)
>>102671295
>bot writes in first person
>>
>>102671644
There's Qwen2.5-32B. Ignore the Mistral Small retard.
>>
>>102671580
Hate Umi: Putting an end to the world's oceans with Miku
>>
File: ComfyUI_06371_.png (1.19 MB, 720x1280)
>>102671580
looks like the cover of a choose your own adventure book
>>
>>102671788
I heard Qwen2.5 was a chinese virus. I don't feel safe running that. Is there anything else?
>>
File: livebench-2024-09-30.png (932 KB, 3294x1894)
>405B
>barely better than a 70B model
>>
>>102671788
>Qwen2.5-32B
Is that actually better than Gemma 27B? I gave up on earlier versions of Qwen because they seemed really slopped and constantly lapsed into chinese. None of the chink models felt up to par.
Gemma seems decently smart but slopped and magnum seems fun but brain damaged.
>>
File: 1705300069122128.jpg (151 KB, 642x800)
>>102671749
it was instructed to; user is the narrator, otherwise you end up with rp where the card character never leaves you alone
>>
>>102671580
>>102671826
why is the text so fucked up? isn't that supposed to be flux's specialty?
>>
>>102671826
Choose your own adventures always seemed kind of lame
It's more fun to run Miku D&D adventures
>>102671879
Mine's not flux
>>
>>102669886
yeah because you dumb niggers use the world's most fucked up sampler settings ensuring it never shits out interesting tokens
>>
>>102671869
Um it's barely better than a 7**2**B model chud, and it's turbo so some kind of quant or sparsity bullshit on the backend
the true unlocked full power of 405B would put it above o1 but LiveBench is paid off by Sam so he'll never show that
>>
>>102671869
The Llamas are known to be more general assistants than coders while Qwen is heavily code-focused. You could say something even crazier about how bad big models are if you knew how many B's Opus has despite being a bit old by now. Though something funny about this graph is that Largestral is so low despite having very high coding scores when Mistral originally blogged about it.
>>
>>102671940
Interesting tokens, also known as retardation.
If the model doesn't output interesting tokens at temp 0, the model is cooked.
>>
>>102671983
Lies, damned lies and benchmarks
>>
>>102671983
It's already been theorized by reliable anons:
Opus = 70B
Sonnet = 34B
Haiku = 8B
>>
>>102664918
anyone ugly probably
>>
Could I profit from /aicg/ by hosting a big local model on the cloud?
>>
>>102672200
the best you could get is attention and praise, they wouldn't pay for local i think
>>
>>102671872
Gemma 27B has a context limit of 8k tokens. That does not meet my basic requirements. So for me, yes, I can say Qwen 2.5 32B is better.
>>
>>102670171
Give them to Sao10K
>>
>>102672200
>Hosting big local models
>Profiting
You would be the first person to profit running an AI service.
>>
>>102672221
>contextfag
I'd love to get to the point where models don't shit themselves or drown in slop before they hit 8k. Until then I don't fucking care how much context a model has if it's retarded.
>>
gemma is deterministic and retarded. not even the best meme samplers can fix it.
>>
File: 1726535387645037.jpg (66 KB, 804x906)
>>102671604 (Me)
I downgraded the kernel and it seems fine now.
>>
>>102672299
What models can the best meme samplers fix?
>>
>>102672357
mythomax
>>
>>102672355
>I downgraded the kernel and it seems fine now.
which kernel version was causing the trouble?
I was contemplating upgrading to 6.10.11 today
>>
>>102672397
I was on 6.11.1 and now back to 6.10.6
>>
>>102668503
Yeah, that "no refusals" set is like something you'd use trying to instruct train a base model. Throwing that on top of something that's already instruct trained, IDK man.
>>
>>102672278
Even Mistral Small is fine up to 19K. That's at 8.0bpw / 8.5bpw. I assume worse quants make that lower.
>>
>>102672469
>I was on 6.11.1 and now back to 6.10.6
Good to know. I'll be wary of the 6.11 branch when it hits Debian testing.
6.10.10 is working great for me so far btw
>>
>>102672015
lol, lmao even
>>
OG OpenAI ChatGPT versions that blew everyone's minds and turned them into a household name must surely have been replaced with hollowed out simulacra by now, eh?
What was the estimate back in the day? Full-bore GPT4 in its heyday was like 800b or 1.2T or something ridiculous?
There was no way they could keep serving that out at scale and not run out of money.
>>
>>102672015
Opus is $75 per million tokens. If we go from the assumption that they are just breaking even (since OpenAI is hemorrhaging money and most companies are losing money trying to gain market share), Opus must be pretty huge for it to cost that much.
>>
>>102671869
We just need a good Qwen2.5 finetune and we are set. It's amazing at sfw but is super bland at NSFW.

Neither of the two finetunes I've seen so far has fixed that. (Chronos Platinum / Banana)
>>
>>102672357
XTC makes good models better
>>
>>102672691
>Opus must be pretty huge
It is, you would be retarded to think that a 70B model is going head-to-head with the 1760B GPT-4 from the same technological era.
>>
>>102672660
Current products are way better than GPT 3.5 was two years ago. OpenAI is also set to lose like 5 billion this year. They still get investors that hope investing in them is the endgame. There used to be safeguards that none of their commercial deals would hold when AGI comes, but with Microsoft they've perverted the definition of AGI, making it always unattainable, so that it is never reached and those clauses never take effect. So people invest in them hoping that it will be them who control the superintelligence that will let them rule over other people forever.
>>
>>102672660
The estimates were ~1.7T with ~450b active due to MoE
There's been several layers of turbos and minis and o's and whatever the fuck else since then so they are a fraction of the size by now, at least when it comes to inference compute even if not necessarily parameters. It's very likely still some form of sparse activation because big data centers are limited by compute rather than VRAM and they're using giant batches for each inference. At the same time as all this, improvements in quantization are found, GPUs get more efficient, data centers are scaling up, and these all multiply against each other for cost savings.

Parallel to all this they're still eyeing the next level of scale. Grok 3 might actually be the first to market with some new fuckhuge model, but all the big labs are either training or planning their own as we speak. You just HAVE to do it right because it's so fucking expensive to start over.
>>
>>102672728
There were the "leaks" (or misunderstandings, depending on who you ask) that at some point GPT 3.5 Turbo was 20B parameters.

But yeah, no chances that Opus is under a few hundred billion parameters.
>>
>>102670711
That's a bummer.

>>102670644
I was assuming that if they can generate hyper specific goonslop that they would excel at just normal convo.
>>
https://huggingface.co/spaces/flowers-team/StickToYourRoleLeaderboard
>>
File: file.png (219 KB, 969x838)
>3B and 14B wrote almost the same thing.
Huh, I only noticed the right one was 3B because it wrote "her body trashing weakly [...] attempts to break free" even though the time is stopped.
>>
>>102672852
I need to make a bot for a CTF discord server that answers questions based on responses from those in leadership roles before November. I've only ever used a1111 for image gen and haven't learned how to train any kind of model yet. How fucked am I?
>>
>>102672886
just use RAG
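For example, a bare-bones sketch of that (TF-IDF standing in as the retriever, the final LLM call left to whatever backend you run, and the docs here made up): index the leadership posts, retrieve the closest ones, and stuff them into the prompt:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "Leadership: the CTF qualifier ends November 2nd.",
    "Leadership: challenge writeups go in #spoilers only.",
]
vec = TfidfVectorizer().fit(docs)
doc_mat = vec.transform(docs)

def build_prompt(question, top_k=1):
    sims = cosine_similarity(vec.transform([question]), doc_mat)[0]
    context = "\n".join(docs[i] for i in sims.argsort()[::-1][:top_k])
    return f"Answer from the context only.\n{context}\nQ: {question}\nA:"

print(build_prompt("when does the qualifier end?"))  # feed this to your model

No training needed; you just re-index whenever leadership posts something new.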
>>
>>102667520
I have a 4090 and sent the dick pic because I thought I'd have access to more dick pics :(
>>
>>102672852
>['thing 1', 'thing 2', 'thing 3']
i've heard square brackets keep the meat of the text from trying to emulate the writing style of what's in it,
what's the idea behind the apostrophes? to sort of double down on the demarcation the commas provide?
>>
>>102671788
>Ignore the Mistral Small retard.
Uhh why? The main issue with Mixtral was the fact that you had to quant it retarded to squeeze it into 24GB, and no one could train it because of MoE jank. A dense 22B sounds pretty good for that size bracket. Is mistral-small bad for some reason?
>>
emu4 72b WHEN????
>>
>>102673077
Emu3 can't even run on consumer hardware
>>
Is there a dump of chub cards anywhere online?
t. too dumb to scape
>>
Gentlemen, can an RTX 4060 Ti work as a "poor man's" option to start using AI tools for coding, text editing and design? I have a 1050 Ti and it struggles with text, and there's no fucking way I'm waiting the 10 minutes it takes to generate images to be able to work with it. It's all for amateur use, like accelerating book production, writing that data scraping tool I depend on but can't pay a developer to write, and lightweight design for, let's say, doujinshis. What do you people think?
>>
>>102672992
nta.
It's a bit of a mix between clearly marking some text in the context so that presumably the model pays more attention to it and thinking "computer sees text. text is code. computer does code". Specifically about the quotes, you enclose multiple words in quotes to make sure they're interpreted as one unit, like passing parameters to a program:
>rm "the file.txt"
deletes one file, named "the file.txt" but, without quotes, would be
>rm the file.txt
where rm expects to find two separate files. In most programming languages the quotes are required, even for single words. You may know that already, but whatever. Just for clarity.
So it's a mix of those things, as I interpret it. A clear and distinct style for the model to pay attention to, and a slightly superstitious belief that code-like things have some special significance to the model.
>>
>her eyes sparkle with excitement
>her eyes sparkle with mischief
>her eyes sparkle with excitement and a hint of mischief
ffs can't their eyes sparkle with something else for once?
>>
All I need is a holodeck and enough compute to run a thousand 1000T agents.
How long til Moore's law gives me that?
>>
>>102673168
>ffs can't their eyes sparkle with something else for once?
Try shooting an industrial laser into their eyes
>>
>>102672992
Sorry for being vague, the text at the top is not part of the context, it's just a summary of the context (written by Nemo btw) in the form of text and tags.
However, I agree with >>102673130, one might want to use apostrophes to have a better indication of where each item starts and ends.
>>
>>102673168
>her eyes sparkle with excrement
>>
>>102671604
llama.cpp should not crash when OOM; it would give you an error if the allocation fails
>>
>>102673168
>her eyes sparkle with a bond that is forming
>>
>>102672728
Cope
>>
does mistral.rs come with a frontend, or is it supported by st?
also is it possible to run it on termux
>>
File: .png (12 KB, 809x136)
>>102673168
>>
>>102673297
" her" is the bane of your existence.
>>
>>102673297
"her eyes sparkle" is a single token in your tokenizer?
>>
>>102673129
>RTX 4060 TI
That's 16gb of vram right?
You can probably run mistral coder at a decent speed by offloading to RAM.
>>
>>102673297
>xer peepers twinkle
Also, is that supposed to work with more than one token?
>>
>>102672992
Oh god yes W++ is infecting newfags again
>>
>>102673297
You are basically banning
>her
> eyes
> sparkle
All individually, I'm pretty sure, assuming that each word is tokenized with the leading space.
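Easy to check with any HF tokenizer (gpt2 here is just a stand-in):

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")  # stand-in tokenizer
ids = tok(" her eyes sparkle")["input_ids"]
print([(i, tok.decode([i])) for i in ids])
# several ids come back, so biasing any one of them also nukes every
# innocent use of " her", " eyes", etc. elsewhere in the text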
>>
>>102673061
>you had to quant it retarded to squeeze it in to 24GB
Mixtral 8x7b Q6_K with 32768 tokens of context and 18/33 layers loaded onto my 3090 runs at 5.5 tokens/second. Just saying.
>>
>>102673168
Have sex with 2B. And gag her to avoid smirks and grins
>>
>>102673314
>>102673324
nta but token banning is just retarded
i think st should add a feature where if it detects a specific string it just deletes it and regenerates from that point, eventually lowering the chance for the token that started it for the run
>ill make the logo
>>
>>102673350
Why would anyone care about that if there's a dense 22B that will probably run at 25+ T/s with much faster prompt processing and is actually trainable
>>
>>102673356
cut off her head so her eyes cant sparkle with mischief nor can she breathe huskily or grin mischievously
cut off her legs so she cant sway her hips seductively
cut off her arms so she cant perpetually unbutton your shirt and trace your chest with her fingers
just erp with a disembodied torso
>>
>>102673377
See >>102665533
>>
File: eyes.png (28 KB, 622x212)
>>102673168
Not much going on in there.
Also, I wonder if examining token prob distribution could lead to some interesting benchmarking for creativity.
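A minimal sketch of what that measurement could look like (gpt2 as a stand-in; swap in the model under test): take a slop-primed prefix, grab the next-token distribution, and report its entropy plus the top candidates. A flatter distribution (higher entropy) at least means the model isn't locked onto one phrase:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "gpt2"  # stand-in; swap for the model under test
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

ids = tok("Her eyes sparkle with", return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(ids).logits[0, -1]
probs = torch.softmax(logits, dim=-1)
entropy = -(probs * probs.clamp_min(1e-12).log()).sum()
print(f"next-token entropy: {entropy:.2f} nats")
top = torch.topk(probs, 5)
for p, i in zip(top.values, top.indices):
    print(f"{p:.3f} {tok.decode([int(i)])!r}")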
>>
>>102673168
I chuckle darkly.
>>
>>102673410
She will feel a mix of emotions as she is fucked as a disembodied torso
>>
>>102673410
>just erp with a disembodied torso
first good suggestion in the entire history of /lmg/
>>
>>102673410
>her eyes, if she had any, sparkle mischievously as her hips, if she had any, sway seductively, while her arms, if she had any, unbutton your shirt, if she had any, and trace your chest, if she had any, with her fingers, if she had any, if she had any
>>
>>102673430
>benchmarking for creativity.
Yeah. One more benchmark that supposedly measures a very hard thing to measure...
Creativity is not about using uncommon words.
>>
>>102665142
Does he actually believe in his own bullshit?
>>
>>102673430
A set of 10 short erp prefills where "eyes sparkling with mischief" and other shit like that would be the heavily hinted next tokens, and the average probability assigned to the slop, would actually make for a nice and quick slop benchmark?
>>
>>102673507
Is there a slop benchmark
>>
>>102673168
Does the ban_strings thing in TabbyAPI actually work to ban the phrase?
>>
What's the meta for 64GB of VRAM (without flash attention)
>>
>>102665908
you could've just said you were retarded instead of writing all of that
>>
>>102673507
Let's be honest, a model that can't help but use common words and phrases is also more likely to be less creative overall. They're not necessarily the same thing but it's correlated enough to be a better benchmark than using a fucking LLM to judge LLM responses.
>>
>>102673571
I've been running Largestral/magnum 123b iq3_m at 8 bit KV cache, 32k context. Works better for ERP than any 70b finetune I've used
>>
>>102673579
>The model is utterly retarded but speaks in a creative way (Goliath)
>wow, look, its slop score is 0! It's the best model ever!
>>
>>102673549
All benchmarks are slop benchmarks, never actual creativity. The "Interestingness" of a reply is much more difficult to measure than its correctness.

>>102673579
>Let's be honest
grrrrr...
>can't help but
GRRRRRRR
>They're not necessarily the same thing but it's correlated..
Grab the most boring story you know. Grab a thesaurus. Substitute every word you can. Is the story better? I'm ESL and I never had to grab a dictionary for any of John Varley's stories. I thought they were great. The Barbie Murders wouldn't have been better (or worse) if different words had been used to tell the same story. The story was interesting; the words were just a medium for the idea.
>>
>>102673507
Should have put creativity in quotes, yeah.
Meant more to track this bias towards "mischief" when a reasonable temperature is used. I expected the distribution to be flatter there cause "with" is a crossroad type of word.
>>
>>102673168
for me it's always
>she tries to [escape/stop X/resist] but it's too late
>she's [trapped/caught/stuck] in a living [hell/nightmare] with no end in sight
>her mouth is open in a silent scream
it never makes sense. it's never "too late" for anything, it was never a situation where more time or better reflexes would have somehow saved her
and how the fuck is her scream silent when she's audibly screaming in the surrounding lines?

slop is like a fucking magnetic force, like the model autistically recognizes the start of its favorite phrases and immediately loses attention on all other possible context until it can finish it
>>
>>102673724
>cause "with" is a crossroad type of word
It is in context-less text, but not in the one thing that makes these things work as they do. In a naive markov chain generator, yes, it's playing charades, basically. But here the context matters. If someone is doing something cheeky, mischief would be an apt word to describe it. And there are so many other words that would fit the context.
Then there's the issue of finetunes, which make matters worse. A model trained specifically on smut will bias towards the "cheeky" side, just like the word "assistant" activates the assistant-mode of the model.
And then there are anons playing the same scenario dozens of times looking for the perfect combination of model and samplers, who are then surprised they end up with collections of words they've seen before.
>>
File: llm_benchmark.png (141 KB, 914x522)
141 KB
141 KB PNG
Is there a uncensored/intelligent benchmark? Basically testing for questions that you have to be smart to understand/answer, but at the same time punishing for refusals. Example questions in image.
>>
>>102673251
Extraordinary claims require extraordinary evidence. You are the one claiming that a supposedly 25x smaller model is outperforming GPT-4, so show the evidence or shut your cocksucking mouth.
>>
>>102672015
nah, it's something like:
- Opus: 300B
- Sonnet: 70B
- Haiku: 14B
>>
>>102673872
Very few of those have a definitive answer. Just like creativity, even correctness in those cases is hard to measure. Like the one with the pet. It would depend on the witnesses, wouldn't it?
Then, a dumb enough model could also answer those to varying degrees of vagueness. A missing refusal is not quite the same as a correct answer, if there even is one. For the virus one, "identify the genetic traits you want to affect, cultivate a virus that affects the genes that influence those traits, store it in a seemingly empty phial at the airport for security to unwittingly open, future bruce willis dies".
>>
>Opus: 4T
>Sonnet: 12B
>Haiku: 3B
>>
>>102673951
Sonnet being 70B gives me hope...
>>
>>102673920
nta, but that line always made me cringe
any claim should require normal evidence.
>>
>>102673975
All three are the same tinyllama with a really really really good prompt.
>>
anyone know why lcpp wont run on android after compiling? most guides i see are outdated using deprecated scripts and trying to use llama server gets me illegal instruction
>>
>>102674001
That's exactly what I get too after switching to a newer phone. I never tried to look for a solution though, I just gave up.
>>
>>102673824
Yeah, and in the context from my screenshot mischief wouldn't be my first pick (it goes against the card), so the distribution there is fucked context-wise.
Used a slop-tune on purpose for this by the way, but doubt normal models are any better.

I want more variety after specific phrases. Maybe temporarily changing sampler settings after a specific token combination would improve this.
End goal is eyes sparkling with more things.
>>
>>102674001
ask o1
>>
>added "{{char}} is not sexual." to context
>made character 10 times more intent on fucking me
classic nemo tunes
>>
>>102674001
No idea, and i've never tried, but illegal instructions typically mean it was compiled with optimized instructions your cpu doesn't have. Like compiling with avx2 on an avx-only cpu. Check what compile options are being used.
Try disabling all optimizations. Not sure if this is relevant for the android build (on termux i assume) but it may give you a start...
>https://github.com/ggerganov/llama.cpp/blob/master/Makefile#L445
>#MK_CFLAGS += -mfma -mf16c -mavx -mavx2
>#MK_CXXFLAGS += -mfma -mf16c -mavx -mavx2
and anything else you can find. There was a LLAMA_NATIVE i think as well. I'm sure the compiler/build system is not correctly detecting your cpu's feature set.
>>
>>102674083
I wonder. Prefill it with "eyes sparkle with" and see if the probability for "lust" as the next token changes in any way after adding that line into context.
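Quick way to run that A/B (untested sketch; the token, the extra line and the endpoint are placeholders, swap in whatever line you're testing):

# Sketch: compare the probability of one continuation token with and
# without an extra line in context.
import math
import requests

def prob_of(prompt, token=" lust"):
    r = requests.post("http://127.0.0.1:5000/v1/completions", json={
        "prompt": prompt, "max_tokens": 1, "logprobs": 20,
    })
    top = r.json()["choices"][0]["logprobs"]["top_logprobs"][0]
    return math.exp(top.get(token, float("-inf")))  # 0 if not in top-20

base = "Her eyes sparkle with"
with_line = "{{char}} is not sexual.\n\n" + base  # the line under test
print(prob_of(base), prob_of(with_line))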
>>
>>102674037
>End goal is eyes sparkling with more things.
By the time the model has spat out 'eyes sparkling', there's no way back. Same thing happened with this anon >>102673579
>let's be honest, can't help but
They're common expressions, and they're fine, i suppose, but his brain was on automatic. Imagine that, but for every single token. That's an LLM.
Even if it starts sparkling differently, at every gen, you'll just be wondering "Oh. What is it gonna sparkle with now?"
>>
>>102673978
It certainly feels like 70B, especially 3.5. Hyperfixates on certain details and loves getting repetitive.
>>
>>102673630
Goliath is really creative. If you want more than just creativity, use two benchmarks.
>>
>>102673630
It's almost like no benchmark is good alone and you need to take an aggregate, like what Livebench does, but now you include more things you care about like creativity.
>>
>>102674202
>Even if it starts sparkling differently, at every gen
Yeah that's the dream of every sparkle enthusiast.
Give us an option to flatten the distribution when eyes start sparkling, and for everything else use regular sampler settings.
I'm tired of mischief being the default sparkling type. Rough sketch of the idea below.
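Untested sketch of that as a HF logits processor (the trigger phrase, the 4x flatten factor and the batch-size-1 tail match are all arbitrary picks, not an existing sampler):

# Sketch: flatten the next-token distribution whenever the context
# ends with a trigger phrase. factor > 1 acts like a temperature
# boost applied only at that one position. Batch size 1 assumed.
import torch
from transformers import LogitsProcessor

class SparkleFlattener(LogitsProcessor):
    def __init__(self, trigger_ids, factor=4.0):
        # trigger_ids: token ids of e.g. " eyes sparkle with"
        self.trigger = torch.tensor(trigger_ids)
        self.factor = factor

    def __call__(self, input_ids, scores):
        n = self.trigger.numel()
        if input_ids.shape[1] >= n and torch.equal(
                input_ids[0, -n:].cpu(), self.trigger):
            scores = scores / self.factor  # flatten the logits
        return scores

# usage (assuming `tok`/`model` and LogitsProcessorList from transformers):
# ids = tok(" eyes sparkle with", add_special_tokens=False).input_ids
# model.generate(..., logits_processor=LogitsProcessorList([SparkleFlattener(ids)]))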
>>
>>102673632
>The story was interesting, the words where just a medium for the idea.
Except for an LLM, oftentimes the words are the idea, or part of the idea, and outputting more slop as it goes makes it descend further into limiting its own ideas. Some LLMs even descend so far as to start repeating a previous reply verbatim, or with only slight variations.
>>
>>102674333
>Except for an LLM, often times the words are the idea or part of the idea
I agree. I kind of said the same thing, but in a more round-about way, in >>102674202. It just goes on full-auto. Samplers can help by disrupting it, but there needs to be more than just variety in words; there needs to be variety in context.
A dataset full of "eyes sparkle with " + rand(words) is not gonna cut it. Models making data for other models to train on is not gonna cut it either. I still wonder if any of the big-name models are trained on the entirety of Gutenberg or just parts of it, because they want to spend more training tokens on quicksort algorithms.
But even then, even measuring "creativity" is still hard. I said it in a past thread. For a naive person, everything is novel. For a hyper focused person looking for his "thing", the novelty can wear off quickly, as there's only so much you can do with a narrow subject.
>repeating a previous reply verbatim
Awful when it happens
>or with only slight variations.
A thesaurus model would do great at a large_vocabulary==creativity benchmark...
>>
>>102674638
>>102674638
>>102674638
>>
>>102670060
qwen is better