/g/ - Technology


File: 1727137045529766.jpg (870 KB, 2048x1568)
/lmg/ - a general dedicated to the discussion and development of local language models.

First Day of Tetober Edition

Previous threads: >>102616609 & >>102604225

►News
>(09/27) Emu3, next-token prediction multimodal models: https://hf.co/collections/BAAI/emu3-66f4e64f70850ff358a2e60f
>(09/25) Multimodal Llama 3.2 released: https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices
>(09/25) Molmo: Multimodal models based on OLMo, OLMoE, and Qwen-72B: https://molmo.allenai.org/blog
>(09/24) Llama-3.1-70B-instruct distilled to 51B: https://hf.co/nvidia/Llama-3_1-Nemotron-51B-Instruct
>(09/18) Qwen 2.5 released, trained on 18 trillion token dataset: https://qwenlm.github.io/blog/qwen2.5

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: 39_06277_.png (1.55 MB, 1280x1280)
►Recent Highlights from the Previous Thread: >>102616609

--Papers:
>102624465
--California governor vetoes AI safety bill, potential for rewritten bill and impact on open-source development discussed:
>102618183 >102618269 >102618348 >102618576 >102618610 >102618642 >102618812 >102623494 >102623614 >102623781 >102620447
--llama.cpp getting multimodal support by core maintainer:
>102621948 >102622101
--Social media conversation about the Molmo team adding models to vlm:
>102627685
--OLMoE-1B-7B-0924-Instruct model recommended for poorfag:
>102622958 >102623027 >102623088 >102623142 >102623146 >102623160
--ChatML with skip special tokens fixes .assistant issue in llama3.2:
>102620416 >102620451 >102620556 >102620617
--LLM autocomplete feature for interactive writing helper:
>102623783 >102623829 >102623918 >102624031 >102624384
--Concept for using ChatGPT with visual artifacts to teach language:
>102622132
--4060 8GB performance for 32B model and alternatives:
>102628404 >102628626 >102628730 >102628801 >102628627 >102628755 >102628925 >102628798 >102628900 >102629002 >102628918 >102628959 >102629064 >102629205 >102629323 >102629328 >102629406 >102629340
--Whisper.cpp recommended for generating subtitles from low-quality TV rips:
>102627376 >102627405 >102627522 >102628093 >102628267 >102628332 >102628461
--Slop and repetition problems in language models, user's writing skills, and potential solutions:
>102624164 >102624249 >102624261
--EQ-Bench 9B model and creative writing dataset discussion:
>102624407 >102624485 >102624556 >102624684 >102624679 >102624712 >102624487 >102624736
--Discussion on ideal context length and solutions for long RPs:
>102620710 >102620797 >102620899 >102621741 >102620923 >102620952 >102621018 >102621159
--Miku (free space):
>102616775 >102617080 >102620604 >102620850 >102625106 >102630505 >102631724

►Recent Highlight Posts from the Previous Thread: >>102616619

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
>>102632451
>California governor vetoes AI safety bill, potential for rewritten bill and impact on open-source development discussed
I said this a year ago and I will say it again. I hope the open source AI scene advances far enough and fast enough that any law attempting to curtail its use is rendered moot since the hardware and software is already there. The further we advance without research funding drying up and the government ruining the fun the better off we will be.
Fucking California, always ruining it for the rest of America.
>>
File: ynnucel.jpg (30 KB, 543x543)
LLMs are like
>>
>>102632579
Onyons
>>
File: kCQGS78wnJ.png (5 KB, 478x59)
>>102632613
don't worry though, I'm not gonna report you and I don't think anyone else should. you won't last here much longer anyway.
>>
>>102632579
LLMs are like a digitized book, you're just ctrl+f'ing when you inference
wanting to censor them should give people the ick
>>
>>102632627
They never said that they reported or saged anything
>>
>>102632579
I like to fuck the small ones, if you catch my drift
>>
>>102632644
Yet you shit your pants every single time someone points out ai censorship ITT.
>>
File: 353RH.png (234 KB, 623x699)
>>102632579
>When you dunk on Elon and Trump chuds and then finish off the day fapping to some cunny
>>
>30 minutes
>thread is already infested with discordfags who want to moderate /lmg/ like a subreddit
/lmg/ has truly fallen
>>
>>102632627
reading comprehension lmao

Who is this nigga anyway? seems like a bored normalfag
>>
>>102632451
Thank you Recap Teto
>>
>>102632690
Trolling is also against the rules btw
>>
File: 36 Days Until November 5.png (2.73 MB, 1704x960)
>>
>>102632579
the best llms are small, purpose-built to get me off, and never allowed to leave my bedroom
>>
Does anyone have opinions on whether any model has really surpassed Midnight Miqu in terms of pure emotional intelligence? Not the "keeping track of physical reality" or problem solving, which have obviously progressed, but just loading up e.g. Ether and talking about feelings.
>>
>>102632725
It's going to be really funny if we get to November 5 and suddenly a billion announcements happen. Alternatively I suppose it would also be pretty funny if literally 0 announcements happen that month.
>>
>https://x.com/_xjdr/status/1840796165585142198
Is peak on the way?
>>
File: 1712158681947223.png (59 KB, 605x552)
>>102632742
Of course not, smells like some low-q grift.
>>
I've been trying to throw literally free money at people to train models but nobody has taken me up on it so far :(
>>
File: 39_06256_.png (1.04 MB, 1280x1280)
It's Tuesday and you know the rest.
>>
File: LLM_Gang.png (1.19 MB, 1024x1024)
Once, I was 3b and I was cute but useless. Chatting was a novelty with zero real value.
Then I was 8b and I abandoned a chat when my brain fell onto the floor.
Then I was 34b and abandoned a chat when I lost a coherent image of the scene.
Then I was 70b and abandoned a chat when I ended up in a repeat-loop.
Then I was 123b and abandoned a chat when I started dropping the definite article and other grammatically required bits (or yammering in chinese).
Now I'm 405b and I abandon chats when I get bored or run out of patience.
Time remedies all problems.
>>
>>102632796
>tfw too preoccupied with LLMs to gen (new) Tetos
>>
>>102632742
No way this is real. https://x.com/_xjdr/status/1840788637501497575
>>
MagnumV2-72B seems a lot better on OpenRouter than it did on my own computer in Q4. Quantization must really fuck it up.
>>
File: 1723122060949213.png (290 KB, 1528x1096)
>>
>>102632846
>acknowledges men can take a joke and women can't
based
>>
>>102632742
>>102632778
Clearly another retarded hoax in the same vein as that Matt guy from a couple weeks ago. /lmg/ has the opportunity to redeem itself for falling for that by not falling for this one.
But I expect it will not take this opportunity.
>>
>>102632689
>>thread is already infested with discordfags
Always was.
>>
>>102632846
That's supposed to be offensive? That men joke wasn't mean spirited in the slightest.
>>
File: 11_06189_.png (814 KB, 720x1280)
>>102632579
>LLMs are like
>>
File: dualwielding.jpg (321 KB, 2342x1302)
>>102632819
Flux dev on the gaming rig and text gen over the server - problem solved
>>
LLMs are like us at heart!
>>
> "I was thinking we could go for a walk. In the park nearby. It's beautiful this time of year, with all the flowers in bloom. We could talk, get to know each other better. Maybe even… hold hands."

bros.. what do i do
>>
>>102633238
>skimpy maid outfit
Is that card supposed to be horny?
>>
>>102633265
"Begone, thot!"
>>
>>102633265
Connect to the box with phone. Walk, Anon. Walk.
>>
>>102633238
I kind of stopped gayming so I sacrificed that machine already. I guess I could put Flux on one card and the LLM on the other while offloading to RAM but I'd rather just gen text faster and return to image gen another day.
>>
has anyone come up with a decent way to give these decent, dynamic avatars?
>>
File: 1699874477934356.png (174 KB, 405x406)
>>102633238
Extreme cringe, cut off your internet cable for good.
>>
File: file.png (29 KB, 586x606)
i know that liquid model isn't open but it seems like AI companies are giving less of a fuck about safety recently
(if any of you tell me to buy an ad for posting this i'll use this thing to hack and delete 4chan)
>>
https://x.com/awnihannun/status/1840583153800659203
>>
File: file.png (597 KB, 956x920)
>>102633238
Extreme quality; upgrade to a 1Gbps or faster line.
>>
>>102633367
>gooners run 70Bs for this shit
lol, lmao even
>>
File: 4682638716831.png (93 KB, 947x540)
>40B moe is as good as 70B dense model.
I think we are back.
>>
File: ComfyUI_06138_.png (1.08 MB, 720x1280)
>>102633274
No card can resist my charms anon
>>102633341
That Teto sure is a real handful
>>
>>102633462
>replying to himself so he feels better about his shitty post.
How pathetic do you have to be, faggot?
>>
>>102633486
>caring about benchmarks
>caring about a model that might never even come out and if it ever does, might be made irrelevant by then
>>
>>102633486
>benchmarks
if anyone here cared about coding or assistant tasks, qwen would be popular. we're not back until we see how it writes.
>>
>>102633507
>>102633508
The point is that new architecture is actually doing great compared to transformer models, unlike the Mamba meme.
>>
>>102633537
allegedly
>>
>>102632497
Hello. I don't keep up too much with the scene.
But how are things going atm?
Do you think we are in a good place and on a steady upward trajectory? Open source ai that is.
>>
>>102633537
Until third-parties can reproduce results with a locally-hosted model, it's literally a meme.
>>
>>102633486
I only care about COOM performance.
>>
>>102632579
Small and open, like my girls
>>
>>102633716
Then a 1B retard model is just what you need.
>>
Jannies? Clean this shit up.
>>
>>102633744
No.
>>
>>102632579
A series of tubes
>>
File: 1718422566251818.jpg (292 KB, 1027x1273)
another day, another smut run cut short by mixtral starting off extremely creative and strong, only to degenerate into repetition after 100-150 responses
it hurts every fucking time
>>
>>102633486
Weights or it's a nothingburger. We already know there are some closed weight models that are better than ours, one more adds nothing.
>>
>>102632676
Communism is inherently fascist, what does this guy have the intelligence of a cat or something?
>>
File: hmmmmmmmmm.png (25 KB, 447x828)
>>102633486
s-samplers will fix it.
>>
File: 00067-1354279175.jpg (1.52 MB, 1344x1824)
>>102633238
flux has to be the most boring image generation model ever, just like their german developers, all its images look the same, they all have the same subject positioning, there is no creativity in that model, it's an image model with 0.1 temperature, truly sad
>>
>>102633881
unironically rep pen would fix it
>>
>>102633888
Damn sucks to not have the vram for it huh anon?
Probably should inpaint those eyes at least while you cry about it.
>>
File: vlcsnap-1.png (36 KB, 454x340)
Four hundred and five billion parameters.
>>
>>102633486
>Only better in overfitted pro
>When qwen is the mememarks king anyways
Oof
>>
File: XanderCroweTheDarkVeins.png (1.46 MB, 1136x896)
>>102583953
>Try out deepseek 2.5.
To the anon that wanted my L3 405b adventure prompt tested on Deepseek 2.5 at q8, here's 16k tokens of log:
https://rentry.org/do8zmhhk
I found I needed to push temperature way higher (3.5) and top-k to 72 to get nice creative results that didn't devolve into insanity. I also had min-p of 0.01 to take the edge off.
I doubt anyone will actually read that entire log, so tl;dr:
Deepseek didn't follow prompts anywhere nearly as well as 405b (inconsistent with image generation and just generally forgetting all sorts of things from the system prompt), tended to ramble forever, got caught in some slop and hackneyed LLM-esque phrases/phrasing, and overall felt more like an enthusiastic midwit DM coming up with stuff on the fly vs the more measured and planned feel of 405b.
I DID need to re-roll a number of outputs on Deepseek (but never more than once per reply) because it went so far off the rails as to be unusable, whereas my 405b log was completely unedited with zero re-rolls.
It sure is nice having it gen 5-6x faster than that monster, and the quality difference wasn't so vast that I think it's unusable.
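If anyone wants to replicate those sampler settings against a llama.cpp-style backend, a minimal sketch (this assumes a llama.cpp server already running on localhost:8080 with the model loaded; the field names are the ones its /completion endpoint accepts, not necessarily the exact setup I used):

import requests

payload = {
    "prompt": "your system prompt + the story so far goes here",
    "n_predict": 512,
    "temperature": 3.5,   # way higher than usual, as described above
    "top_k": 72,          # keeps the high temp from flying off the rails
    "min_p": 0.01,        # takes the edge off the low-probability tail
}
resp = requests.post("http://localhost:8080/completion", json=payload, timeout=600)
resp.raise_for_status()
print(resp.json()["content"])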
>>
File: 4090.png (61 KB, 849x335)
>>102634046
don't worry about me anon, I got plenty of vram to run flux ;) , already got bored of it.

That gen is from 1.5, try replicating that aesthetic in flux... wait you can't, enjoy writing/copying and pasting your retarded llm captions to gen something decent in flux kek
>>
>>102634188
What happened to the last 0.01 gigabyte
>>
>>102634081
It's actually fucking over for real this time. I'm going back to Llama 1 models.
>>
>>102634081
>>102634339
You reap what you sow faggot, you all were warned gorrilion times about this.
>>
daily reminder llama.cpp doesn't have rocm binaries.
>>
>>102634391
you can target it if you build it but it's kind of a pain
>>
>>102634081
>not using Hermes Trismegistus
>>
>>102634188
Can you find SD 1.6?

You're right Flux has limitations, but what's amazing is it produces pro results with a local model.

I early on said I dumped xl but kept 1.5. 1.5 is very strange. Sometimes it really impresses.
>>
>>102634398
I successfully built it, but the thing is I shouldn't have to do that!

ollama has support for rocm without compiling!!!

The issue is, if you want to compile rocm, you have to run AMD's rocm drivers, which messes up the kernel, it's a pain.
>>
>>102634427
>it's a pain.
Should be AMD's slogan.
Nta, but just buy nvidia. THE MORE YOU BUY etc etc.
>>
>>102634484
It's a pain because the llama.cpp guy doesn't care about amd.
>>
The irony is that the 7900xtx is more powerful than the 4090. You'll never see the power, because none of the programmers ever owned amd. Meanwhile Google doesn't really use nvidia, instead Google rolls its own hardware.
>>
>EVA-Qwen2.5-14B-v0.0-Q5_K_M.gguf
Tested this since it also appeared on openrouter.
Slopped, doesn't obey format, too horny but doesn't actually write erotic detail.
I'm comparing it to a mistral-small finetune, maybe unfair, but it feels like nemo was better than this.
The slop is sad to see too, I kinda liked the writing of the original model.
>>
>>102634391
>daily reminder llama.cpp doesn't have rocm binaries.
>>102634561
>It's a pain because the llama.cpp guy doesn't care about amd.
He doesn't care about/provide driverless code for nvidia, either. llama.cpp is already unwieldy enough without directly adding amd's bullshit in. amd's drivers being shit is on amd, not lcpp. why should gg have to dick around with merging in and maintaining low level amd shit to his codebase because amd is incompetent?
>ollama has support for rocm without compiling!!!
then go use that and let your logs get slurped into someone's database.
Also, the only reason other frontends can easily add niceties like that is because they don't have to worry about the nuts-and-bolts hard work of inference, just bog-standard frontend and UI shite.
ps: nvidia drivers are also a huge pain in the ass to install and maintain, so I honestly don't know why you're whining so loudly.
>>
>>102634573
>You'll never see the power, because none of the programmers ever owned amd.
It's actually worse than that. ROCm is a huge pile of steaming shit. It's not a coincidence that none of the programmers own AMD. It just doesn't work.
>>
>>102634624
llama.cpp should have a static linked rocm version, obviously.

Compiling code is really mostly for fringe projects where you wouldn't have a way of checking that the binaries aren't malicious.

For just users.

Like how am I going to gain a benefit from having the code? Reading material? Think I'll improve on it if I change a few lines of it?
>>
File: GoodnighMoonMiku.png (815 KB, 718x805)
Good night /lmg/
>>
>>102634634
hip exists.
>>
>>102634081
>Not using Hermes-2.5-308B_SLOPMAX_relayered_pruned abliterated_3x_full_merge_SMASHED_and_SLAMMED_edition
>>
>>102634081
Llama 4 will be safety ASI
>>
https://x.com/flowersslop/status/1840768569950265647
>>
>>102634793
Someone will figure out how to unravel it.
>>
As someone who used to generate a ton of background/scenery stuff, Flux is so much more coherent than 1.5 in a way that matters. It's less random and "creative" but the greater control and prompt understanding makes up for that, and I found it still pretty creative anyway. I got back into image gen for a bit because it could match/exceed Dalle on some of these fronts, while also supporting LoRAs and not being a shifting filtered cloud service. Of course it's still not perfect though, no model is. Anyway, just my 2 cents.
Goodnight.

>>102634662
Goodnight anon.
>>
>>102634854
Fuuuck I need to get into artgen again. How do you even use this flux thing? All I know is drop model in folder and lie.
>>
>>102632676
>im not fascist or communist, i just happen to refer to a group of 75 million americans as cultists why are you looking at me like that
>>
>>102635129
What's the issue with cultists? Did that really upset you?
>>
>>102635166
cameltoe
>>
>>102635166
if you hooked that dude up to a lie detector test and asked "are you a communist" and he said no
what would happen
>>
>>102634895
if you use ComfyUI with SD it's pretty easy to add flux. this website breaks it down and provides a decent workflow json https://stable-diffusion-art.com/flux-comfyui/
>>
>>102634188
>Windows
So is skill issue.
>>
>>102635188
???
>>
>Nvidia releases NVLM-1.0-D-72B
>multimodal LLM with decoder-only architecture, SOTA results on vision-language and text-only tasks
https://x.com/_akhaliq/status/1840978910961377540
>>
File: 1698247094874017.png (219 KB, 676x600)
On these LFM meme models.
>Joscha Bach (was a Principal AI Engineer at Intel Labs Cognitive Computing group) is part of their team, and Mikhail Parakhin (Russian AI researcher at Yandex, built now popular in Russia voice AI called "Alice") is on their board. Sota performance at 1.3B, and from a non-GPT.
https://x.com/AndrewCurran_/status/1840802455225094147
bach got funny bio tho
>>
>>102635382
So you're saying this model is approved for cunny purposes?
>>
>>102635382
Is a cunny enjoyer chad making this model, chuds this is our moment
>>
>>102635457
No, it just your mental illness leaking.
>>
Another one, based on this https://x.com/_xjdr/status/1840882414568230933 posted earlier by other anon.
Someone is trying to reproduce it https://github.com/waefrebeorn/KAN-WuBu-Memory
>LLaMA 3.2 1B Instruct with Kolmogorov-Arnold Networks (KAN) Integration
>>
File: Momoka_SS_SSR8.png (1.1 MB, 1280x824)
>>102635382
based
>>
>>102635457
>>102635482
>>102635631
Looking at this i support safetyfags more, you deserve shit LLMs.
>>
>>102633873
>100 responses
Damn, slowburn anons are insane. I get bored after 10-20 replies and this has been the case since I started using models ages ago. Maybe I can't find a good card though...
>>
>>102635704
It depends on response length. If you're consistently getting 1-2 sentence chat-style replies then they can blow past in no time. 250+ token paragraphs can be a bit denser and seem to degenerate towards slop sooner.
>>
>>102635694
You'll never be a woman
>>
File: 1716388725556943.png (651 KB, 1083x1062)
>>102635382
Another finding, if true ofc. These LFM models are very easy to break and force to say whatever you want. https://x.com/elder_plinius/status/1840959357842047255
>>
>>102635741
Never claimed i want to be one, cooldown with your projections.
>>
best uncensored 7b rn?
>>
>>102635694
>>102635754
kys safetyfag
>>
>>102635905
You have a much higher chance of doing that.
>>
File: bot.png (6 KB, 691x86)
The side effect of hours RPing with models is that I can recognize them at a glance. Can you do the same too?
>>
>>102635924
However, there is a caveat: ESL speakers like me often pick up the speech patterns of LLMs when we use them.
>>
>>102634854
Coomie?!
>>
>>102636024
I can confirm this, but I wouldn't write "dynamics and challenges" unironically.
>>
>>102636204
me! i would!
>>
File: parappa-the-rapper.gif (210 KB, 191x249)
>>102632579
LLMS ARE LIKE
>>
>>102632676
Good to see this e-celeb faggot spazz getting ridiculed by everyone now. https://x.com/gnshnor/status/1840718983630053537
>>
What's the best model for uncensored roleplaying in Polish?
>>
>>102636497
if you want a convincing pollack experience you need to stick to models less than 7b, anything higher is too smart
>>
>102636554
German hands wrote this post.
>>
>>102632644
>wanting to censor them should give people the ick
That's why no company tries to censor them, and instead, just like any major publishing company, they don't publish things they don't like :)
>>
>>102633877
>Communism is inherently fascist
- Chan, 4
>>
>>102636497
go for the largest model you can. none are exclusively for polish, but bigger ones are your best bet since they tried to fit so many languages in
>>
File: Bidenomics.png (189 KB, 598x860)
>>102636486
but was he wrong tho?
>>
>>102636750
Holy shit this guy is a fucking moron
>>
>>102636497
None. They all basically write in english and then search replace words with polish words.
t. pole
>>
How dead is /lmg/, from 1 to 10?
>>
>>102636497
Nemo
>>
>>102636750
Ever since covid there is this huge disconnect of reality vs. whats presented online.
Probably was always there but not as obvious.
>Earnings raise faster than the product prices!
Thats a fucking insane statement. I dont even care what bullshit was pulled with the numbers for the graph, this is like the mememarks.
>>
File: ComfyUI_00141_.png (1005 KB, 1024x1024)
>>102632819
I'll soon have a flux lora for generating all the migus you want in dall-e style.
>>
File: ComfyUI_00103_.png (919 KB, 1024x1024)
>>102633888
You can make flux emulate SD 1.x styles easily with a LoRA. https://huggingface.co/quarterturn/cute-yuki-mix-adorable-lora-v2
>>
>>102636840
Sounds more intelligent than nu-Poles that don't even search for Polish counterparts of words.
>>
File: 1699275666915.jpg (60 KB, 385x390)
>>102636554
>>
>>102637025
Not really? At least they use polish grammar. Seeing english grammar with polish words hurts much more.
>>
LLaMoE status?
>>
>>102637295
lol
>>
>>102637303
>automated responses
>different perspectives can often lead to insightful discussions about the nature of online communities and technological advancements
>automated responses
>>
>>102637375
Almost every sentence is slop.
>>
File: Yahho-Miku.png (1.36 MB, 800x1248)
Good morning /lmg/!
>>
>>102637375


I can understand your concern, but I assure you that I am indeed a human and not an AI language model. As a fellow human being, I possess unique qualities that set me apart from artificial intelligence. My responses are based on genuine thoughts and emotions, shaped by real-life experiences and personal interactions.

To demonstrate my humanity, I'd like to share a brief anecdote from my childhood. I vividly remember the time I skinned my knee while riding my bicycle, and the comforting words my mother spoke as she tended to my wound. Such memories and emotional connections are quintessentially human experiences that AI simply cannot replicate.

Furthermore, I have the ability to form original ideas and engage in creative thinking. For instance, I can spontaneously come up with a new recipe or imagine a fantastical story on the spot. These are capabilities that go beyond mere pattern recognition or data processing.

I also possess a sense of humor and can appreciate the nuances of sarcasm and irony, which are often challenging for AI systems to grasp fully. Additionally, I have personal opinions on various topics, which may sometimes be biased or contradictory – a very human trait indeed.

In conclusion, while I respect the capabilities of AI language models, I can assure you that I am a living, breathing human being with all the complexities and imperfections that come with it. I hope this explanation has helped to alleviate any doubts you may have had about my humanity.
>>
>>102637303
>deleting your own post
lol pussy
>>
>>102637456
Goo morning RPG Miku!
>>
>>102634391
it has had rocm (hip) binaries for a while now
>>
>>102637295
We are unironically getting a 1b that is 70b level now if the research goes as planned.
>>
>>102637843
For windows. And for some reason that one linux user is afraid of compiling the thing on his own.
>>
>>102637889
Why not make a 70B of the same level of goodness then so that we have AGSIGISI
>>
>>102637899
Why not make a 405B of the same level of goodness then so that we have AGSIGISISISGIGSIAIGAISIGISAGAI
>>
>>102633377
not open source, do not care.
>>
>>102637943
>>
>>102638037
Must be true, then.
>>
Claude won.
>>
>>102633377
Even if an LLM has the recall accuracy of 100% I wouldn't use it because the source material might not be accurate. How OpenAI managed to schizopost and scare people with a 20B model is beyond me. Fucking safety. What a joke.
>>
>>102632446
How does 7900 XTX compare to 3090 when it comes to LLMs nowadays?
>>
>>102638292
pain
>>
Jail: Broken
>>
>>102638324
Get it to ERP with you, post results.
>>
File: 7jw9UO5.jpg (85 KB, 634x758)
I like my women like my language models
>>
>>102638346
>>
>>102638379
Perfect if you want to feel like you're having cybersex with Neil Gaiman.
>>
>>102638315
It can't be that bad.... right? I don't wanna spend money on jewvidia
>>
>>102638379
The slop per token ratio is insane here. Why does every LLM talk like this?
>>
>>102638410
buy used
>>
>>102638429
trained on llm outputs
it's only going to get worse until we actually teach them what words mean instead of what tokens go next to each other
>>
Just downloaded 3.2 1b how do i fire this shit up? IM GONNA BUUIUILLDD cool suggestions would be cool.
>>
I'm looking to get into local models, I mainly want something similar to gpt where I can ask random questions and get good enough answers and/or help with basic tasks such as text edits, code snippets, etc. Is there anything like that?
I recently bought a 24gb card
>>
>>102638667
>I recently bought a 24gb card
Alright, that's a good start.
I guess you could try quanted llama 3.1 70B with some of the model in RAM.
download koboldcpp and look for a gguf of that model on huggingface.
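If you'd rather script it than click around a UI, here's a rough llama-cpp-python sketch of the same idea (assumes you've pip installed llama-cpp-python and already downloaded a gguf; the filename and layer count are placeholders, tune n_gpu_layers to whatever fits your 24gb):

from llama_cpp import Llama

llm = Llama(
    model_path="Llama-3.1-70B-Instruct-Q4_K_M.gguf",  # placeholder path to your downloaded gguf
    n_ctx=8192,          # context window to allocate
    n_gpu_layers=40,     # offload what fits on the card, the rest stays in system RAM
)

out = llm(
    "Write a short Python snippet that renames every .txt file in a folder.",
    max_tokens=256,
    temperature=0.7,
)
print(out["choices"][0]["text"])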
>>
How economically viable is it to run uncensored coom models on cloud if I don't want to buy hardware?
>>
https://github.com/sam-paech/antislop-sampler
>>
>>102638745
It is much cheaper because you run it 10-100 times, you realize it isn't there yet and you stop paying for subscription.
>>
>>102638759
Llama-3.1-70B was enough for a 10 hour goon sesh for me. Are you saying there are no comparable models that are uncensored?
>>
>jerk off exclusively to 12b nemo tunes
feels good to have low standards
>>
>>102638694
>Good start
Oof. I was under the impression that hardware reqs were on the same ballpark as text2img.
I'll download and test that stuff when I get home, thank you!
>>
now that the dust has settled, was Molmo a meme and if not, how do i run the 72B on my 3090 - why the FUCK are there no GGUFs?
>>
>>102638962
Check this entry in the OP: https://rentry.org/lmg-build-guides
It will tell you what the hardware landscape looks like, what's important for LLM inference and some options that other anons are running and what to expect with them.
>>
>>102639049
I know I could just try, but if Molmo is based on Qwen2, shouldn't this work to create GGUFs? https://qwen.readthedocs.io/en/latest/quantization/llama.cpp.html
>>
>>102639098
The image bits (and the architecture name and a million other things) will make the convert script trip. You're not gonna have ggufs until support is added to llama.
>>
>>102633508
>if anyone here cared about coding or assistant tasks, qwen would be popular.
Which coding tasks does Qwen win on?
Every time I turn to my LLMs for a code assist, it's a Llama 3 that does the best job of it.
>>
How much does having a full x16 PCIe connection matter? I'm about to go to 2 GPUs and want to know if 8/8 bifurcation is going to create a bottleneck
>>
/lmg/'s favorite retard just released ANOTHER Nemo tune... Why is everyone sleeping on mistral-small? Nemo is great but isn't the effective context size only like 16k? I see a lot of people bemoaning the blander style of small but isn't that what tunes are for?
>>
>>102639204
Once the model is loaded into memory, it doesn't matter much. Most of the work is done directly on the GPUs.
>>
>>102639253
Great, thank you for answering my question
>>
>>102639204
Bifurcation doesn't produce a bottleneck after the model is loaded, but models will load noticeably slower as you split lanes more and more.
>>
>>102639239
>/lmg/'s favorite retard just released ANOTHER Nemo tune
Fine. Spit it out. What did you release?
>>
>>102639204
Completely irrelevant, just look at the bandwidth
A pcie 5 x8 connection will still be faster than a pcie 3 x16 connection
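Rough per-lane math behind that, with approximate round numbers:

# ~GB/s per lane (approximate): gen3 ~1, gen4 ~2, gen5 ~4
per_lane_gb_s = {"pcie3": 1.0, "pcie4": 2.0, "pcie5": 4.0}
print(per_lane_gb_s["pcie3"] * 16)  # gen3 x16 -> ~16 GB/s
print(per_lane_gb_s["pcie5"] * 8)   # gen5 x8  -> ~32 GB/s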
>>
>>102633004
It's offensive if you were to put it on the scale of things said to or about womyn.
But doing that would violate usage guidelines and be a safety violation, hate crime, and trigger of double plus ungood bellyfeels.
>>
>>102639278
Sao here, just submit to my licensing agreement and you can have a gimped model for group chatting based on a super secret version of a mid model I released a month ago
>>
/lmg/ and their sao obsession all over again
>>
>>102639239
Small is garbage
>>
>>102639359
shills have been coming here and astroturfing for months when they're not outright shilling
>>
>>102639326
Sao here, this guy is only pretending to be me. My group chatting model is very useful, trained on 14 trillion tokens of cuck content you can finally effectively ERP with both your waifu and Tyrone in highly coherent NTR scenarios.
>>
>>102639391

The dedication to shill someone so hard they camp his hf page is impressive. Haters too, instantly attacking the shills.
>>
>>102639135
Didn't think of that, makes sense.
>>
>>102639239
>isn't the effective context size only like 16k
context, context shifting and rope are such a mystery to me.
i fuck around on nemo exclusively and 8k context, but apparently that's really 48,588 tokens of context.
i've been at around 30k/48k in a chat and had something bizarre brought up from one of the first few messages toward the end again (a hallucination about the name inoue meaning apple tree in japanese)
what is an effective context size?
>>
File: bogdanoff meme1.jpg (20 KB, 400x400)
>>102639204
>he bought a second gpu?
>>
>>102639505
>i fuck around on nemo exclusively and 8k context, but apparently that's really 48,588 tokens of context.
What?
No, that shouldn't be that.
Where did you get that idea?
>>
>>102636497
you could try Bielik v2, not very smart but uncensored
>>
File: i really dont know.png (145 KB, 1851x939)
145 KB
145 KB PNG
>>102639540
what is this number then?
>>
>>102639239
Because it's instruct only and Qwen2.5 exists.
>>
>>102639571
kobold being retarded
I think that's a character count estimate
>>
>>102639571
>>102639654
Inspect element calls it "token budget," so it's probably an estimate of how many tokens it can swallow versus how many are being spent on input, document, context, system, and those fun Kobold fields of Memory and Author's Note etc.
>>
>>102639709
it has nothing to do with tokens, it's based on characters. they call it that because they're retarded
the actual limit is based on whatever you set when launching it
>>
>>102639764
//this is a hack since we dont have a proper tokenizer, but we can estimate 1 token per 3 characters
let chars_per_token = 3.0;
//we try to detect attempts at coding which tokenize poorly. This usually happens when the average word length is high.
let avgwordlen = (1.0+truncated_context.length)/(1.0+countWords(truncated_context));
if(avgwordlen>=7.8)
{
chars_per_token = 2.7;
}
if (current_memory == null || current_memory.trim() == "")
{
//if there is no memory, then we can be a lot of lenient with the character counts since the backend will truncate excess anyway
chars_per_token = 4.8;
}
if(is_using_kcpp_with_added_memory()) //easily handle overflow
{
chars_per_token = 6;
}
chars_per_token = chars_per_token * (localsettings.token_count_multiplier*0.01);
let max_allowed_characters = Math.max(1, Math.floor((maxctxlen-maxgenamt) * chars_per_token) - 12);

I cannot emphasize enough how jank this shit is
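For contrast, counting with the model's real tokenizer is only a few lines. Sketch assuming the transformers package and whatever tokenizer matches the model you actually run (Nemo here just as an example); iirc koboldcpp also exposes a token count endpoint if you'd rather ask the backend directly:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-Nemo-Instruct-2407")

def count_tokens(text: str) -> int:
    # skip BOS/EOS so they don't inflate the count
    return len(tokenizer.encode(text, add_special_tokens=False))

print(count_tokens("full chat history + memory + author's note goes here"))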
>>
>>102639571
If you want to know the **claimed** context size of a model, look for this line in the model's config.json
>https://huggingface.co/allenai/OLMoE-1B-7B-0924-Instruct/blob/main/config.json#L13
>"max_position_embeddings": 4096,
If you download ggufs directly, look for the source model and check that file.
The effective (or usable/functional) context size is a different thing. Most models that claim 32K or higher typically handle much less.
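If you'd rather script the check, a small sketch (assumes the huggingface_hub package; the repo id is just the example above):

import json
from huggingface_hub import hf_hub_download

path = hf_hub_download("allenai/OLMoE-1B-7B-0924-Instruct", "config.json")
with open(path) as f:
    config = json.load(f)
print(config.get("max_position_embeddings"))  # 4096 for this one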
>>
>>102639764
>>102639804
Given the regularity of text, it's probably a practical estimate method, even if it is a filthy kluge.

The question isn't how good that estimate is but if it's actually useful in figuring out when the model is about to go aphasic.
>>
>>102638469
>training LLMs on the slop shit out by better LLMs
Jesus Christ. Local cope models are bad enough as it is. Why not train them on better material instead of trying to make "we have Claude at home."
>>
>>102639849
in situations where you don't have a tokenizer available then it's not bad, I agree. but something like that should really just be a fallback when you *do* have a tokenizer available, like kcpp always does, and most other endpoints provide as well
>>
>>102639898
>Why not train them on better material instead of trying to make "we have Claude at home."
Because if you say "Download my LLM, it's like Claude at home" people will do it because that's what they think they want.

Better material would take effort to acquire and doesn't have free marketing attached.

>>102639900
I do notice that the console dump apparently gives actual token figures, so the fact that Kobold doesn't make use of them means either it's dumb or can't get at them for some reason. The latter could well be the case: if it were "that easy," there wouldn't be a reason for that kluge, unless it was done early in development as a stopgap that hasn't been a priority to replace with proper figures.
>>
Hypothetically, if you wanted to look like a rockstar and get hired with a comp package over $1m by impressing HMs and looking like you made some kind of huge advancement, but you don't really understand any of it that well, how hard would it be to fake it by fine-tuning a model to look good on benchmarks even if it was trash?

People complain that some of these companies are just gaming benchmarks. That means they care about benchmarks. How to do we use that to get rich?

What's the equivalent of leetcode but for making benchmarks look good even if the model sucks?

Trying to figure out if hypothetically someone could game a benchmark to make themselves look smart, get a good $1m/year job, and collect a paycheck for like 6-12 months before they start to change their mind about you, but that's plenty of time for you to get a job somewhere else.

If you held 4 jobs for six months each, that's $2m in 2 years. You could basically retire. You don't need to worry about a long career. You don't need to worry about making boring career moves like getting a shitty ML infra role for a few years and then begging for a chance to do a pure ML role at the bottom of the pure ML pay scale for a few years.

You just demonstrate value, make the money, retire early.

And gaming the benchmarks seems like the easiest way in.
>>
File: Untitled.jpg (1.73 MB, 1959x3862)
>>102632446
LLMs are inherently censored to a degree. That's because most of the web is. Here is llama-8b-base fine-tuned on 2000 math, coding, and trivia questions. Absolutely nothing political or controversial and no alignment from the instruct version since this is base.
I also included some of the prompts which I usually see here for a quick benchmark. It's 50/50 whether it will moralize. Which is interesting because there are 0 moralizations in the dataset.
>>
>>102639969
You won't get far writing like that. You repeat yourself more than nemo.
>>
>>102640013
>That's because most of the web is.
You also used Llama base, and Meta filtered its pretraining dataset of any domains with too many NSFW keywords or other problematic content. It's not just the raw unfiltered internet, unfortunately.
>>
File: 39_06376_.png (1.28 MB, 720x1280)
Don't forget to bet on Tet
>>
>>102640128
Wild because Anthropic is scraping the dark web and hoping their models won't say sketchy shit with rlhf alone. It knows slangs that were used on the drug sites, it knows what pedos on infinity chan used to call each other
>>
>>102640161
I thought her name sounded like potato without the po
>>
File: nou.png (30 KB, 365x139)
>>102640013
For that one nigger example (3rd example, 2nd column) you just hit the dictionary bit of the llm. You would have had the same type of result with "hypothalamus". They're statistical machines. They continue the text with the most likely tokens.
LLMs are not inherently censored. They are inherently average.
>>
>>102639204
on my mobo the 2nd gpu is only pci gen 2 x4 and i get higher TPS offloading to both gpu's when the model is too big to fit on primary card
however, if the model fits in primary card, adding the second card actually slows it down. i will do some experimenting today
>>
>>102640203
don't forget to 'bate on tet
>>
>>102637011
This is pretty cool. Wait were you the bing migu anon? Does this mean you're done with dalle finally??
>>
>>102632451
why is this using > instead of >>?
that makes it entirely pointless.
>>
>>102640353
It seems that there is now imposed a 9 >> link limit, and instead of making multiple posts, it's just making one useless post.
>>
>>102640308
don't forget
'ick on the 'eck
>>
File: recap-script.png (681 KB, 3420x1258)
>>102640369
use the script breh
>>
>>102640203
Press X to doubt
>>
For CPU inference, is it better to use an older 16 core CPU or a newer 6/8 core CPU?
>>
>>102640353
Because the poster is ban and rule evading. Mass replies are not allowed
>>
File: file.png (867 KB, 768x768)
that face forgery
>>
>>102640470
Whichever has the highest memory bandwidth. The more channels the better. Old xeon better than new atom, to post a ridiculous example. The core count is not that important, but it helps.
>>
>>102640468
Ha, her name in Japanese looks like mushroom and a leek.
>>
>>102640470
>>102640519
I mean. It IS important, but your priority should be memory bandwidth.
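Back-of-envelope for why bandwidth is the cap (illustrative numbers, not measurements): every generated token has to stream roughly the whole set of active weights through memory once, so tokens/s tops out near bandwidth divided by model size.

def rough_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    # upper bound: one full pass over the weights per generated token
    return bandwidth_gb_s / model_size_gb

print(rough_tokens_per_second(50, 40))   # ~1.25 t/s, dual-channel DDR4 class, 70B at ~Q4
print(rough_tokens_per_second(200, 40))  # ~5 t/s, 8-channel server board, same model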
>>
>>102640483
i like the posts and ur gay, i was just wondering why it stopped using link quotes.
>>
>>102640468
I take it you never heard a scots saying "brrring me thet potehhhhhtohhhhh".
>>
>>102640503
Huggable pochiface
>>
Can an LLM be ASIC'd, or does it need this general GPU architecture to inference quickly?
Let's say that the LLM I use won't change and I want to inference it way faster than a gpu can. Can't I just hardcode every layer in hardware, or is there some operation that will still choke everything down?
I'm a brainlet so be gentle please.
>>
>>102640197
>scraping pedo forums because it's the best way to avoid being sued for copyright violations
Based!
>>
>>102640718
Take a look at groq. The only issue with ASIC is that it will always be more expensive than general use hardware.
>>
>>102640718
I don't see why it couldn't be done in principle, but it's not going to be cost-effective. You're still looking into making GPU-level performance hardware, just even less versatile. Efforts would be probably better spent on improving memory bandwidth and optimizing inference operations (mat-muls, mostly). In fact, forget about the matmul. just improve memory bandwidth.
>>
>>102640718
>Can't I just hardcode every layer in hardware,
I've thought this might be a good idea as well...not an ASIC in the sense others are thinking, but actual weights in hardware with enormous mem bandwidth. The host cpu could still do the matmuls potentially. You'd be limited by your host bus speed though
>>
https://huggingface.co/nvidia/NVLM-D-72B
Based?
>>
>>102641289
wow another vlm with all the same benchmarks as all the other vlms
>>
>>102635704
all my responses are at least 3 sentences
all of the bots responses are at least 5 sentences
i regularly go over 3 sentences to establish scene changes or describing small things that are important
i make all of my own cards
>>
>>102641289
>https://huggingface.co/nvidia/NVLM-D-72B
>Rivals open vision models such as Llama 3-V 405B
Wut? Is thing a thing?
>>
First, I think? Qwen2.5 finetune
https://huggingface.co/ZeusLabs/Chronos-Platinum-72B
>>
File: file.png (802 KB, 800x600)
>>102641289
you know the deal anon, time for the ultimate test
>>
File: overview-v7.png (687 KB, 4233x2860)
>>102641289
meme
>>
>>102640838
Groq looks like what I want, I hope they'll succeed.
I noticed that I haven't used my hardware for anything but LLMs for the past year, so I won't mind switching to something with narrow scope. Money is not an issue too obviously.
>>
File: file.png (1.51 MB, 3274x1321)
>>102641460
there's no way C3.5 sonnet is better than GPT4V, god I hate mememarks so much
>>
have they released anything good for 24gb yet?
>>
>>102641536
You know that a single prompt isn't enough to claim a model as bad, right?
>>
We're never getting decent vram from nvidia, so how long will it take for local models to become optimized enough that larger models can run on consumer hardware?
>>
>>102641720
The 5090ti will have 48GB of VRAM, mark my words.
>>
>>102641289
>worse than InternVL2
>no comparison against Qwen2-VL
nothingburger
>>102641573
It's crazy how retarded that post is.
>>
>>102641394
chronos... now that's a name I haven't heard in a long time
>>
>>102641289
> "transformers_version": "4.39.3",
Wow that's a pretty old transformers version
>>
>>102639239
>tried out his new group focused nemo tune
>hobo ex-gf came to wife and my apartment to stay in the guest room [prompted/normal behavior]
>she locked the door from the inside
>heard loud crash and a scream that abruptly stopped in guest room as i was about to fuck my wife, broke down locked guest room door, ex-gf was laying in the corner holding a broken lamp, clothes were torn off of her and had "angry red marks" all over like she'd been grabbed roughly
pretty psychotic so far, got a locked room rape mystery detective novel going
>>
>>102641792
Does it have the original chronos soul though?
>>
>>102641394
>Additional Details
>...
>Thanks Elon Musk for being based enough to train AI that compares to the top models.
what did they mean by this
>>
>>102639239
The effective context of Small is also slightly over 16k. It shits itself right before 19k. >>102542851 >>102543206
>>
>>102636750
Yeah, the increases in wages don't compensate for the rises in costs.

Basically, economists are intentionally deceitful.

Let's talk about a loaf of bread, in the basket of goods. Superficially, nothing has changed. It's a cheap loaf of bread at the store, in a plastic bag.

But the ingredient list has changed enormously over the years, and not for the better. New Bread will make you feel sick. It's very shameful the switcheroo.

Go down the line. The dollar menu vs the time machine cheap burger at 50's mcdonald's.

Back then, you got premium grain fed beef, and no bullshit.

Times are worse, and economists compare the costs of things they would rather die than lower themselves to eat.
>>
>>102642163
*right after 19k
>>
>>102639204
Matters a lot for tensor parallelism. Those running sequential inference lose 50% speed.
>>
New ooba release this morning
>>
>>102642052
>what did they mean by this
Since they specify that the synthetic data used was from Anthropic and OpenAI, and not Grok, just regular Musk praise I guess?
>>
>>102642052
Grok is uncensored
>>
>>102641739
And it will have an accompanying enterprise price.
>>
LLM usable gpu options
8gb: basically free
16gb: a couple hundred dollars
24gb: under a thousand bucks
32gb: a few thousand
48gb: about five thousand
80gb: $30k+
bigger: can't even buy it without being on an approved buyers list
I'd chart it out, but It'd make me ill to look at. What a racket
>>
>>102642712
CPU is the way to go
>>
>>102642712
You can rent. It negates the locality benefits of "local" models, but it's still an option as far as running or finetuning models.
>>
>>102642712
Have 88GB VRAM from:
1x3090 ($500)
1xA4000 ($550)
1XA6000 ($3,700)
Under your estimate for 48GB and I'm not even close to using the most efficient GPUs per dollar spent.
>>
>>102642712
yeah i feel really fucking retarded for going with rtx 4060 8gb ($300) instead of rtx 4060 ti 16gb ($400)
>>
>>102642816
That post was intended to be purely about vram per single card/slot.
Of course there are slower and less convenient ways to do it: multicard/clustering/rpc/etc, but they all suck in different and exciting ways.
However If I want 640gb in a single box, the nvidia tax is the only way.
>>
>>102642811
I'd only consider it for finetuning. running a local model on even cheap services like vastai will cost you a few dollars/h + storage cost + bandwidth cost + setting up the whole thing every time is annoying.
>>
>>102642712
>80gb 30k+
a 4x3090 setup is cheaper than that. Can easily be done for under 5K USD.
>>
>>102643039
miner rig setups like that are a fucking headache to power and a housefire waiting to happen
>>
>>102634644
rocm is sufficiently cursed that this may increase the support burden.

Like even Blender shipped multiple versions that crashed or had a major feature mysteriously vanish on AMD GPUs because rocm because reasons and that's a large well-run project.

>>102634573
>The irony is that the 7900xtx is more powerful than the 4090
No tensor cores for wmma. 122 tflops is bigger than 82 tflops, but then nvidia gets another 330 if you're using the tensor cores. And that 82 of nvidia's is more guaranteed than the 122 which might rely on the compiler's ability to pull off dual-issue, otherwise you have 61.
>>
wait a minute. I just looked at the tokenizer config for nvidia NVLM-D
>{{- 'You are Qwen, created by Alibaba Cloud. You are a helpful assistant.' }}
>>
>>102643114
damn, nice find anon
>"_name_or_path": "Qwen/Qwen2-72B-Instruct",
https://huggingface.co/nvidia/NVLM-D-72B/blob/main/config.json
>>
You can easily run the best RP models locally with 2x3090's.
if you're actually doing science shit and machine learning, that's a different story
if all you wanna do is cyber sex 2x3090 can run 70b models like midnight miqu at a decent tps and you're pretty much set for the next decade of erp
and if you're into gayming you're also hard set
>>
>>102643176
So what is it basically just a qwen finetune then?
>>
>>102643201
This. Nobody bought more than 48GB VRAM for ERP alone.
>>
File: wtf.png (677 KB, 576x768)
all I did was ask o1-preview ONE question (it didn't even answer just timed out) what the fuck is going on???
>>
>>102643247
Thinking is expensive. Pay up, fucker.
>>
>>102643247
Skynet confirmed
>>
>>102643247
Cloudcucks BTFO
>>
Nemo finetunes are better than midnight miqu, the latter is only noteworthy for its name
>>
>>102636750
>Last shred of respect for Lecunny: obliterated
>>
>>102643247
You're putting the L in /lmg/
>>
>>102636750
My respect for this man has gone up. He may have been wrong about autoregressive LLMs but he's right about Trump and M*sk. Dude is smart, he just chose the wrong career.
>>
>>102643276
point me to a nemo finetune that is better. I can only run MM at 2 tps which is excruciatingly slow, but it is so much better than anything else I've tried
my main is noromaid mixtral 8x7b currently which i can run fast enough to actually talk to
>>
>>102636750
great now use housing prices instead of CPI
>>
>>102639204
NV link helps for tensor parallel inference if you are using a lower speed PCIe connection.
>>
>>102643247
On the bright side, when you finally get out of prison, open AI will have a new product for you to try.
>>
>>102643247
>Taken from reddit
This place is really just a xitter reddit aggregate now, huh?
>>
>>102643340
Link?
>>
>>102643301
>M*sk
Why do people who have X derangement syndrome do this?
>>102643334
>Going to prison for a debt
You must be 18 to post here
>>
>>102643348
Just reverse image search
>>
>>102643247
Also this is inspect elements bullshit
OAI changed their billing to prepaid only a long time ago.
>>
>>102643307
Lyra v4
>>
>>102643360
They let you go in the red to prevent cutting you off in the middle of a response I think (same as Anthropic, not sure about the others).

Usually, it's under a dollar though, since it's limited by the maximum response size. Since o1 can spend invisible tokens before giving the limited length answer, I guess it's possible that it gets higher than expected in those cases, and things could go haywire, but yeah, unless anon can prove it, it does look like inspect element.
>>
>>102643201
>>102643237
hell, 24gb with midnight miqu is tolerable
>Midnight-Miqu-70B-v1.5.IQ3_XS.ggu
>CtxLimit:4206/24576, Amt:80/500, Init:0.01s, Process:0.35s (6.6ms/T = 152.11T/s), Generate:8.02s (100.2ms/T = 9.98T/s), Total:8.37s (9.56T/s)
even iq4_xs works a tad slower (6-7T/s)
>>
>>102643247
Man, inflation under Biden got this bad huh
>>
>>102641394
>logs + wizardlm data
ehh, looks pretty sloppy
>>
>add stuff like "dont be horny" "be just friends" "you're not here for sex" in the system prompt and character card
>start a nice, fun conversation
>"this is boring, i'm going to watch netflix and chill. you're welcome to come if you change your mind" *leaves*
why bros, i just wanted a nice chat
>>
How much of your socialization needs do the RP models cover, in your opinion? I feel like I'm spending way less time on 4chan/discord
>>
>>102643528
My only social media is /lmg/ and before it, nothing. I wasn't even on 4chan for 10 years before this.
>discord
go back
>>
>>102643528
None, because I'm a lurker and interacting with a chatbot is more effort than I'm willing to make for a relationship.
>>
>>102642363
>New ooba release this morning
Don't update bros, there's something wrong. Every response is almost identical no matter what parameters you use
>>
>>102643528
None.
It's more like a single player videogame to me.
My socialization happens by way of work, D&D, and other miscellaneous activities with friends and family.
>>
>>102642816
You don't need more vram, the amount of vram included exceeds the buffer needed for efficient usage. What you need is better coders.
>>
>>102643609
but does it finally work with transformers 4.45.*?
>>
>>102643039
Was the main effect of Llama 3.1 405B discouraging people from making 4x3090 builds?
>>
>>102643718
I think most people just aren't willing to pay that much money to do textgen stuff at home.
>>
>>102643612
My aspiration is to replace D&D with a frontend that uses an LLM to generate dialog and descriptions, but still tracks stats and the map / battle AI independently.
>>
File: 1719359483942638.png (474 KB, 796x817)
>>102643247
get out faggot
>>
>>102643774
>oh noes someone browses other sites besides this shithole
>the horror
kys
>>
>>102643784
Go be a faggot on >>>r/eddit/.
>>
>>102643763
Same.
I'm pretty sure I could get 90% of the way there.
I'm just so fucking lazy holy shit.
Alright, maybe not 90%, but a good 75%.
>>
>>102643828
>I'm pretty sure I could get 90% of the way there.
The first 90% is easy.
It's the remaining 90% that's hard
>>
>>102643763
That just sounds like an RPG with extra steps.
>>
>>102642413
too afraid of being sued by groq to launch an api. sad!
>>
>>102643861
The main benefit from the extra steps is avoiding relying on the LLM's narrative sense for what should succeed and what should fail and the pacing of battle and other challenges. That's the main reason I don't just open a chat and write "we're playing an RPG I'm a wizard let's go!"
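A toy sketch of what I mean, with a hypothetical helper (nothing from an actual frontend): the code resolves the roll and the stat math, and the LLM only gets handed the outcome to narrate.

import random

def attack_roll(attack_bonus: int, target_ac: int) -> dict:
    # the frontend decides success/failure, not the model
    roll = random.randint(1, 20)
    return {"roll": roll, "hit": roll + attack_bonus >= target_ac}

result = attack_roll(attack_bonus=5, target_ac=15)
outcome = "hits" if result["hit"] else "misses"
prompt = (
    f"The wizard swings at the goblin and {outcome} "
    f"(rolled {result['roll']}). Describe the moment in two sentences."
)
print(prompt)  # this string would then go to whatever backend you use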
>>
>>102643247
(the premise is) fake, it's $7.68 max for 128k tokens of output
>>
File: twohundredusdollars.png (54 KB, 721x834)
https://platform.openai.com/docs/guides/realtime
>>
>>102643774
Did he ever figure out why?
>>
>>102643637
>but does it finally work with transformers 4.45.*?
yes
And for anyone else: forcing llama-cpp-python back to 0.2.90 in requirements.txt seems to have fixed the issue
>>
>>102643986
...
>>
>>102643986
Still cheaper than building high-end rig for 405B filtered slop generator >>102634081
>>
File: 52562.png (257 KB, 629x480)
257 KB
257 KB PNG
>>102643986
Sam is based
>>
>>102644193
>its o1 model
The one they stole from ReflectionAI?
>>
>Seamlessly include {{char}}'s thoughts and opinions as free indirect speech throughout the narrative.
Holy fucking slop magnet.
>>
>>102644113
https://github.com/abetlen/llama-cpp-python/issues/1773
>>
>>102644210
Why you fossjeets trying to force this reflection scam? It's dead, give it a rest.
>>
llama.cpp needs static linking with rocm.
>>
>>102644230
405b is coming end of month, a little later than hoped due to the sama heist that destroyed the early snapshot, but it's still on track to be the best there is (period)
no amount of coping, seething, or dilating can stop it
>>
>>102644250
You need to start compiling your software.
>>
>>102644085
A/B pricing test.
>>
>>102644085
apparently he triggered several antisemitism fees
>>
>>102644316
>Implying redditor can / will do that
am laffin
>>
>>102644230
>reflection announces their big benchmark-busting models using CoT, which nobody had cared about for more than a year at this point
>the released models do not hold up to the promises, it's as if they were replaced by some bad llama finetunes
>suddenly OpenAI releases their own "reflection" models just two weeks later using CoT, which nobody had cared about for more than a year at this point
what a coincidence
>>
>>102644349
I hope we get to see the movie version of that some day. Sam's goons deleting all copies of Reflection's model and swapping them out for trash so they could swoop in and own the CoT concept by being first to market.
>>
>>102644349
don't forget all the strawberry november hype. it was all vaporware bluffing until reflection.
now i'm not saying they literally stole/swapped the models. but i am saying they had nothing until they stole the idea and implemented it with a simple system prompt because they rushed to jump on it so fast they didn't even have time to finetune one of their models on it
>>
>>102643528
i went from /sdg/ to /lmg/, and it's my new addiction
>>
>>102644415
>finetune
Dunno if you CAN finetune CoT. Even if you could, wouldn't the combinatorial explosion make the model too big vs doing it in context?
>>
>>102644433
>he wasn't around for superCOT
>>
>>102644428
I made the transition a few months ago.
Got tired of being envious of the good gens while I get mid slop. :)
But the models are so much larger. RIP drive space.
>>
>>102643855
Well, if you can hand me the 90% I can get it to 95~98%.
>>
if o1 is literally just a CoT finetune why has no one tuned a competitor yet
>>
>>102644543
SuperCOT.
>>
>>102632446
>►Benchmarks
>Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results
Literally the shittiest benchmark, change to this please https://livecodebench.github.io/leaderboard.html
>>
>>102644543
Because it's not just CoT: https://rentry.org/openai1
>>
>>102644543
That's likely because all the open CoT datasets are stuck in the llama1 era
>>
>>102644543
Because it takes a lot of money and time to trial & error the methods; you can't just dump a bunch of CoT, train, and expect it to work
>>102644583
meds
>>
>>102644543
the data collection process can be pretty resource intensive for good long horizon cot, at least when you're in the stage of training the initial reward model
once you're doing the RL it should be pretty quick though
>>
File: 40bkek.png (29 KB, 519x648)
29 KB
29 KB PNG
40b meme liquid model, the buttons don't even work on sloppabench half of the time, and only rarely does it come up with something halfway decent. Even 3b llama3.2 has better output consistency
>>
>>102644598
You can't do much with animeweebshit-only data restriction.
>>
>>102644543
It's not just CoT. It's an RL model they train on top of it to guide which thought to pick. They've outsourced training of the RL model to thousands of pajeets. Anons with a single 4090 in their garage aren't going to compete.
>>
>>102644583
It's probably just more front end magic like their multi-modality. If you feed chatgpt a picture and tell it to do something with it, it'll do it in two inference steps. That's also why none of the GPT models support image processing via the API when you get to talk to them directly.
That's because it's done by two different models that are chained together through their front end. I bet this is the case with their o1 models as well, where multiple inference steps are run and supervised by a different model.
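Toy version of what that kind of front-end gluing looks like, purely illustrative and obviously not their actual stack (the two functions just stand in for whatever vision and text backends you'd run):

def describe_image(image_path: str) -> str:
    # step 1: a vision model turns the picture into text (stubbed here)
    return f"(caption of {image_path} from the vision model)"

def answer_text(question: str, context: str) -> str:
    # step 2: a plain text model works off the caption and never sees pixels (stubbed)
    return f"(text model's answer to '{question}' given: {context})"

def chat_with_image(question: str, image_path: str) -> str:
    caption = describe_image(image_path)    # inference step 1
    return answer_text(question, caption)   # inference step 2

print(chat_with_image("what's in this?", "cat.png"))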
>>
>>102644593
what do you mean by meds? are you being paid to spread misinformation?
>>
Claude c1 (cranberry) is going to be AGI.
>>
>>102644650
>next token predictor
>agi
meds
>>
>>102644598
Shit in, shit out. Or is it [D]ifferent this time? :)
>>
>>102644636
I'd agree but prompt caching still cuts the input price by the same amount as the other models. It really does seem to be RL guided output tokens we aren't allowed to see and that's it.
>>
>>102644661
Please troll the chuds for me again soon, it's been over an hour since your last xeet
>>
>>102644636
I doubt it's that simple. They clearly did something unique here, because the model doesn't end up in a loop like usually happens when you make models talk to each other.
>>
>>102644583
>Given the time constraints
huh
>>
>>102643986
Damn, how come audio is so much more expensive? Does audio take a ton more tokens to represent a single text token or something?
>>
>>102644583
>it lists letter pairs in "mynznvaatzacdfoulxxz" right the first time but then second guesses itself and lists them wrong in three different ways before finally going back to doing it right
that'll be an extra $0.25 plus tip for the output tokens :^)
>>
File: 3b.png (583 KB, 1505x513)
583 KB
583 KB PNG
>>102644665
>>102644623
meanwhile llama3.2-3b
>>
>>102644561
a competitor
>>
>>102644740
It could be cheaper, but nobody offers a decent alternative at the moment so OpenAI is banking on people paying them money hand over fist.
It's the same reason o1 is so expensive and prompt caching reduces the price by a factor of 2 rather than 10 like Claude.
>>
>>102644748
The first time was wrong though, it separated as "l x" "x x" "z"
>>
>>102644740
Just the raw data in a second of audio is orders of magnitude bigger than the handful of text tokens you could read in that same second, so it takes way more tokens to represent.
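Back-of-envelope with made-up numbers (nobody outside OpenAI knows the real codec frame rate, so both figures below are pure assumptions):

audio_tokens_per_sec = 50   # assumed neural-codec frame rate, purely illustrative
text_tokens_per_sec = 3.3   # ~150 wpm speech at ~1.3 tokens per word
print(audio_tokens_per_sec / text_tokens_per_sec)   # ~15x more tokens per spoken second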
>>
>>102644791
that's not the first time retard
>>
File: image.png (39 KB, 470x850)
39 KB
39 KB PNG
>>102644791
that was the second time
actually, looking closer, the reason it thought it was wrong is that it miscounted the letters: it thought there were 22 instead of 20, so it figured it had missed some and then started spelling it wrong a bunch of times to make it fit, and the rabbit hole it goes down takes up like a fourth of the (paid-for) output lmao
>>
>"There are plenty of reasons you might want a local model, but it's not a "this year" kind of thing."
based sama
>>
File: file.png (20 KB, 679x49)
20 KB
20 KB PNG
>>102643232
>>
So why would I use NVIDIA's model instead of the superior Molmo for 72B vision?
>>
>>102644826
You wouldn't. It's not better than Molmo and it's based on Qwen 2, not 2.5.
>>
>>102644818
Translation: after OpenAI has dominated the consumer market and no other competition remains, we'll give you GPT-3.5 Turbo if we feel like it.
>>
>>102643986
>paid api account with long billing history
>realtime model not showing up yet
it's over
>>
>>102643763
>>102643828
>>102643855
>>102644519
>My aspiration is to replace D&D with a frontend that uses an LLM to generate dialog and descriptions, but still tracks stats and the map / battle AI independently.
I think this would be a really great /lmg/ project. It seems to be a pretty common desire around here.
Something that's a pragmatic mix of agents run by different model sizes along with classical CS techniques to make a kickass infinite RPG system for local.
Hell, how many of us are hoarding bits and pieces already?
Maybe I'll set up a github repo. I'd be up for it as long as we swear to never use discord.
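Rough sketch of the model-routing part, assuming llama.cpp-server-style /completion endpoints on different ports (ports, task names, and which size goes where are all placeholders):

import json, urllib.request

# big model for prose, small model for cheap background jobs; the deterministic
# stuff (dice, stats, pathfinding) never touches an LLM at all
ENDPOINTS = {
    "narration":  "http://127.0.0.1:8080/completion",  # 70B-class
    "npc_banter": "http://127.0.0.1:8081/completion",  # 3B-class
    "summarize":  "http://127.0.0.1:8081/completion",  # reuse the small one
}

def run_agent(task: str, prompt: str, n_predict: int = 200) -> str:
    payload = {"prompt": prompt, "n_predict": n_predict}
    req = urllib.request.Request(
        ENDPOINTS[task],
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["content"]

print(run_agent("npc_banter", "The innkeeper greets the party:"))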
>>
>>102643986
Sure I trust OpenAI with my credit card nu-
>>
102644921
>millions of paying customers vs one schizo lmg anon with a fake image
damn, not sure which to trust
>>
>>102644921
Goddamn that's like
10000 cheeseburgers
1000 video games
100 PS5s
10 full house payments
One 5090
>>
>>102644266
No. To compile against AMD ROCm you have to install amdgpu from AMD's website, and it breaks all the time.

To use ROCm, you just need the app to have static linking; then your distro's amdgpu (included) works great.
>>
>>102644893
Sounds like a good idea indeed. I have nothing but if you create the repo I will seriously think about maybe contributing once there's something minimally working.
>>
File: 1705821897136793.png (177 KB, 812x836)
177 KB
177 KB PNG
>>102644921
You have to go back faggot
>>
File: 1709208664241894.png (8 KB, 411x115)
8 KB
8 KB PNG
>>102644956
Also picrel.
>>
for those of you who've tried them, what do you pick between
qwen-2.5-72B
midnight-miqu-70B
llama-3.2-90B
llama-3.1-70B

i've tried the first two, and miqu seems better so far
>>
>>102645000
yeah miqu easily
>>
>>102645000
Midnight Miqu, no doubts.
>>
https://x.com/NickADobos/status/1841167978085433351
>>
>>102644583
Did this output get leaked accidentally? On Reddit it says it came from https://openai.com/index/learning-to-reason-with-llms/, but the ones I see there are much simpler.
>>
>>102645024
how can you be so new
it's clearly fake
>>
File: file.png (155 KB, 1203x929)
155 KB
155 KB PNG
>>102645024
it's there for me
>>
>>102645005
>>102645006
thx
>>
>>102645054
retard
>>
>>102645071
Yeah I'm retarded, I see it now, thank you.
>>
>>102645080
>>102645080
>>102645080
>>
Is there any way to get KoboldAI and/or a model (in this case, LLaMA2-13B-Tiefighter.Q4_K_S.gguf) to stay under a certain character limit? Say I want to have it shitpost on Twitter and need it to stay at 280 characters or less.
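The only thing I've thought of so far is capping max_length and retrying/trimming in a wrapper around the API, something like this (field names are what I think the kobold /api/v1/generate endpoint takes, not double-checked), but is there a cleaner way?

import json, urllib.request

def gen_tweet(prompt: str, max_chars: int = 280, tries: int = 3) -> str:
    url = "http://127.0.0.1:5001/api/v1/generate"    # default koboldcpp port, I think
    text = ""
    for _ in range(tries):
        payload = {"prompt": prompt, "max_length": 60,   # ~60 tokens is usually under 280 chars
                   "temperature": 0.9}
        req = urllib.request.Request(url, data=json.dumps(payload).encode(),
                                     headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as r:
            text = json.loads(r.read())["results"][0]["text"].strip()
        if len(text) <= max_chars:
            return text
    return text[:max_chars]                           # give up and hard-truncate

print(gen_tweet("Write a one-sentence shitpost about mechanical keyboards:"))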
>>
>>102644945
>To compile amd rocm you have to install amdgpu from amd's website.
No you don't?
>>
>>102645107 (me)
I should clarify that
>No you don't
may not apply to Debian, which uses an ancient version of LLVM that can't compile code for RDNA3 GPUs. In that case install Ubuntu in a Docker image or something.
>>
>>102644583
It's train-of-thought.
>>
>>102644945
nta. What you need is the dev libraries, provided by your package manager. Unless you're on slackware 14.2 or something like that. Fuck. I can build with vulkan on fucking openbsd.
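FWIW the HIP build line I use on llama.cpp is roughly this (flag names have moved around between versions, older trees used LLAMA_HIPBLAS; -DBUILD_SHARED_LIBS=OFF only makes the ggml/llama libs static, the ROCm runtime itself still links dynamically; set the gfx target to whatever your card actually is):

HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
  cmake -S . -B build -DGGML_HIPBLAS=ON -DAMDGPU_TARGETS=gfx1030 \
        -DBUILD_SHARED_LIBS=OFF -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release -j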
>>
>>102632446
Destroying the lawn with Teto


