/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103153308 & >>103135641

►News
>(11/08) Sarashina2-8x70B, a Japan-trained LLM model: https://hf.co/sbintuitions/sarashina2-8x70b
>(11/05) Hunyuan-Large released with 389B total and 52B active parameters: https://hf.co/tencent/Tencent-Hunyuan-Large
>(10/31) QTIP: Quantization with Trellises and Incoherence Processing: https://github.com/Cornell-RelaxML/qtip
>(10/31) Fish Agent V0.1 3B: voice-to-voice and TTS model: https://hf.co/fishaudio/fish-agent-v0.1-3b
>(10/31) Transluce open-sources AI investigation toolkit: https://github.com/TransluceAI/observatory

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png (embed)

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
Long live the queen of alive /lmg/.
The Qwen guys released their paper; we'll finally see what secret sauce they used to make their 32B coder model so fucking good.
https://arxiv.org/pdf/2409.12186
PSA: Petra/blackedmikuanon/kurisufag/AGPL-spammer/drevilanon/2nd-belief-anon/midjourneyfag/repair-quant-anon is from... SERBIA
https://archive.4plebs.org/pol/search/uid/QmNRftdq/page/1/
>>103164618Having a melty already?
>>103164618that's why you use memeflags on /pol/ kek
>>103164618as one (and only one) of those people I can confirm that this faggot is a legit schizo
>>103164129
>ask for GPT slop
>receive GPT slop
>ask it to not do that
>receive GPT slop
AGI?
migrate
>>103164659
>>103164659
>>103164659
I still can't wrap my head around how mentally ill the usual baker is...
Also other than triggering that sperg I am happy to do the public service of confusing the shit out of all the newfags.
maybe we could make a neutral thread without any mascots?
>maybe we can discuss local coombots without any troons?
jej, even zozzle
>>103164798
>/g/
>no troons
You all brought it upon yourselves.
>>103164575
I like this OP image. Having a Makise thread every now and then isn't that bad.
I will stay in this thread if Petra doesn't decide to be more of a nigger
>>103164618wtff!! he is based fun enjoyer! how horrific!!!
>>103164618Makes sense he uses the same images over and over again.
The 'ick 'ecker added some things to his voice cloner.
I literally use her for everything now.
why did that guy split the thread?
>>103165278kurisufag is a notorious shitposter
>>103165278
anime obsession and prolonged HRT intake take a big toll on your mental wellbeing.
>>103164575IM SO FUCKING CONFUSED WHICH ONE IS THE REAL THREADAAAAAAAAAAAARRRGGHHHH
>>103164618rent free
>>103165310and a ritualposting baker who has a meltdown over OP pictures and doxxes people is better?
>>103164575>kurisu OPYeah, this is the thread
>>103165455
NTA, but keep in mind he always samefags for optics; that alone already makes him a mentally ill schizo.
>>103165373How can you be so new
>>103165455I agree that mikubaker samefags for optics.
>>103165455Here it is >>103165466
>>103165466trvke
>>103164609VGH she's such a gem
>>103164575
I know you're a troll spammer who doesn't give a fuck about Kurisu but, damn, I hope the remake will be good. I love Steins;Gate.
https://youtu.be/dmmnx4VQmPU
>>103165502>>>/a/
>>103165535kek, saved.
>>103164575the troon is back I see. odd he usually only pops up on a large release
>>103165632
Uhm... xe is always here spamming anime pics and melting over non-Miku OPs though.
finally, a good OP
>>103165089He looks like this BTW
>>103165535kek that's my gen I posted on /ldg/ a few months ago :v
>>103164575omg it kurisu
For what it is worth this thread revitalized /lmg/ by forcing the ritual poster to samefag and pretend to have a discussion.
>>103167225overfitted nothingburger
still no ministral support...
Thanks for the OP.
I got Mistral 7B on kobold and SillyTavern working, my first local LLM use. I checked the answers to a few questions against GPT-4o and I was happy with the answers.
So, thanks to the Rentry people.
>>103165502I'm hoping for an anime remake to introduce zoomers to the series
>>103167561doesn't it already work fine with llamacpp?
>>103167561huh? FFTing seems to work and EXL2 works if you use the HF conversion to quant.
>>103168591
>FFT
What? Also I would expect exl2 quants to be on HF, but nope. ggoofs are obviously broken even with the HF conversion.
https://github.com/t41372/Open-LLM-VTuber/
I got this semi-working with whisper.cpp, Bark, and ollama with Llama3.2-vision (11B). I say semi-working because it listens and generates one response, then stops. Something in the front-end code isn't working; if I reload the tab I can get another response. I might investigate more later.
whisper.cpp works very well. Had to manually generate a CoreML model, which was a moderate pain; the scripts ggerganov and friends make always seem so half-baked, but they're actually building shit and it mostly works, so I should stop complaining.
I chose Bark for TTS because it can do code-switching, but the responses take ~2 minutes to generate on my M1 Pro, so it's not usable until they add GPU or CoreML support for Mac. There's no issue open for it though, so I'm not holding my breath.
The Live2D seems to work well with the lip sync, though I'm not sure yet if expressions are working.
There's also this fork https://github.com/ylxmf2005/LLM-Live2D-Desktop-Assitant which seems to add a bunch of features and has a better Elaina-flavoured TTS. I haven't tried it at all.
Anyone else give it a whirl?
>>103168721hi petra
>>103168782the penguin from nijisanji?
good rape prompts for mistral large?
>>103168625
FFT = full finetuning
EXL2 = https://huggingface.co/lucyknada/prince-canuma_Ministral-8B-Instruct-2410-HF-exl2
I was using these and they worked.
>>103166540great gen!
$2,000 USD for a 64GB Mac mini...anyone try Llama 3.2 Vision 90B yet?
>>103164575
>>103167339
What is the best micro model for writing creative text snippets that is licensed for commercial use? I'm building a game and I want it to run an LLM to write descriptions of NPCs and objects based on stats.
Looking for maximum speed even on mid-range cards. I was considering Llama-3.2-1B but the license is restrictive.
Is there something like Mistral at 1B?
it's fine to bump this right..
>>103168721
Man, a single component of these can be enough to make you work on it for weeks before getting proper and stable results. That's not the kind of project you can put together in an afternoon.
t. fixing sovits for three weeks rn
>>103169708
Just use 3.2-1B to see if the idea is worth it. Once you know it works, worry about the license. If you can do with something much dumber but still fast that you could eventually train on a dataset for your task, look at
>https://huggingface.co/allenai/OLMoE-1B-7B-0924-Instruct
or OLMoE if RAM is not an issue. Works just as fast. IBM also released much smaller MoE models you could try; I don't remember their license. Apache, probably.
https://huggingface.co/ibm-granite/granite-3.0-3b-a800m-instruct
https://huggingface.co/ibm-granite/granite-3.0-1b-a400m-instruct
You won't find a model that works 100% consistently, just good enough.
>>103168914
It seems to work the same way as GGUFs: OK at first, but once you fill up the context it becomes incoherent, so exl2 probably doesn't have the implementation for that SWA yet. I mean, if it did, turboderp would probably make quants himself.
https://huggingface.co/anthracite-org/magnum-v4-27b
I copy-pasted the instruct template text into magnum.json and a portion of it is glowing red, and Tavern does not see this .json file (the file for the context template is visible).
Also, what's the sampler preset for magnum?
>>103173081chatml
>>103173081
>>103173122 (cont)
Specifically about that error, the " needs to be escaped. They cannot be trusted with a fucking JSON file.
>>103173166Like this?What sampler preset is recommended for magnum?
>>103173352
NTA, probably just escaping the quotes so they're not interpreted as ending the value of system_prompt:
\"!\" and \"~\"
>>103173352
Yeah. This: >>103173361
I don't use it, but don't worry about the presets. Start with everything neutralised and tune as you see fit. Experiment. Or use
>Sampler visualizer: https://artefact2.github.io/llm-sampling
to get a more intuitive understanding of what they do.
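To make the escaping concrete, here's a quick sketch, assuming a SillyTavern-style template file with a system_prompt field (the field name and prompt text are just illustrative). Letting json.dumps produce the \" escapes beats hand-editing them:

```python
import json

# A system prompt containing literal double quotes, like the "!" and "~"
# above; pasted raw into a .json file, these would end the string early
# and turn the rest of the entry red in the editor.
system_prompt = 'Avoid overusing "!" and "~" in replies.'

# json.dumps escapes the inner quotes as \" so the file stays valid JSON.
entry = json.dumps({"system_prompt": system_prompt})
print(entry)

# Round-tripping shows the escaped form parses back to the original text.
assert json.loads(entry)["system_prompt"] == system_prompt
```

Same idea applies to any field in the template file: build it programmatically or at least validate it with json.loads before pointing Tavern at it.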
>>103173409thank you
Any good voice-based memes lately? The Star Wars and Richard Nixon shit was amazing.
New claude slops in the new generation of ERP sloptunes:
>i'm not some common harlot/whore (in response to anything inappropriate, though recent sloptunes rarely ever deny you)
>don't you dare fucking stop
>make me scream
>make me yours/mark me as yours
nakurisudashi
why are there 2 threads?
Is a local model with the intent of using it as a programming assistant actually worth it? Or are they all shit compared to the openai/anthropic alternatives?
>>103174122Test it yourself. Apparently, qwen coder 32b is pretty good. If not, go back to whatever you like most.
>>103174122they are shit and not worth it for anything other than story writing or cooming
>>103174122I use deepseek's online chat and it's not bad.
>voice cloning still sucks/lmg/ was mistake
>>103174201But they are shit and not worth it for story writing or cooming either...
>>103174122
Qwen2.5 32B coder. It's 90% of the way to the best enterprise models and can RP while coding.
QRD on thread split?
>>103174407is it censored?
>>103174430
No, unlike how 2.5 72B chat was.
>>103174412trannies throwing a fit, ignore
>>103174412
Like the other anon said. Two tranny camps war over the OP pic with their FOTM waifu of choice; happens every time the OP makes a non-Miku thread.
>>103174386I mean... You aren't wrong.
>>103174412The same as always. The right thread is always the one with the recap btw.
>>103174434So is it actually good for cooming?
>>103174553t. mentally ill mikutranny
>>103174589>>103158694
>>103174603Local model ERP has taught me what purple prose is. And it has taught me that I absolutely hate it.
>>103174623Well that was with a system prompt telling it to be vivid / use all senses.
>>103174646
I don't think prompts can do that much, especially with the context filled up. It will always drift back to the default model style, which is always purple prose poetic slop.
Qwen2.5 32B coder one shot tetris for me btw.https://files.catbox.moe/heo220.py
>>103174742
>knows what tetris is
>knows how to code it
>doesn't know how to suck dick the way I want it
Current year dystopia personified
>>103174700
They DO do that much if you have an even slightly competent model. Tell it to write in the style of a somewhat popular author and watch.
>>103174763Nothing gets me off like my waifu coding me games on the fly.
>>103174810I want to prompt "send nudes" and get omnimodal-generated nudes immediately without going through hoops of setting up a gymnastics pipeline
okay, feed me people
how does one implement memory if you make anything? I've heard that people just tell the AI to make its own memory?
>>103174833We are only 2 years in, give it another.
>>103174838
Have the model (or a different, smaller model) summarize whatever needs to be remembered, add it to an embeddings database, query the database for relevant information, and inject it into the model's context when needed. RAG, basically.
If none of that made sense, read up on RAG. You'll have to code the stuff together or use something like LangChain. It's not something that can be fed through posts.
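The summarize-embed-query-inject loop can be sketched in a few lines. This is a toy: a Counter as a bag-of-words "embedding" and cosine similarity for retrieval. A real setup would swap those for an embedding model and a vector database, but the flow is the same. Every name here is made up for illustration:

```python
import math
from collections import Counter

def embed(text):
    # Toy embedding: a bag-of-words count vector.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two count vectors (missing keys are 0).
    dot = sum(a[k] * b[k] for k in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

class MemoryStore:
    def __init__(self):
        self.entries = []  # list of (embedding, summary) pairs

    def add(self, summary):
        # In practice `summary` would come from a small summarizer model.
        self.entries.append((embed(summary), summary))

    def query(self, question, k=2):
        ranked = sorted(self.entries,
                        key=lambda e: cosine(e[0], embed(question)),
                        reverse=True)
        return [s for _, s in ranked[:k]]

store = MemoryStore()
store.add("The user's cat is named Miso and likes tuna.")
store.add("The user works night shifts on weekends.")

# Inject the most relevant memory into the model's context window.
relevant = store.query("what is the cat called?", k=1)
context = "Known facts:\n" + "\n".join(relevant)
```

The ugly part anons complain about is exactly here: retrieval quality is capped by the embedding and by how you phrase the query, which is why results degrade when the question doesn't share vocabulary with the stored summary.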
>>103174838>how do you solve the AI gf problemWe wouldn't be posting here if it was already solved.
>>103174929
That sounds basically like what I had in mind: a faster model to summarize things and save them, then query it.
Though it all sounds so ugly. The project I am on right now already does this in plenty of parts, where instead of making algorithms in code it just asks the model. I guess this is the future now, huh?
>>103174991
>though it all sounds so ugly
Yeah. It's as reliable as the models used. Never bothered to make something like it, but it'd still be interesting. Maybe one day...
>where instead of making algorithms in code it just asks the model. I guess this is the future now, huh?
If you meant what I think you did, that's not the case for me. I like programming. I take pride in figuring stuff out on my own, even if my implementation is less than optimal.
>>103175149
Tesla apparently replaced much of its C++ code with just asking the AI for results, so I meant this is basically what the industry and everything else is going to move towards.
>>103174929I believe RAG is another grift that tries to sell an alternative to continuous learning. It's a dead end in the long term. But somehow everyone is shilling it, I attended an Nvidia seminar last month and they talked about RAG like it's the holy grail
>>103175233NTAhow does RAG differ from continuous learning?
>>103175186Maybe. I'd trust that statement much more from someone who *isn't* selling AI. There's people that often do long divisions or look for words in a physical dictionary. Some people repair their own cars, draw and play instruments. They're not having fun while programming. I do.
>>103175233>I believe RAG is another grift that tries to sell an alternative to continuous learningI see it as the best thing we have *until* we get continuous learning, if we ever do. It cannot be a replacement for something we don't have.
What a rebel model, what about my fucking python script?
What's the smallest/fastest uncensored model that can summarize a 22k-context multi-part story? I tried Dolphin Nemo and it failed spectacularly: in every attempt it started inventing plots that didn't exist at all. Dolphin finetunes have been good for me in the past, but the card does say "The base model has 128K context, and our finetuning used 8192 sequence length", so I'm not sure if that's the issue or if Nemo is just too stupid for this; I didn't try the normal instruct yet. I don't care much about roleplay flavor enhancers, but I'd prefer a model decensored in a way that causes as little brain damage as possible. Mistral Small Instruct seems to remember the story at first try with 80% accuracy (it forgot one part).
>>103175320
Check your email, anon.
Look at how coding sensei does it, btw. You need to tell it to give requested scripts/code in code blocks.
>>103174767
>They DO do that much if you have an even slightly competent model.
I asked qwen coder to write in the style of an ERP forum user and to avoid purple prose and flowery language. I asked it to give me 3 different ways a character would talk. And after I saw all the shit I despise, I told it:
>It is all so poetic...
And what I got in return is:
>Glad you like it! Now let's continue blah blah blah
At least I had a chuckle at how completely autistic the model is.
What the fuck are you guys even saying? Is it even English? Half the words you use don't make any sense. I wonder if this is how normies felt about me talking about anime back in 2008.You guys are weird. I would shove you in a locker if I could.
>>103175683>2008Akira was at the end of the 80s and GitS came out before 2k. Oh.. you were in school in 2008? I see....
>>103174742Get it to make Tetris but with circles that can roll around if they are jostled by another circle landing nearby.
Don't mind the retards >>103174929 >>103174962, they never read papers, as always.
Here is your solution without summarizing: https://arxiv.org/abs/2409.05591
>>103175683The only thing you're shoving is groceries in my bagSpeaking of, you should probably get back to work
>>103175759I'm talking about things available now that you can do with any good-enough model. That one you linked requires training.
>>103175852You need training to make a memory model, not to use it. You feed it your dataset and link it to whatever model you want as a generator.
>>103176056
NTA
Are there available "memory models" that one can just use? If not, should one just stick with RAG?
>>103175500
Dolphin finetunes have always been overhyped and pretty bad in my experience. I still remember when people were praising their Mistral 8x7B tune, only to figure out that the finetune script everybody was using was broken.
Try the official instruct finetune of Nemo. It should be able to cope with 22k of text without much weirdness if you use greedy sampling, don't inject unnecessary instructions into the context, etc.
Failing that, DeepSeek on their website does pretty well with long, long texts.
>>103176056
There's a million papers, with a million demos. That's all they are until they're taken seriously either by big model makers or inference software devs. The former doesn't guarantee it, less so the latter.
RAG, as clunky as it is, can be duct-taped together with any inference software that supports embeddings.
>>103176228>Dolphin fine tunes have always been over hyped and pretty bad in my experience.You are talking about the famous AI researcher Eric Hartford who once said that frankenmerging l3 70B with itself makes it incredibly intelligent and humanlike. Or something like that.
>>103164618Based.>>103164678Ultra based.>>103173457>>103173457>>103173457Giga based.
>>103174359The devs for MaskGCT said they're going to include long form audio.https://github.com/open-mmlab/Amphion/issues/290
>>103176213
Yeah, they are there: https://huggingface.co/TommyChien
Read the paper to understand the difference.
>>103176439
RAG is dumb af, as they explained in the same paper. It doesn't know what to retrieve from the memory, which leads to worse generation.
>>103176228So is it always going to be choice between censored with good memory or decensored with bad memory? Is there no way to decensor a model without damaging it?
>>103177154
Nemo-instruct isn't censored. At least it never refused anything I asked of it. But if you want a finetune that seemingly didn't make the base model any dumber, try Rocinante v1.1. I can't really speak for models other than Nemo, as that's about the largest thing I can comfortably run.
Actually, that's not true. Mixtral 8x7B instruct might actually be able to do what you need too. Or Command R, although I don't remember how big the context window on that one is, but I tried it (at excruciatingly low speeds) and it was really good.
>>103177034
>RAG is dumb af as they explained in the same paper.
Irrelevant to the original point and your first post in the chain. Just look at how anon asked the question: >>103174838. Does it sound like he has any idea of what he's talking about? I gave him *an* answer to his question.
MemoRAG IS RAG.
>>103177196Sure? Whatever floats your boat, I guess.
>>103177240Same boat, mate. Same boat.
Aside from just having long context, the next most proper way to do long-context memory would be to somehow divide the work into pieces/layers that can be processed in parallel (or sequentially) and then merged in a way where the entire context gets to affect the generation of the next token. Though I don't know if this is even possible. Searching through a web of information might be the only solution, but that has the risk of disregarding important nuances.
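The "divide, process in parallel, merge" idea is at least approximable today as plain map-reduce summarization. A toy sketch, where compress() stands in for an LLM summarization call (here it just keeps each chunk's first sentence; every name in this snippet is illustrative):

```python
from concurrent.futures import ThreadPoolExecutor

# Map-reduce summarization: split the context into chunks, compress each
# chunk independently (the parallelizable step), then compress the
# concatenated partial summaries so every chunk influences the result.

def compress(chunk):
    # Toy stand-in for an LLM call: keep the chunk's first sentence.
    return chunk.split(".")[0].strip() + "."

def chunked(text, size=200):
    return [text[i:i + size] for i in range(0, len(text), size)]

def hierarchical_summary(text):
    pieces = chunked(text)
    with ThreadPoolExecutor() as pool:       # "processed in parallel"
        partials = list(pool.map(compress, pieces))
    return compress(" ".join(partials))      # "merged together"

story = "The hero finds a key. " * 40        # stand-in for a long RP log
summary = hierarchical_summary(story)
```

The obvious weakness is the one noted above: each compress step can drop nuances, so whatever the reducer never sees can't affect the next token.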
It'll take a while for the dreamers and the hypemen to admit that ML isn't going to create a god, and some stocks will get hurt bad when it finally sinks in, but longer term it'll be good for a more realistic discourse around LLMs. A lot of the safety bullshit will dry up once people are forced to accept that they're going to cap out at "useful assistants" and can never become a competitor species.
What's an entry-level development machine build for local models?I'm considering 1 3060, upgradeable to 2 or more 3060s but would it be worth it if they're relatively cheaper or is it better to go for a better single GPU build?What models? No clue yet.
>>103164575Why are LMs more woke than Chatgpt?
>>103177573Bad idea. Get used 3090s or other 24GB+ cards. You'll only have so many slots / motherboards / power connectors.
>>103177597This has only been true since around June when OpenAI started releasing looser finetunes and giving it a more entertaining personalityhopefully other labs will take cues from them
>>103177573
Vague question. For reference, CUDA Dev has picrel monster.
You can develop inference software on just a CPU if the models you're working on are small enough. And you can train 100M models as well, but it'll still be slow and tedious. If you can buy a single GPU, buy a 3090.
You need to be more specific than "how to AI".
>>103177560Current models are limited by context and having to start every problem from scratch. As soon as infinite memory is solved, LLMs will be able to consider every detail in the world, generate new information and use it for vastly superior problem-solving compared to humans, scaling linearly by compute you throw at it.
>>103177632
Talking about cuda dev, he posted in the real thread, btw: >>103177202
Unlike you, he's not too stupid to tell which one is the real one, it seems.
>>103177679You seem to be in love with that codemonkey, go and suck him off i guess
>>103177632>6x4090 on a single PSUNorth America bros... why did we get such shitty electrical standards?
>>103177736>SilverStone has already done so with the HELA 2050; as its model number implies, it can deliver up to 2050 W of power with 230 V input. With 115 V input, it is capable of providing 1650 W since standard wall sockets cannot deliver more than 15 A.haha
>>103177608
>>103177632
I appreciate the advice. I know I should be more specific; I've already run Ollama on an i7-4790 CPU-only and it was meh.
I have a potential client who's the kind to jump on every bandwagon (the guy bankrupted one of his companies moving to cloud), and now that he wants AI I'm trying to squeeze a dev box (3090/4090) out of him. I don't really do much AI day to day, so if I end up building it out of pocket I want to go as cheap as possible.
From my knowledge of his different businesses it'll probably be OCR/computer vision or support chatbots, but we haven't gone through the specifics.
>>103177679Wait. People can use multiple threads at the same time? No waaaaaaayyyy
>>103177850
Get whatever allows you to upgrade the most if needed. A 12GB GPU will become clutter and wasted money if you need to go monster build; buying a mobo with 4 DDR4 RAM slots will be a waste if you need to go CPUmaxxing... you get the point. But not yet.
Practice vision stuff with small models like
>https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5
or whatever you can already run on your setup. If you're going for a support bot, just run llama-3.2-1b until you have a UI to show. Figure out what you can do with simple tools. Then, *once you know you can do it* and you tell him your realistic expectations, ask for a budget for a big build where you can develop more comfortably, safe in the knowledge that the product you make cannot possibly be worse than your demo.
He's a grifter. I hope you aren't.
>>103177679You really are mentally ill. And I am not even trying to insult you at this point.
>>103178135Yeah, there's zero value in adding AI to anything that guy does but might as well get something out of it.