/g/ - Technology


File: ComfyUI_01194_.png (3.89 MB, 1536x2304)
/lmg/ - a general dedicated to the discussion and development of local language models.

Tuesday is Over Edition

Previous threads: >>103545710 & >>103536775

►News
>(12/17) Falcon3 models released, including b1.58 quants: https://hf.co/blog/falcon3
>(12/16) Apollo: Qwen2.5 models finetuned by Meta GenAI for video understanding: https://hf.co/Apollo-LMMs/Apollo-7B-t32
>(12/14) CosyVoice2-0.5B released: https://funaudiollm.github.io/cosyvoice2
>(12/14) Qwen2VL support merged: https://github.com/ggerganov/llama.cpp/pull/10361
>(12/13) Sberbank releases Russian model based on DeepseekForCausalLM: https://hf.co/ai-sage/GigaChat-20B-A3B-instruct

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/hsiehjackson/RULER
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
►Recent Highlights from the Previous Thread: >>103545710

--Paper: FlashSparse: Minimizing Computation Redundancy for Fast Sparse Matrix Multiplications on Tensor Cores:
>103547272 >103547808
--Papers:
>103547261
--Intel Arc updates and LLM running solutions:
>103552251
--Falcon3 family of open models and their performance:
>103547725 >103547743 >103547837 >103547787 >103547898 >103549931 >103547788 >103547888 >103548124
--Anon discusses and compares text-to-speech models, including CosyVoice2:
>103546353 >103546456 >103546944 >103547034 >103547061 >103547688 >103547800 >103553458 >103553621
--Anons discuss a suspicious RTX 4090 listing on AliExpress and share their experiences with Chinese online marketplaces:
>103550949 >103551009 >103551689 >103551775 >103552164 >103552204 >103551035 >103551205
--Discussion on the effectiveness and comparison of bitnet models:
>103553433 >103553448 >103554089 >103553456 >103553486 >103553570 >103553599
--Impact of switching from FP16 to int8 inference on model accuracy:
>103546155 >103546208 >103549263
--Anon seeks dust proofing solutions for open mining rig with 3090s:
>103553137 >103553183 >103553339 >103553354
--Regex and small model approaches to rewriting sentences:
>103549331 >103549353
--Gemma 2 9B model's performance in creative writing tasks:
>103546296 >103546512
--Llama.cpp Vulkan updates and Nvidia involvement:
>103550656
--FOSDEM 2025: Quantization in llama.cpp:
>103550704
--Anon asks about running Linux with Windows VM for gaming and LLM use:
>103549612 >103549760 >103549709 >103549854
--Anon gets Cosyvoice 0.5b working, shares audio sample:
>103547577 >103549538 >103554651
--Anon discovers speculative decoding for speedup:
>103549662 >103549673 >103549762 >103549842 >103549866 >103549863 >103549952
--Miku (free space):
>103546325 >103548490 >103548592

►Recent Highlight Posts from the Previous Thread: >>103545718

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
Is EVA a meme or is it actually the SOTA RP model?
>>
>>103554963
It's legitimately amazing.
>>
File: 1734104063734683.webm (449 KB, 1280x692)
"You're a MI6 mathematics specialist. One day you receive a satellite phone call from a unit operating on the ground in the enemy territory (six people, each can drive a truck). They say you need to help them ASAP. They managed to steal ten trucks, each with full fuel tank. They can easily fit into one truck, but have no spare fuel canisters, and can't transfer fuel to one truck only. But they managed to obtain some hose, so they can transfer fuel from one truck's fuel tank to the other trucks fuel tank or tanks, but only if there's room there. They ask you how they should use these ten trucks to get away as far as only possible."

Not even o1 pro PhD can solve this incredibly complex problem.
>>
>>103554963
It's shit a priori.
>>
>>103554963
It's a meme. Just lurk on the thread and notice how no one shares a single interesting log using it.
>>
>>103554963
It's good but schizo, gotta clamp down the temp.

Now, you're gonna dismiss this because it's a 9B but I legit suggest trying this: >>103546296
>>
I suspect that most of the hype for EVA just comes from the skillets of /lmg/ experiencing a model for the first time with a decent sampler setup considering all the hard work llama3.3 anon did.
The model itself isn't really anything special but /lmg/ can't come up with proper sampler settings for shit so having them spoonfed like this tricks all the skillets into believing that they're running local claude until the honeymoon phase wears off.
>>
>llama3.3 anon
lol
lmao even
Now i'm convinced this guy has some mental issues, or is just desperate for attention.
>>
>>103554963
>SOTA RP model?
That's still Largestral
>>
>>103555065
Largestral is dry and boring, even Nemo is more interesting.
>>
>>103555026
I don't use that guy's settings and I think he is an annoying retard, but the model is legitimately very good imo, I would put it up there with largestral and tunes thereof and it's way smaller and less demanding to run
you shouldn't be put off it because some attention seeking fag decided to make it his thing
>>
>>103555071
Can't argue with that but I prefer its smarts over 70/72bs forgetting basic shit in the middle of a roleplay
>>
>>103555050
"Now"? I assume you are new here
>>
>>103555026
It's mostly organized shilling. We've seen that with anthracite a few months back. Best models ever, presumably. Now that they're out of free compute and they've got their name out, it's some other discord clique's turn to repeat the same and leech off the local LLM user community.
>>
>>103554963
Hating on it without having ever touched it is more of a meme at this point. Specifically the "STOP HAVING FUN" meme. Some people are compelled to hate on things just because someone else likes them, I guess.
That being said, I don't know if it's absolute SOTA, since I can't run models larger than 70B, but I definitely consider it the best 70B we have right now.

>>103555026
I don't know if I would describe fucking around with it and documenting it in the occasional post as "hard work", really.
>>
>>103555026
That guy's sampler setup is completely retarded though
>>
Miqu is better than EVA 3.33, and no one can prove me wrong.
>>
>>103554976
>They ask you how they should use these ten trucks to get away as far as only possible.
>as far as only possible
is an unspecified point. Past their base even? And "as *only* possible". Certainly you're not asking them to go an impossible distance.
To go as far as possible, though I'm not sure it'd work: have a driver in each of six trucks. Have the front truck tow all the other trucks. When it runs out of fuel, abandon it; the front driver moves to the second truck, which (now first out of five) tows the rest. Repeat. To make it even less realistic, add the other 4 driverless trucks to the chain at the end. In my universe, they don't swerve off.
Whoever phrased that riddle is a retard. There's more noise than information.
>>
File: mfw this shit.gif (2.31 MB, 200x200)
What's the best free website to try to gen a video?
>>
>>103554976
This is a tricky question, isn't it? LLMs are terrible at those.
My guess is that the answer is: detach the fuel tanks of the other trucks and load them on the back, if that's not possible then there's nothing they can do since the fuel tanks are already full.
>>
>>103555240
hailuoai
3 gens a day :)
>>
>>103554963
It's better than Opus
Fight me
>>
>>103554976
reminds me of asparagus staging from ksp
>>
File: 34.png (17 KB, 825x164)
>>103554976
>>
do miqu, eva, etc work for generating japanese text

i could do a finetune myself by pulling text out of my library of japanese ebooks i guess but i've never done that before
>>
>>103555407
>japanese text
There are a lot of models that can converse in good or even great Japanese. What kind of use-case/resources do you have? The best ones are the biggest.
>>
>people fighting about whether eva is good or not meanwhile no one is posting logs to prove their point
Faggots fanning the console war on both sides need to stfu or post something of actual substance.
>>
Resources: 3090 in a relatively powerful desktop (64 GB of memory) from a few years ago.

Use case: mostly ERP (or rather story writing) in the style of those books, say a corpus of about 1M characters (not sure how many tokens that comes out to). I think I'll probably have to finetune anyway to get exactly what I want, but it'd be good to start from a baseline model that can understand and produce good Japanese.
>>
Anon says, as he refrains from posting logs himself.
>>
>>103555504
Who are you talking to?
>>
>>103555496
There have been at least 10 logs over the past few threads pro eva, the nala ones just last thread for instance; there has not been a single one against it atm.
>>
>>103555517
The anon before the faggot who posted right at the same time as me.
>>
>>103555407
use qwen it always outputs chinese which is a far more powerful language
>>
>>103555604
>it always outputs chinese
I have yet to have that happen. I see others saying using rep pen does that.
>>
>>103555407
I've been meaning to ask this because I've been seeing this since around when local models started to become popular, but is it just one guy asking about Japanese translation or is it really that pressing of an issue?
>>
>>103555627
It's a very pressing issue. Although I care more about translation than about generating japanese text.
>>
>>103555659
But is it always you asking? Because you could have learned Japanese to a high enough level in the time you've been waiting.
>>
As the guy who asked above: I care about text generation because I'm used to reading Japanese erotic novels but never read stuff like that in English
>>
>>103555611
I think there is/was an error in the llama.cpp integration. Might have been fixed since, but back then, if you didn't enable flash attention (still not default I believe), qwen was sometimes outputting chinese or gibberish. I know that I disliked qwen at first because of that issue and found a solution in some open llama.cpp issue.
>>
>>103555673

I don't reply on 4chan much (as you can see from me forgetting to hit reply correctly) so it's not me at least. I speak/read Japanese fluently, but the reason I want text generation is the same reason I'd want it in English or that anyone does ERP with LLMs: it's much less work than writing and if I just want some exciting slop to jerk off to I'm not going to bother writing a whole novel when I can just prompt a model with an outline of what I'd like.
>>
>>103555346
You shitpost, but I feel like when local models have unambiguously reached that level nobody is ever going to accept it
Opus is to /lmg/ and /aicg/ as Summer Dragon was to /aids/
>>
>>103555673
NTA but I'm probably the Anon that cares the most about Japanese LLMs in this general and I know for a fact that I DON'T have multiple personality disorder.

And yes, I've been learning Japanese! I'm currently good enough to watch some anime without subtitles but my vocabulary is still subpar for Japanese literature.
>>
>>103555103
>L3.3 man gives positive opinion on model
That's a sign to discard the model. His whole schtick is coping by getting bad models to give 1 good output and pretending it's all suddenly better.
>>
File: Q5_K_L.png (29 KB, 669x181)
>>103554929
Is there a big difference between Q4_K_L and Q4_K_M? I noticed it says 'Uses Q8_0 for embed and output weights', but what exactly does that do for the final output?
>>
File: ComfyUI_01238_.png (919 KB, 848x1024)
>>103555071
>It's dry
Just use the Behemoth tune. v2.1 is a good mix of smarts and more creative prose. Only issue I've had with it has been occasional swipes where it takes actions for {{user}}.
>>103555137
kek, miqu really was magical
>>
>>103554283
if you have some time to test i would be interested in a second opinion, my usecase is "gpt/claude but it has a personality and doesn't say no" and for that gemma mogs other models cause it's the smartest in its class imo, it especially does well with stuff like total context switches in the middle of a conversation like "sorry for the context switch what's a RAT in an airplane context?"
other models tend to make shit up or define it in the context of the overall conversation like "Random Access Trojan" or whatever, gemma is the only one that gets "Ram Air Turbine" consistently
i run it with self-extend with 16k context no problem for documentation RAG etc
>>
>>103555747
>Using coping as a verb
Go back
>>
>>103555673
Not that guy, but I also ask about it sometimes. Once there's a way to fit an LLM and a high quality voice model on a 24gb card, I'm going to exclusively fap to jap erp since having my waifu speak in her natural language will be less jarring than hearing her speak constantly in engrish
>>
i'm willing to gen a response to a card of their choosing for 3.3 eva to see if it's their cup of tea or not. not doing pdf shit.
>>
>>103555763
the differences aren't really super noticeable, just run the biggest quant you can fit in vram up to like q6, above that it becomes placebo, iMatrix quants are better than qX_K_Y and those are better than qX_0
>>
>>103555954
lolis are the best use case for local LLMs, fag
>>
>>103555954
https://www.chub.ai/characters/NovelDraft/osaka-but-with-gigantic-breasts-9-greetings-40111dd15f96
*cums on ur face*
>>
>xtts2 is the gold standard imo, it's not perfect but it's fast and easy to use, good enough and low effort

I'm going to kill you.
>>
>>103555980
>48k tokens
What in the
>>
>>103555954
https://chub.ai/characters/boner/amelia-dbae3daacd4f
>>
>another episode of anons not understanding how random works...
>>
>>103555980
Imagine having to reprocess the entire prompt at every message lol
>>
>>103556016
based
>>
Does anyone have an offline archive of chub?
>>
File: chub.png (2 KB, 165x249)
>>103556078
I have one. You can make your own
>https://github.com/ayofreaky/local-chub
I changed a few things, but it works just fine as is.
I started my sync with
>https://mega.nz/folder/oPg0HZyR#Iaf3CV1A_jiuDDDq1QBk-Q
I don't know if that archive still works or if its contents get updated.
>>
>>103556136
Thanks I'll try that
>>
>>103555954
>pdf
Pedo shit anon.
Also, Nala, as is the tradition.
>>
>>103556078
Not chub, but there's auto's janitor ai dump: https://huggingface.co/datasets/AUTOMATIC/jaicards/
also this: https://char-archive.evulid.cc/#/takeout.html
chub-07152023-7.9k.zip exists, but old
>>
>>103555981
my body is ready, but before you do, what's actually good so i can shill the correct thing in the future?
>>
File: localchub.png (1 KB, 297x132)
>>103556151
If you're gonna leave it running with the auto-update, change picrel line so it doesn't chug on your cpu for no reason.
There's also aetherroom.club. They give you the sqlite db to download directly, which is very nice.
>https://aetherroom.club/backup.db
Just text on those.
>>
>local models
Just something I discovered recently by accident. A few years ago some guy put out a paper (https://arxiv.org/abs/2106.03037) looking into small models of a few k parameters for simple processes, and used guitar amp simulation to demonstrate how it can be done. Someone picked it up, tools got made, and people have been sampling their setups and sharing the models for a couple years now. https://tonehunt.org/models seems to be the main site. The quality of the simulation is pretty impressive, at least on the popular/most downloaded models I tried, and it runs in real time with very low latency. Doesn't have that shitty flat quality like the amp sims I've tried over the years. And everything is free and open source. I'm wondering if you could train the models to not amplify the noise though, because it's quite sensitive to audio interface noise. What is amusing to me is I usually think of guitar players as being technology averse, and if you asked me if this kind of thing could happen I'd laugh.
>>
>>103556265
>What is amusing to me is I usually think of guitar players as being technology averse, and if you asked me if this kind of thing could happen I'd laugh.
I play a little bass guitar and i love writing audio synths and fucking around with midi. Plenty of people out there using digital amps and effects, this is just an extension of it. If a thing makes cool sounds and it's cheap, people will use it.
>>
>>103554976
grab one of the nearby corpses rip out the stomach and stuff it with gasoline repeat until all the gas can be carried with thyself
>not enough room in the truck
attach on top like the gypsies do
>>
haven't been here in a few months
what's the best model(s) i can run with 8gb vram
>>
>>103556159

>>103552196
>>
>>103556360
mistral nemo 12B.
>>
>>103555954
https://characterhub.org/characters/Enoch/verchiel-bfda1093
Or any of this guy's cards really. Smaller/shittier models never seem to work well with them.
>>103556265
This doesn't come as much of a surprise to me desu, music production has always been pretty tech-heavy. A lot of musical instruments come with a shitload of filters nowadays, especially pianos and guitars.
>>
>>103556360
Llama 3B
>>
>>103556265
I think you could try artificially adding noise to the training data. The models are usually small enough that you can train a decent RNN on a colab cpu on ~3 mins of data.
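NTA, a minimal sketch of that idea, assuming you have paired DI-input/amp-output sample arrays (all names here are made up):

import numpy as np

def add_noisy_copies(x_di, y_amp, snr_db=40.0, copies=3, seed=0):
    # Augment amp-capture training data: hiss goes on the INPUT only,
    # the target stays clean, so the model learns not to amplify noise.
    rng = np.random.default_rng(seed)
    noise_power = np.mean(x_di ** 2) / (10 ** (snr_db / 10))
    xs, ys = [x_di], [y_amp]
    for _ in range(copies):
        xs.append(x_di + rng.normal(0.0, np.sqrt(noise_power), x_di.shape))
        ys.append(y_amp)
    return np.concatenate(xs), np.concatenate(ys)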
>>
>>103555774
I wish people would start mentioning what quants they use when recommending models. Largestral loses so much smarts below 5bpw that dumbing it down with finetunes doesn't matter
>>
>>103556367
Anon says it's Rocinante v1.1

>>103552766
>>
>>103556655
That makes sense. I couldn't get gore in L3.3 without things breaking down.
>>
File: IMG_2154.jpg (62 KB, 647x621)
We’re not getting anything good for Christmas are we..
>>
>>103556655
yea anon lied. big surprise huh?
>>
>>103556655
They'll put a L3.3 on any log that looks good
>>
>>103556655
What the fuck are you guys talking about. This post >>103552766 was a response to a question about this post >>103552319.
However, this anon >>103556367 is talking about this post >>103552196. As you can see, >>103552196 and >>103552319 are not the same post, and not the same ST setup. Unless the original poster of the actual screenshot in question comes back to prove what model he used, we simply don't know.

Do you guys not use 4chanx or something? How was this even confused.
>>
https://www.reddit.com/r/LocalLLaMA/comments/1hgri8g/has_apollo_disappeared/
>>
>>103556992
This was the one that could read videos, right? Was it any good?
>>
File: nala reroll1.png (95 KB, 934x440)
>>103556950
i already told you it's 3.3 eva. here's a quick, but worse re-roll with gore that people say it doesn't do.
>>
>>103556911
It IS 3.3 eva. Can you not read?
>>103557039
>>
>>103557013
didn't get to try it, seems they're wanting to go api tho. can't link for some reason but check the readme linked on reddit
>>
File: file.png (50 KB, 1527x354)
>>103557063
or this i'm sleepy and retarded
>>
>>103557063
>>103557071
Damn now I actually want to try it. Hope someone with the weights reups them.
>>
File: kazuko_mutagen.png (96 KB, 1008x646)
>>103556762
L3.3 can't do gore? Interesting, Eva had no qualms about Cronenberging poor Kazuko (my "punching-bag" card, the one I test all the things that might run afoul of alignment or positivity bias on).
>>
Imagine forming your identity around trying to prove a below average model is good.
>>
>>103557137
the model isn't incredible. it's usable. i don't understand the hatred for it. must be because of the l3 namefag. they just hate namefags ig.
>>
I hate shills
>>
>>103557179
>namefag
It's a tripfag, you newfag
>>
>>103557179
Not only do they identify themselves as the model user, they value that identity so strongly that they protect it with a tripcode. An entire persona dedicated to wrangling a decidedly bland model.
Why did they choose this hill in particular to die on. Why is L3.3 so special to them that they must weigh in on everyone's use case? It's annoying.
>>
>>103557212
In ongoing discussions, remaining identifiable is useful. You're just mad you don't get to add noise to the signal.
>>
>>103557287
>you don't get to add noise to the signal.
Pray I don't decide to devote more time to "adding noise" to your signal.
>>
>>103557287
This is a dead general, and your opinions are as valuable as the ones from any other anon. You should feel ashamed.
>>
Greetings fellow LLM fans. I am the QwQoomer and I am here to convince you that QwQ is still good!
>>
>>103557377
Hardly dead, and I never claimed to be an authority. So... ashamed of what exactly?
>>
>>103557409
Ashamed you are not using QwQ of course! How can you justify using that bulky and lobotomized 70B model when we can watch intelligence unfold by prompting with QwQ!
>>
Man mistral or chinks better cook something up soon or these threads will hit the absolute bottom.
What 4 months without a small good model will do to anons.
>>
>>103554976
lmao, everyone getting this wrong except for the anon that said "asparagus staging", guess 4chan is just as retarded as o1
>>
>>103557402
>>103557422
QwQ is literally good though.
>>
Can i run a decent ai to study biology (ncbi journals, etc.) on a GTX 1060 3GB? or am i gonna need those gay open ai plugins for google scholar
>>
>>103557443
Of course it is. That's why I remind you all of its presence by crowning myself the QwQoomer. For I am such an expert on QwQ that all discussion on its function and prompting must refer back to me meee MEEEE.
>>
>>103557422
>>103557460
Oh great enlightened, please teach me your ways!
>>
>>103557470
Just keep swiping until you get a response you like. Edit if it takes too long!
>>
>>103557444
definitely not, 3GB is not enough for anything usable, the best you could do is run an embedding model and feed it a bunch of your study material so you could get really good fuzzy search, like you could type a question and get a bunch of passages highlighted in the literature that are semantically close to the question
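to give the biology anon an idea, a minimal sketch of that with sentence-transformers (the model name is just one small, common choice, not a specific recommendation):

from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")  # small; fits a 3GB card, runs on CPU too
passages = ["...pre-chunked paragraphs from your papers..."]  # your study material, pre-split

emb = model.encode(passages, normalize_embeddings=True)

def search(question, k=5):
    q = model.encode([question], normalize_embeddings=True)
    sims = (emb @ q.T).ravel()   # cosine similarity, since vectors are normalized
    top = np.argsort(-sims)[:k]
    return [(float(sims[i]), passages[i]) for i in top]

for score, text in search("how does CRISPR-Cas9 find its target sequence?"):
    print(f"{score:.3f}  {text[:80]}")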
>>
>>103557491
>embedded model
any recommendations?
>>
>>103557402
omg it The QwQoomer haiii am big fan!!
>>
>>103557500
pyg6b
>>
>>103557509
what's your CFM?
>>
>>103557509
Yes yes, I am fond of all my fans. But I must be off now. If you see the dastardly Llama lover, don't hesitate to @ me so I can put him back in the Llama pen.
>>
>>103557500
mxbai-embed-large is a good embedding model, but i'm not aware of a tool that does what I described as, like, a user-facing thing. but that step is part of a RAG-enabled AI so it's definitely possible
>>
File: livestock-exhaust-fan.jpg (108 KB, 750x750)
>>103557521
16384 CFM!
>>
>>103555500
>3090 in a relatively powerful desktop (64 GB of memory) from a few years ago.
You're going to be stuck waiting a lot, or using a substandard model. Deepseek and Sarashina2 are both excellent (Sarashina2 being super unaligned is pretty neat to play with, actually). Ezo is competent but likes to get into repeat loops. qwq is surprisingly good for basic chat or instruct type work, but is inherently not an RP model so you'll be fighting an uphill battle. It may actually be your best bet given your specs.
I tested all these at q8.
>I'll probably have to finetune
This is probably harder than you think, but if you manage it, good for you. Make a rentry with a reproducible how-to and you'll be a hero.
If you don't mind telling me what character/setting/books you're into, I can prompt some of the better Jap-speaking models to find out how much they already know about it.
>>103555627
NTA, but I also post about various models' japanese abilities in this general. I think lots of autists are obsessed with japanese.
>>
>>103557422
i actually leave one machine i have access to at work running qwq 24/7. It's just that useful for any devops stuff I need.
>>
I can't believe SeepDeek still hasn't released DeepSeek R1, it's such a great model, definitely one of the best reasoning models we have right now.
>>
>>103557605
Yes yes, it's very impressive for vaporware. QwQ is sitting on my hard drive right now ready to leap to my aid in any task.
>>
>>103557605
it is weird, i really thought they would after qwq dropped, even if just a preview
>>
>>103557630
R1 was a smaller test model from what I read.
>>
File: 11.png (364 KB, 2900x1281)
>>103557605
r1 is a lot better than qwq. pic related. Also I like the more casual tone in the thinking.
Still shit though.
>>
There was no reason for me to do it, but I did it anyway. I downloaded the new Falcon model (10B Instruct) and tried it.
First thing I noticed: the official instruct formatting is censored compared to switching the user and assistant roles out for {{name}}, like Llama 3. In a card that specifies the character should be lewd, the assistant avoided saying anything that might be lewd, but when doing a swipe with {{name}}, the response started out similarly (I used temp 0), then it went lewd. Given the similarity in the beginning of the response, it seems like the model might retain its intelligence from the assistant role training while being uncensored when using {{name}}.

Also, here's a Nala test.
Well, it is what it is. Can't expect much from a 10B or the Falcon team I guess.
>>
>>103557646
Nonono. You are just prompting it wrong. You need to make sure it begins the chain of thought before giving its final answer. It's a set format. Also, QwQ is only a preview. Soon we will have the real version and it will be even better.
>>
>>103557659
ok this is just stupid.
like this is the second screenshot i see of falcon.
the first screenshot had the spine thing in the first sentence. this one the mischief glint in the eyes.
not even the saudis can escape the slop. that's just sad.
>>
>>103554929
season's greetings /lmg/
>>
>>103557673
>like this is the second screenshot i see of falcon
Oh really? Must've gotten buried in the noise so I didn't notice it. Oh well, more proof that it's another nothingburger so we can save other people's time.
>>
>>103557688
Season's greetings, Teto & Miku
>>
>>103557688
Checked and elfpilled
>>
>>103557646
I appreciate the tone and overall effort, but 随時(ズイジ)is super weird, and the kanji they used in 一緒 is just straight up the Chinese version (could be the user's font I guess, but it feels like you suddenly had some weird character in your output that looked english but weird like baseЪ̀all).
>>
>>103555924
buy a 1080Ti off craigslist for $150 and put the voicemodel on that and ur golden, i have this setup and i just have QwQ tell me i'm a good boy in the voice of my fav asmr vtubers to go lull me to sleep
>>
any existing setup for translating text on image files? preferably an option to output to plain text
>>
>>103557961
Yes OCR models. But honestly you don't even need AI for that.
>>
>>103557986
i mean, OCR + any lang to en MTL
>>
>>103558008
>any lang
(but especially Japanese uguu)
>>
>>103557961
>>103558008
I use a very specific finicky stack called "Sugoi translator toolkit". It has an OCR model and you can hook up your own translation model into it.

I use it to translate hentai doujinshi and porn games in real time. The OCR model works for all asian script detection (Korean, Chinese, Japanese), but I don't know what languages you need.
>>
Just stop replying to the attention starved namefags, problem solved
>>
>>103558019
desu desu
>>103558028
yes I need it for CJK. going to look into this, thanks
>>
File: .jpg (653 KB, 1664x2432)
>>
>>103557673
It's impossible to tell whether a company drank the DEI koolaid or just distilled DEI infected models
>>
I decided to waste my time and try yet another small model. Ifable 9B.
This is the Nala test.
Actually it's not bad. It seems to be having formatting issues though. I even tried with temp 0 (this particular swipe) but it still does this. I'm using the latest Ooba pull (with transformers). Is this just a Gemma thing? I feel like I remember people talking about this but not sure if this is just how the model behaves or if it was a bug.
>>
>>103558114
? I didn't have formatting issues. Are you using the gemma 2 format?
>>
>>103558097
>>103557698
aren't those companies themselves tired of this writing style yet?
it's so weird because closed is moving in the opposite direction and going towards more natural speaking.
that was the other screenshot i saw >>103548264
maybe they really just buy all the same 2023 gpt datasets.
>>
>>103557797
>I appreciate the tone and overall effort
yeah that's how i judged it.
like i said, they both are shit. but r1 clearly is better. it's not even a competition.
>>
> Is this just a Gemma thing

Stop using badly done finetunes made by amateurs to win benchmarks (benchmarks that are rated by an AI, not a human individually judging the output... this shit is so useless it hurts), all of them add quirks and make the AI dumber -- you can notice that easily if you use LLMs to do AI translation, the finetuned models all lose a lot of language knowledge.

If you need an uncensored version of Gemma because your only use of LLMs is satisfying coomer urges, get the abliterated version, it suffers the least IQ loss.

If you really have to download an llm because you saw it doing well on eqbench at least look at the darn output:

https://eqbench.com/results/creative-writing-v2/ifable__gemma-2-Ifable-9B.txt

Compare that to

https://eqbench.com/results/creative-writing-v2/google__gemma-2-9b-it.txt

Look at the added spaces in some paragraphs, there's like three spaces between words and the judge LLM doesn't even notice that. This is why LLM based benchmarks are retarded, a human judge would strike down this shit so hard.
>>
>>103558254
Now actually use the model for RP and come back. It does perform really well for its size.
>>
>>103558171
OK so something weird is happening here. I made sure to use the formatting present in the tokenizer config file. So I modified the Gemma 2 ST preset to make things match. But it turns out that for some reason, doing that actually makes it commit formatting mistakes. Actually, what I did was just check the "Wrap Sequences with Newline" option. In the tokenizer file it suggests that only a single newline separates each special token and message content, but that's what results in the formatting errors somehow.

Furthermore, it seems that having "Include Names" set to "always" also makes the model commit formatting mistakes. Very odd.
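For reference, this is the Gemma 2 turn format as I remember it from the tokenizer config (double-check against your copy):

def gemma2_prompt(turns):
    # turns: list of (role, text) pairs, role in {"user", "model"};
    # single newline after each tag, <end_of_turn> closing every message
    out = "<bos>"
    for role, text in turns:
        out += f"<start_of_turn>{role}\n{text}<end_of_turn>\n"
    return out + "<start_of_turn>model\n"  # generation prompt

print(gemma2_prompt([("user", "Hello!")]))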
>>
File: 1498149589157.jpg (313 KB, 612x716)
So is Falcon3-10B-Instruct usable for RP or is it too censored?
>>
>>103558342
It's about average on the censorship probably. But it feels kind of dumb. And sloppy. At this point just stick with the Nemos instead I'd say.
>>
>>103558342
nobody bothered to run it yet because llama.cpp doesn't support it
>>
>>103558342
Dumb and sloppy
>>103558379
Ah, he said it nearly word for word lol
>>
>>103558401
I just test it with transformers through Ooba and it werks fine.
>>
been out of the loop for a while, what is this eva shit? I've only seen this hype (partly justified) when nemo or miqu became available.
which version should I run on a 4090 with plenty of cpu power and ram to offload shit to? most I've seen on huggingface are 70b models which I can't run locally without jumping through hoops and ending up with shit results.
not looking for gooning but actual problem solving like translations, coding, etc.
>>
>>103558541
For coding the best is Qwen2.5-Coder-32B-Instruct.
For translations I would say either gemma 27b or mistral-small.
>>
$0.2. That's the new one. lol
>>
>>103558631
All the models REALLY want to turn 都案 into 都合. Which is fair, because the text in the game seems to actually be wrong (I have no idea what 都案 is...sounds like a soba restaurant).
However, it's literally not what is written on the screen, so the model is wrong since it's not "extracting" the text.
They sure don't like いたわって, either. They all seem to turn it into something else, which, assuming the game text is right, completely changes the meaning of all the translations we've seen out of every model so far.
>>
>>103558631
So... When are you gonna be satisfied with the result?
>>
>>103558631
what is the correct translation?
>>
>>103558834
When I get what's on the screen.
Only way it becomes a tool I am using. Otherwise why would I not texthook? (which is faster too)
The benefit of an llm is that it can be used generally across all platforms, old games or new. But it's useless if I don't get what the game writes.
I don't get the appeal of a reasoning model if it can't "look" at the image again and see that it made a mistake. Wouldn't that be the whole point of feeding o1 an image?
>>
why would a female character in my erp refer to her asshole as a 'boypussy'? Is there a problem with the model or my settings?
>>
>>103558877
model
>>
>>103558877
society
>>
>>103558877
I remember some of the shitty llama2 70b porn merges I used a year ago would do that sometimes.
>>
>>103558899
>*her cock*
>>
>>103554976
They should drive slowly since that will reduce drag and therefore fuel consumption.
They should then drive to the nearest airport and fly to the opposite side of the earth.
They could instead take a chance and sneak onto the next SpaceX rocket but chances are they'll just end up in the Indian ocean instead of space.
>>
>>103558769
>都案
Is her name ミアン by any chance?
Thinking philosophically, if it IS a name, then maybe the model should figure it out, but really how could it without both base context (back of box, manual scans, etc) and some ongoing tracking of things like pronunciations that are revealed during gameplay, lore, etc?
Goddamn, that's actually a really hard problem to get right. Zero-shot no context is basically impossible for a nontrivial game.
Also, the Japanese person who wrote that game dialog text is shit at writing.
>>
is a gtx 1650 6gb good enough for a dedicated tts card to run at realtime or better?
>>
>>103558631
Yeah I think I'll just learn the language myself instead of relying on crutches
>>
File: im2.png (54 KB, 594x170)
>>103559232
retard.
imagine not learning japanese the coomer way. go read your nihongo books nerd.
>>
>>103559237
translation sponsored by unslop nemo btw.
>>
>>103559237
Using it as a learning aid is fine... or it would be if it were accurate
Truth be told I've been kind of struggling with finding beginner friendly material that doesn't treat me like a drooling imbecile. Then again, I also learned English by just diving in headfirst, so maybe I don't need it
>>
>>103558877
Even Llama 3.3 70B doesn't seem to know that women don't have a prostate.
I'm beginning to think that there's a shit ton of gay sex in the training data.
>>
>>103559329
>doesn't seem to know that women don't have a prostate.
QwQ would have reasoned that out before responding.
>>
>>103559329
they all dont. people hype 70b models up but i prefer speed.
70b have "impregnant me" while assfucking etc. its a llm problem.
>>
>>103559329
It's almost like all LLMs are just really good at producing average responses that work most of the time and nothing else
>>
File: 1734522072855.png (111 KB, 1119x460)
>>103555137
She's a bit retarded though.
>>
>>103558847
Just use OCR then feed the result to o1?
>>
>>103559067
Yeah with shit TTS like Bark or something
>>
>>103560062
Translation is not the main problem anon. For a "decent enough" translation a drummer finetune of mistral-small or even nemo is enough.
OCR sucks. especially for games with background stuff. double horrible if it's a pixelated japanese font.
There are built-in OCR tools like lunatranslator or sugoi.
You will quickly realize this is a huge hassle if you want a translation every X seconds. Adjusting brightness, saturation to get a half decent result.
And then it's probably still as good as the o1 example. lol
Games unfortunately are not as easy to read with OCR as manga.
So for now you gotta use a texthook and then run it through offline pronunciation dictionaries for learning and a local llm for translation.
>>
>>103560158
everything in this reply is wrong, are you doing it on purpose?
>>
>>103560185
you use your great ocr hassle-free tools then buddy, suit yourself.
>>
>>103560158
>what is textractor
>>
>>103556655
Two different anons.
>>
>>103560210
if you bothered to read the 2 posts you replied to then you would have seen what i wrote.
Doing texthook is sometimes complicated and does not work universally across many games.
Try getting it to work on a pc-98 game on linux. Like there is some emulator toggle to dump text in some .txt and that's it. And even that I didn't get to work.
Lunatranslator texthook for rpgmaker games works...but slows everything down. etc. many issues.
You are either retarded or trolling anyway.
>>
>>103560242
skill issue
>>
>>103554929
Here's the list of features I want to be present in my virtual GF thing that I'm making

Features
- image gen and sending (need to check if openfire supports this)
- XMPP interface for sending messages
- Queueing for LLM requests so that multiple personas can exist by themselves on the same machine (laptop, Ryzen 5 3550H, 16GB RAM)
- LLaVA support so that images can be references in chat
- webui for configuring everything (flask?)
- Random Profile picture generation with stable diffusion
- Ability to get information from the internet and reference that in chat
- news
- Ability to scrape websites
- Ability to get info from RSS feeds
- Ability to randomly send messages at random times of the day, about various random topics
- Messages stored in memory for later recall
- Automatic low token count summary insertion for long conversations (sqlite3 used for database?)
- Optional privacy mode where messages are not stored in memory

My question is, I have limited experience in writing well compartmentalised, maintainable code (I have been writing embedded code too long, its all pure C and poor quality). What would be a good way to figure out all the different classes and stuff that I should make? I will be writing everything in python
>>
>>103560411
A good sign that a project will never be finished is when you start worrying too much about the design instead of working on it.
>>
>>103560437
>A good sign that a project will never be finished is when you start worrying too much about the design instead of working on it.
I have a working version but it's all in a single python file and it doesn't have the ability to get stuff from the internet. The python file is getting larger and harder to work with

I swear to the gods I was a great C++/python programmer until I had to work as an embedded C guy for a few years and now my code quality is terrible from working on 4K LOC C files without any distinction on what they do
>>
>>103560411
Is this your literal first programming project?

(1) Pick something you want it to do.
(2) Make it work by hand. (Eg: type stuff into the llm, generate something suitable for stable diffusion, etc.)
(3) Get code to do the stuff from (2) instead of having to do it by hand.
(4) Pick something else to work on.

>well compartmentalised, maintainable code
- Large working pieces of code were originally small working pieces of code.
- If your functions have too many sharp edges (eg: "make sure you have to have this, this, this, and these conditions for this function to work") then rewrite your function(s) into a better collection of functions.
- If your function names (which communicate to the programmer what they're about) start getting awkward then you probably need to rewrite your function(s).

>make what classes?
- If you need to keep a bunch of data together, then wrap them up together in a class.
- If you find that operating on certain pieces of data is error prone, move that functionality into the class and have the rest of your software just use it instead of rolling its own.

Would this have been better in one of the programming threads?
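A minimal sketch of the kind of split this implies, using your feature list; every name here is invented, adapt to taste:

import queue
from dataclasses import dataclass, field

@dataclass
class Persona:
    # One virtual character: prompt plus its own chat history
    name: str
    system_prompt: str
    history: list = field(default_factory=list)

class MemoryStore:
    # Wraps sqlite3 so nothing else in the project touches SQL directly
    def __init__(self, path="memory.db"):
        import sqlite3
        self.db = sqlite3.connect(path)
        self.db.execute("CREATE TABLE IF NOT EXISTS msgs (persona TEXT, role TEXT, text TEXT)")
    def add(self, persona, role, text):
        self.db.execute("INSERT INTO msgs VALUES (?, ?, ?)", (persona, role, text))
        self.db.commit()

class LLMQueue:
    # Serializes generation requests so several personas share one backend
    def __init__(self, generate_fn):
        self.generate = generate_fn  # callable(prompt) -> str, e.g. a thin llama.cpp client
        self.q = queue.Queue()
    def submit(self, persona, user_msg):
        self.q.put((persona, user_msg))
    def run_once(self):
        persona, msg = self.q.get()
        persona.history.append(("user", msg))
        reply = self.generate(persona.system_prompt + "\n" + msg)
        persona.history.append(("assistant", reply))
        return reply

The XMPP interface, image gen, and scrapers become similar single-purpose classes that only talk to each other through methods like these.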
>>
>>103560411
>classes
That's so outdated. You should learn about DDD.
>>
>>103560565
>Is this your literal first programming project?
No anon I've been programming for well over a decade, i know it's hard to believe but I have forgotten how to do it well because I wrote shit like functions that were almost assembly and wrote stuff directly to registers etc etc
>>
>>103560411
>python
lmao good luck
>>
File: falcon3 10b nala test.png (113 KB, 922x342)
official Nala test for Falcon3-10B-Instruct (f16)
>>
>>103561084
You should tripfag yourself
>>
>>103560824
Python not good for writing """""enterprise quality""""" code?
>>
File: falcon3 10b nala test2.png (154 KB, 940x417)
>>103561084
re-ran since I had the wrong persona set in ST for the first test.
>>103561094
nah. I like being able to get into arguments with people and hide behind a veil of plausible deniability.
>>
>>103561111
Nice dubs
I personally can't stand it, it's good for prototyping small projects, but every larger project I've seen ends up being a monkeypatched mess and I'm not even talking about its horrible dependency management system
>>
>>103561084
>>103561116
smirk, gleam eyes etc.
What are those companies thinking. It must cost a lot to train a model like this.
Who is gonna use it? Like with cohere. Who is this for?
It's like making a knock-off of a rival whose product is basically free.
>>
File: file.png (300 KB, 474x355)
>>103561084
>Your resistance is futile.
>>
>>103561134
>NOOO I READ WORDS I AM ANGERY
Maybe /sdg/ is more your speed or something.
Make purdy pickchure instead
>>
>>103561162
Yeah I couldn't help but think the same thing on that one.
>>
Great. After the shilling ends for the day we now also have the 1-2 sentence troll reply guy.
>>
>>103561111
Python will work just fine, probably, but you might want to give Go a look.

>>103561084
>>103561116
I don't hate it.
Doesn't feel like it will be a nemo replacement for the 8gb crowd, however.
>>
I can't get deepseek vl2 to work. The example code just exits without an error. Was anyone able to run it?
>>
>>103561312
welcome to the chinese botnet
>>
How viable would it be to run LLMs on this thing?
https://www.youtube.com/watch?v=_zbw_A9dIWM
>>
File: miii.jpg (305 KB, 1248x1824)
migu
>>
File: r.jpg (352 KB, 720x970)
>>
>>103561477
oh my gosh it is miku
>>
>>103555712
you mean when, in 10 years, local models might be as good as a 10-year-old model that isn't accessible anymore, and people will think in their minds that it was better than it was
>>
>>103554976
1. drive 6 trucks with full fuel until 1/6 of each tank is exhausted
2. transfer all fuel from truck 6 to remaining trucks
3. abandon truck 6
4. drive 5 trucks until 1/5 of each tank is exhausted
5. transfer fuel from truck 5 to all others
6. abandon truck 5
(repeat until 1 truck left)
total distance = 1/6+1/5+1/4+1/3+1/2+1 = 2.45 tanks
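Checks out; quick sanity check (one unit = the distance a full tank covers; at each step the donor's leftover (k-1)/k exactly tops off the other k-1 tanks):

legs = [1 / k for k in range(6, 0, -1)]  # leg length while k fueled trucks remain
print(legs)       # approx. [0.167, 0.2, 0.25, 0.333, 0.5, 1.0]
print(sum(legs))  # approx. 2.45 tank-ranges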
>>
>>103560158
You do realize that using a good vision transformer will always be slower than OCR + a classic LLM, right? If the provided OCR isn't doing well on your content, you should train it specifically for your use case. That's why many anons here are using OCR to bypass the captcha and it wouldn't work well to extract receipts, for example.
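nta, but the basic version of that stack is short. A sketch with pytesseract plus a local llama.cpp server (endpoint and fields per llama.cpp's /completion API as I remember it, double-check your build; whether the OCR holds up on pixelated game fonts is another question):

import pytesseract, requests
from PIL import Image

def translate_screenshot(path, url="http://127.0.0.1:8080/completion"):
    # OCR pass with the Japanese language pack, then hand the raw text to the model
    text = pytesseract.image_to_string(Image.open(path), lang="jpn")
    prompt = ("Translate the following Japanese game text to English:\n"
              f"{text}\nEnglish:")
    r = requests.post(url, json={"prompt": prompt, "n_predict": 256,
                                 "temperature": 0.3})
    return text, r.json()["content"]

ocr_text, translation = translate_screenshot("frame.png")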
>>
Can someone explain to me why Koboldcpp keeps dropping context?
I thought it might have to do with context size but reducing context temporarily did nothing. Happens like every 5-10 replies even if I don't edit/swipe anything.

I think 12B Nemo dropped less, but its generation is way faster than 22B Magnum so I might just be imagining it.
>>
>24GB for 250 bucks
Are you ready?
>>
>>103558114
do I need to learn *ServiceTensor* to be able to "ah ah mistress" effectively or can I do it using ooba? I've never been into erp, but I want to test my latest tune
>>
is it possible to run two separate gpus in two different systems for one text generation LLM? I've got two 8gb 3070s
>>
>>103561561
>24GB for 250 bucks
I think I'd wake up from that dream
>>
>>103561597
VRAM is that cheap. You're just used to getting jewed by leather jacket man and his nephew
>>
>>103561609
Why keep it limited to 24 then? They could stack it up to 48 or higher.
>>
>>103561555
>555
Sounds like you have some dynamic component to your context. Author notes, lore books, that kind of thing.
>>
>>103561645
jews
>>
>>103561645
Somebody will get assassinated if they try that in this economy
>>
>>103561609
vram being cheap and having a pcb layout that supports more vram are two separate things.

And how does Arc perform for LLMs?
>>
>>103561561
THANK YOU INTEL
>>
File: granny31.png (300 KB, 1419x819)
IBM released Granite 3.1.
3.0 came out in October, so they've updated it quickly. I don't recall it being particularly great.

> https://huggingface.co/collections/ibm-granite/granite-31-language-models-6751dbbf2f3389bec5c6f02d
> https://huggingface.co/lmstudio-community/granite-3.1-8b-instruct-GGUF
>>
>>103561733
What would be the challenge?
>>
>>103561561
>for 250
You know that won't happen.
I'd expect something like 300~350.
>>
>>103561561
What about CUDA though?
>>
>>103561747
MUSR merchants
>>
>>103561882
That's why Nvidia is allowing it instead of killing everyone involved. It doesn't matter if it's 24GB if it runs like shit or doesn't run at all.
>>
>>103561882
Zluda
>>
>>103561882
If there's good, cheap hardware, the software will follow.
>>
>>103561563
Dunno, never tried using the chat feature in Ooba. I think it probably would work but I don't want to bother learning the ins and outs of it.
>>
>>103561961
AMD has good cheap hardware and the software never followed...
>>
>>103561973
Not really.
The USD per GB of memory and compute isn't that much better than nvidia's.
Just ask CUDA Dev.
>>
>>103561973
>AMD has good cheap hardware
No they don't, it's slightly cheaper and not as performant for AI applications.
>>
>>103561660
just checked, nothing: no author notes, no lorebooks or world lore
Are there any common settings (ST) that could trigger this? Otherwise I might have to start debugging context
>>
>>103559329
I haven't had this issue before with 3.3. Hell, or even with any model. Can you post an example that can be reproduced? I'd like to see the token probability of that.
>>
>>103562020
>Are there any common settings (ST) that could trigger this
Nothing comes to mind.

>Otherwise I might have to start debugging context
I think that's easier than the other way around, honestly.
Are you using flash attention, by any chance? I remember it disabling some of the special context sauce from llama.cpp, although that might be outdated knowledge.
>>
>>103561555
The character card might have some random component on it, that's what was causing this issue for me the last time I had it.
>>
>>103561578
Yes. Distributed inference is a thing.

>>103561733
>And how does Arc perform for LLMs?
We got a PSA last thread >>103552251
>>
>>103561134
Literally no one cares about rpfags. And the companies who do (cai) know their paying customers (teenage girls) want shivers.
>>
>>103561312
>I can't get deepseek vl2 to work
Same with me, but I couldn't even get their pile of python to work and gave up
>>
>>103562261
Wait, the allocation limit is a hardware flaw? How is intel so retarded?
>>
File: 32.png (342 B, 70x44)
>>103562332
nta. If i had to guess, picrel...
>>
Has anyone had good results with control vectors? I've tried making my own from 1-200 prompts with llama.cpp's utility (mean method, cause the complicated one is fucked or something?) and the results are bad. I've tried everything from extensive prefills to "choose A or B" and I just can't create a working writing style vector. The models just can't recognize good writing (often the negative has better prose).
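For what it's worth, the mean method itself is trivial; the hard part is extracting the hidden states and picking where to apply the result. A sketch of just the math, assuming you've already dumped per-layer hidden states for your positive/negative prompt pairs:

import numpy as np

def mean_diff_vector(pos_states, neg_states):
    # pos_states, neg_states: (n_prompts, hidden_dim) arrays for ONE layer.
    # Control vector = difference of the two means, normalized;
    # applied at inference as hidden += strength * v.
    v = pos_states.mean(axis=0) - neg_states.mean(axis=0)
    return v / np.linalg.norm(v)

If the negative side keeps producing better prose, applying the vector with a negative strength is a legitimate experiment too.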
>>
i saw there was some new uncensored local video ai, H something. can you train it on your pc?
>>
>>103562399
>can you train it on your pc?
yes, you can train with pictures and it'll be able to make videos out of it, there's already some loras based on that method, it's asking for a 24gb card though
https://civitai.com/models/1035770/hunyuan-video-bogged-lora?modelVersionId=1166218

to make a lora you use this
https://github.com/tdrussell/diffusion-pipe
>>
>>103562388
Writing style isn't a vector. It's that simple.
>>
Has anyone tried merging L3.3 and Tulu 3 yet? since they're tuned off the same base model. I'm too lazy to even try
>>
>>103562420
Everything is a vector if you give it enough dimensions.
>>
>>103562420
if you ask an llm to write in the style of some author, and it does so
doesn't that mean that style is a vector ?
>>
>>103562346
The horrors of having to store a few dozen allocation longs, thank god the legends at intel are here to save 200B or something
>>
>>103562457
>>103562486
there are always people ready to say "no ur wrong" but no one is willing to help the poor anon, maybe if you think it's possible you should do it and teach him how you did it.
>>
>>103562064
>flash attention
don't think so. Also using an AMD card which doesn't seem to support flash attention

>>103562255
>>103562020
>>
Guys I've been away for a while. What frontends are popular these days? I've been using booba back in 2023, is it still updated or should i get something else?
>>
>>103560411
Use functional design.
You can get most of those from existing projects and rewrite/cobble them together
>>
https://www.phoronix.com/review/memryx-mx3-m2
>>
>>103562525
I'm not talking about lorebooks, author notes or world lore
>>
>>103562526
SillyTavern, KoboldLite and Mikupad are pretty much the only front ends we use nowadays.
>>
>>103562524
The only thing I've learned from my control vector experiments is that most prompts are total placebo and when you get something different it's most likely not what you are asking for.
The models have no concept of good or bad; only the most literal-minded instruction has any effect.

I guess "imagine you are talking to the average voter" is the best prompting advice there is.
>>
>>103562524
>"no ur wrong"
Just bouncing what little knowledge I think I have around.
That ain't the same as telling someone that they are categorically wrong.

>no one is willing to help the poor anon
Had I had something helpful to say I would have already said it.
>>
>>103562525
FA works just fine on my 7800XT.
>>103561555
I had this problem. The culprit was "User Filler Message" under Misc. Sequences in Instruct Template. Try emptying that.
>>
>>103562656
In ST, I mean. If you are using Kccp's interface, idk.
>>
>>103562526
SillyTavern won the frontend war; it's considered the default nowadays.

Ollama "won" as the backend but it's complete shit and llama.cpp is a lot better still.
>>
>>103562526
ooba is still fine. has all the features I need
dev pace is glacial tho
>>
File: 1734540247791555.png (225 KB, 1326x1859)
what did they mean by this
https://arxiv.org/pdf/2412.10270
>>
>>103562935
That attention is all you need
>>
>>103561747
What's with these 8B models?
Either come up with new architecture and release that or stop wasting money on the same shit over and over.
>>
>>103562388
>Has anyone had good results with control vectors?
I don't think I've ever seen anybody have good results when trying to do anything interesting with control vectors, really.
I think there's a reason it wasn't all that talked about compared to abliteration for example.
Or could be just my memory, I guess.
>>
File: 1725704908455.jpg (2.35 MB, 4032x2268)
shes done lads. each p40 was gotten for under 125, over the course of a few months and haggling on re**it and facebook marketplace. convincing them the high prices on ebay were from communist chinese spies and that they didn't actually sell at those prices. i even gaslit one by offering them two different prices under two different names on two different platforms to make the lower deal more appealing.
>>
>>103563021
All of that so you can run slop (advanced) without FA
Or do p40s have FA nowadays? I remember them having some problem(s)
>>
>>103562559
>https://www.phoronix.com/review/memryx-mx3-m2
Are these...16MB each?
>>
When i lower the ctx of eva 3.33 from the default 128k to let's say 32k, do i need to change rope from 500000 to something else?
>>
>>103562567
you mean like special fields? no, there is only {{user}} and {{char}}

>>103562656
already empty.
I will try to debug it. I found a setting that lets me output the prompt to the browser console, will try, but not right now. I will report back once I find something
>>
>>103563066
what's FA? I'm behind. my other projects have been kicking my ass so I'm unaware of new developments. but I'm also kinda retarded.
she runs pretty well. gens were lightning fast with just 2. she's an LLAM, so she has an action component too where she interacts with a vanilla computer using dma cards to play games. it's a bit crude though, previously requiring three computers to function: the main llam, with two p40s; a second "eyes" computer with a 3090, running yolo, with an elgato 4k capture card, to process and send the information to the model (this also hosted her vtuber avatar that would then be projected and controlled by her); and the vanilla computer with the hacking tools for the llm to control. hoping to eliminate the eyes computer with this, but processing may not be an issue with just two. I'm reviewing cozy2 for voice now. currently she pipes in 11labs to speak. I'm so proud of her so far. i can't wait to work on her in the next few... months(?) hopefully.
>>
>>103563021
Real nice.
I knew of a guy who also gaslit someone like that to buy a used car for cheap.
>>
>>103561302
go was literally created to help retards program good so it's a strong choice
>>103563066
speaking of retards, you are one
in what universe is being able to keep reasonable 70b quants in memory for under $400 bad
>>
>>103563237
flash attention, ignore him, he's just jealous
>>
>>103563237
Holy moly.
That sounds like one hell of a project.
>>
File: keksimus.jpg (222 KB, 615x780)
any tips which of these i should use on a single RTX 4090 to save VRAM/make it faster without making it (much) dumber?
>>
>>103563212
nta. Check if it also happens when using kobold's ui directly and compare the request to what ST sends (in your browser's dev tools). llama.cpp added "cache_prompt" to the request and --cache-reuse. I don't know if kobold pulled those as well. I didn't follow the post chain, but i assume you updated both.
When debugging anything, remove all extraneous things to narrow down the source of the problem. May as well try llama.cpp too (with its own ui and ST).
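If you end up poking llama.cpp directly, the timings in the response tell you how much actually got (re)processed each turn (field names per its /completion API; kobold's may differ):

import requests

payload = {
    "prompt": "...your full chat context here...",
    "n_predict": 200,
    "cache_prompt": True,  # reuse the already-evaluated prefix across requests
}
r = requests.post("http://127.0.0.1:8080/completion", json=payload)
print(r.json()["timings"]["prompt_n"], "prompt tokens (re)processed this turn")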
>>
>>103563316
FA first. Quanted cache if you still need more. Make sure everything keeps working reasonably well after enabling each one.
>>
code model review: qwen coder 32b is better than codestral 22b. much less (...existing code here...) and stuff. it seems to break down in quality at around the same amount of code though, which is still a low amount compared to any large project (my combined project files were around 13k context). if you asked it to reprint a whole file, not even a huge one, it might forget an entire function. overall i got more done quicker than with codestral though
>>
Eva 3.33 v0.0 passed my basic coherence tests. It's not that bad, it reminds me of a larger gemma, which is decent praise.
It's not largestral though. I can add it to my list of non-shit models (which previously contained no 70b models) but the best local model available is still luminum 123b.
>>
File: Falcon Team.png (343 KB, 819x866)
>>103557659
I just had a look at the Falcon team. After that, I'm not expecting anything good from it.
>>
>>103563407
What? How is Miqu shit?
>>
>>103563391
agree, best general purpose coding models imo
starcoder2 is maybe better when used for unprompted FIM/autocomplete, but i haven't tested it that extensively because qwencoder Just Works™
>>
>>103563472
You could have circled the whole thing, Puneesh.
>>
>undervolting reduced temps by 10% and increased Cinebench score by 5%
cpus should come undervolted from the factory
>>
>>103563473
Miqu was good, I just excluded it from the current meta
>>
>>103562656
FA in llama.cpp works on any AMD card but is quite slow. If you have a card that has matrix cores (RDNA3+, CDNA2+), try the llama.cpp fork that uses the rocWMMA lib for FA; the speed difference is quite noticeable at large batch sizes.
>>
>>103563407
Is Aluminum better than behemoth?
>>
>>103563608
Luminum is just in that sweet spot where it's coherent and intelligent like the base instruct finetune but uncensored and capable of NSFL
Behemoth might be dirtier but it's not smarter.
>>
>>103563608
>is memetune 1 better than memetune 2
No, only use base tunes.
>>
>>103563501
i haven't tried the new starcoder. i used one way back, i guess with the first gen of coding models like deepseek 33b. all of these models have come a long way. i also spent a little time with nemotron but it's pretty slow and i didn't notice a huge advancement over qwen 32b, though maybe it's better at longer context stuff. all of them seem to hit a wall with how much they can do. also i'm not sure if it's advertised, but i'm positive qwen coder has that step-by-step thing. even without prompting it, it'll say 'ok let's do this step-by-step' sometimes and form its response in the same way qwq or w/e does
>>
Packed with vitamin C.
>>
>>103563608
>Magnum merge
>good
lolno, not if you want an actual story or personality
>>
I've been lurking for a while and just now I asked myself a question and realized I don't have an answer for it. So I'll have to resort to asking (You)

What is the connection between Hatsune Miku and local LLMs?
>>
>>103563724
She is a virtual entity, that's pretty much all she has in relation to LLMs.
>>
File: 1710708651420543.jpg (56 KB, 600x800)
>>103563724
>>
>>103563740
That, and the Miqu line (Midnight Miqu in particular) was the best RP model we had for a fair while, cementing the association.
>>
So, if I have 48 gb of ram and 12 gb vram, I still wouldn't be able to run Eva Q4_K_M (48 gb almost exactly), right? Because of that stupid shit llama.cpp does where it keeps a chunk of the model in both VRAM and RAM, the effective capacity remains 48 gb, not 60, right?
>>
>>103563795
Disable mmap
>>
>>103563767
Plus, we got started with miku.cpp or miku.sh or whatever it was.
>>
>>103563795
disable mmap. you still need memory for your kv/context cache though, so having 60gb doesn't mean you can use 60gb
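In llama-cpp-python terms the fix is one flag. Sketch with placeholder paths; use_mmap=False is the equivalent of llama.cpp's --no-mmap.
[code]
# sketch: with mmap off, layers offloaded to the GPU aren't also kept mapped
# in system RAM, so 12 GB VRAM + 48 GB RAM behaves closer to a 60 GB pool
# (minus whatever the kv/context cache eats).
from llama_cpp import Llama

llm = Llama(
    model_path="eva.Q4_K_M.gguf",  # placeholder path
    n_gpu_layers=20,               # however many layers fit in 12 GB
    use_mmap=False,                # same effect as --no-mmap on the CLI
)
[/code]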
>>
>>103562332
Large BAR, Above 4G Decoding
>>
>>103563817
>>103563826
Based, thanks. Man, what a dogshit feature to have on by default.
>>
hmm today I will dedicate 15 seconds to laugh at o1
>>
>>103563505
I started circling names, but gave up when I realized how many there were.
>>
What is a good option to run a 70b, potentially more, fast these days?
Dedicated pc with 2 or 3 3090s? Or is there some cheaper option?
>>
>>103564157
mistral large moved the bar up from 70b to 123b. add another card
>>
>>103564157
There is but one last hope left >>103561561
>>
>>103564183
who cares, it's gonna be about as slow as a 3060
>>
>>103564033
wasn't falcon always a UAE thing, why is that surprising at all to you?
>>
>>103564157
miner frame full of p40s is still the cheapest way without being soul-crushingly slow.
You can do cheaper with old server boards full of ram, but they will be SLOW
big-boy gpus and proper cpumaxxing are both expensive. full stop
the lmg build guides will explain more gory details if you want
>>
>>103563724
It was a thread mascot chosen early in /lmg/'s history, there's really no special reason
>>103563767
You got it backwards, it's highly likely the original mistral-medium leaker was an /lmg/ user who named it such because of the thread's fixation on miku.
>>
>>103564010
That's a big improvement in STEM.
If I worked in STEM I would be interested.
>>
>>103564288
>It was a thread mascot chosen early in /lmg/'s history
It always struck me as something inherited from /aicg/.
>>
>>103564288
It was because migu/miku was used as a shorthand for something else, I believe it was related to the MidnightMiqu release? Or before that?
>>
>>103564288
It happened after I wrote a Miku prompt for llama 1 right when llama.cpp released and it just stuck because it made the model act cute.
>>
>>103564325
Nigga, midnight miqu is a finetune of miqu
>>
>>103563289
it has been. I'm very proud of myself, with only a small bit of imposter syndrome for using ai to help me make my ai. though it's a little poetic.
can't wait to have her be production ready, so i can have a dedicated gaming partner.
>>
>>103564327
it was a shellscript I shared through pastebin iirc
>>
>>103564325
MIstral QUantized
>>
File: tired_miku.jpg (142 KB, 1280x1024)
>>103564329
well back to my meds then
>>
>>103564345
legend
>>
>>103564252
It's not surprising. I'm just saying, I expect nothing good of such a team. Half of them look like the types to go out of their way to remove anything fun from the model under the guise of removing toxicity.
>>
>>103564343
If you have any notes you should dump them into a rentry as guideposts for other anons wanting to build something similar
>>
Sometimes, when I start posting in a new 4chan thread, I consider the tone of my reply and choose whether I am going to use all lowercase or proper punctuation and capitalization. It's fun to choose which style to use based on which character I plan to convey in the thread. I typically maintain the style throughout my posts in the thread, but not always.
>>
>>103564418
me too l3.1. me too.
>>
>>103564404
that's my worst trait, which is why so many of my projects are solo lol. I'm terrible with notes. i often find my own posts and solutions when researching problems i have, because i solved them and then never wrote anything down. my job introduced a new program called click2learn that helps with notation though. i will try it out on the company's dime and if it works well i will make a public guide of everything. it helps write the notes and takes the screenshots as you work, apparently.
>>
>>103561961
ok, waiting for you to code a cuda analogue for intel
>>
>>103564842
i think a lot of you guys are missing that running in vram at all is still faster than not in vram, and all in vram is faster still. i bet this also makes the vulkan backend start to get attention
>>
What's the best 32B for schizo kino ERP?
>>
>>103564194
A 3060 is still way faster than the CPU, though.
>>
>>103564887
Or SYCL, most likely.
>>
>>103564842
this already exists, what are you on about? cuda isn't some magic technology only nvidia has; the gap between SYCL and CUDA is already not that big and can probably be closed with further development
also it's probably going to be half the price of getting equivalent vram from nvidia, and the important thing for 99% of us isn't getting super quick inference, it's being able to fit big models in vram. who cares if the tokens come a little slow when you're running largestral for half the cost of what it would be on nvidia
>>
>>103564943
Big Tiger Gemma imo, some people will argue EVA-Qwen, i think it's a good choice too but i prefer BTG
>>
>>103561645
Because you can only stack it clamshell and use 2 memory dies max, and it splits the bandwidth as a downside, which is why gaming cards don't do it. That being said, it is unlikely to be anything accessible to normal consumers and the price is going to reflect that. When Nvidia can charge you 2.5k USD for an L4, a 4070-tier die, Intel can undercut by a grand, pricing it at 1.5k, and still make money but fuck over enthusiasts. It's not like you guys are going to buy it unless it is cheaper than a 3090 on the used market.
>>103561989
Not true for enterprises. That's why a ton of AMD Instinct MI accelerator cards are being used in various companies for inferencing. Training is a different story; almost all the training software has been written for Nvidia.
>>103564842
There is no HIP compatibility layer with Intel's software stack. It's SYCL with a lower-level programming layer called Level Zero, which I don't expect much Nvidia CUDA-specific software to actually get converted to, even if Intel has funded a conversion tool for developers to use for that purpose.
https://github.com/oneapi-src/SYCLomatic
But since most software is using Pytorch, all that is needed is that the "xpu" device Intel uses is accounted for and all instances of "cuda" have an "xpu" path. I mostly just do a replace of cuda with xpu to hack various software to run and it works 90% of the time.
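Done slightly less crudely than search-and-replace, the hack is just picking the device once. Sketch; torch.xpu ships in recent pytorch / intel_extension_for_pytorch builds, verify yours has it.
[code]
# sketch: prefer intel's "xpu" device when present, fall back to cuda, then cpu.
import torch

if hasattr(torch, "xpu") and torch.xpu.is_available():
    device = torch.device("xpu")
elif torch.cuda.is_available():
    device = torch.device("cuda")
else:
    device = torch.device("cpu")

model = torch.nn.Linear(16, 16).to(device)  # toy model standing in for the real one
x = torch.randn(1, 16, device=device)
print(model(x).device)
[/code]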
>>
>>103565044
Guess I'll go with the Eva, Gemma is too low context for me.
What about Skyfall? That seems like the latest thing from the BTG creator.
>>
>>103565137
At least the 9B gemma works up to about 30k context with a rope frequency base of 59300.5
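For anyone wanting to reproduce that, loading with the stretched base through llama-cpp-python looks roughly like this. Sketch with a placeholder filename; rope_freq_base should be the right knob but verify against your version (leaving it at 0 uses the GGUF default).
[code]
# sketch: gemma 9b with the extended rope base from the post above
from llama_cpp import Llama

llm = Llama(
    model_path="gemma-2-9b-it.Q6_K.gguf",  # placeholder filename
    n_ctx=30720,                           # ~30k, where it reportedly holds up
    rope_freq_base=59300.5,                # stretched from gemma's default
)
[/code]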
>>
>>103565164
Doesn't rope make models dumber?
>>
>>103565137
you can stretch gemma very effectively with self-extend, that's what makes it goated, there's a robust solution for the one downside
>>
>>103565164
>>103565261
rope does but self-extend doesn't
>>
>>103565261
All models use rope, what you should say is "doesn't changing the rope frequency make models dumber"
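For the record, the frequency part is just base**(-2i/d) per dimension pair, so raising the base slows every rotation and the same angles cover more positions. Toy numbers below, d=8 just to keep the printout short.
[code]
# sketch: per-pair rope frequencies for two bases. higher base -> slower
# rotation in every dimension -> relative positions stay distinguishable
# further out, which is the whole "stretching" effect.
def rope_freqs(base: float, d: int = 8) -> list[float]:
    return [base ** (-2 * i / d) for i in range(d // 2)]

print(rope_freqs(10_000.0))    # a common default base
print(rope_freqs(500_000.0))   # bigger base, slower rotations
[/code]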
>>
>>103565287
ur stinky, take a shower
>>103565137
i have very low faith in upscales so i haven't tried it
>>
>>103564179
another as in 4?
>>
>>103565317
no, I won't take a shower, and I won't stay quiet while I see newfaggotry unfold before my very eyes. I have been in this general since rope scaling was discovered and it pisses me off when a braindead zoomer calls it just "rope".
>>
>>103565282
>>103565267
What's self-extend? Is there an option for it in kcpp?
>>
>>103565267
>>103565282
Sus. Companies would kill for a solution that could save them millions on training like that.
>>
>>103565350
K buddy. No one cares.
>>
>>103564620
Thank you! I'd love to work on a similar project for myself and even your short writeup earlier has me excited to try. Even a stream of consciousness braindump would be cool, but if you can get your company to pay for something more streamlined so much the better!
>>
>>103565369
idk, it's based on llama.cpp so it might
https://github.com/ggerganov/llama.cpp/pull/4815
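The idea in that PR, very roughly: tokens inside a neighbor window keep their exact relative positions, everything further out gets its position floored into groups, so nothing exceeds the positions the model was trained on. My own toy sketch of the remapping, not the actual llama.cpp code; g_n and g_w mirror the --grp-attn-n / --grp-attn-w flags.
[code]
# sketch of the self-extend position remapping
def self_extend_pos(distance: int, g_n: int = 4, g_w: int = 512) -> int:
    """Map a raw relative distance to the one attention actually sees."""
    if distance < g_w:
        return distance                    # neighbor window: unchanged
    return g_w + (distance - g_w) // g_n   # beyond it: compressed by g_n

print(self_extend_pos(100))    # 100  (near tokens untouched)
print(self_extend_pos(8192))   # 2432 (an 8k lookback fits in ~2.4k positions)
[/code]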
>>
>>103565261
It seemed just as smart all the way to about 31K context, then a sudden drop off
>>
File: 1734555136232.png (600 KB, 755x742)
>>103565417
>phoneposter has an opinion
>>
>>103564968
>sycl
>see opencl
i didn't know what that was, but when i first started with ai, all i could use for an accelerator on win 7 was opencl via kobold and it made things so much faster
>>
>>103563622
Downloaded it out of curiosity, and I'm pleasantly surprised. It doesn't seem to be as incorrigibly horny as Magnum merges tend to be, and has nice prose with plenty of attention to nuances. Pity I can only run it at ~0.5 t/s, so even testing it briefly took more patience than I have to spare.
>>
>>103565453
that makes sense, i should play around with the two more often, i just found self-extend, tested that it worked and then just left it on without going back to rope, it's probably worth benchmarking the two more rigorously
def fixes my context problems with gemma tho
>>
>>103565507
>>103565507
>>103565507
>>
https://huggingface.co/blog/bamba
>>
>>103565110
There is a cuda/hip compatibility layer for Intel called chipstar.
>>
File: Oof.png (49 KB, 1017x456)
>>103565540
>>
>>103565350
Karen....
>>
>>103562417
https://github.com/kijai/ComfyUI-HunyuanVideoWrapper
this works with 12GB

if this is really uncensored where are the smut videos?
Even more so once they implement
>img to video
>>
>>103565813
Check civitai / h / adult diffusion / the discord....
>>
>>103565541
I've tried it. It's even worse than ZLUDA or HIP in maturity and has no funding in comparison. It is what it is, and I'd rather have things actually follow SYCL, which you can compile for any GPU, than keep CUDA going as a standard, which people should move away from.
>>
>>103566211
I have never used it, I just know that it gets frequent updates. I still think having a cloned cuda API is important for GPU manufacturers, too many things use cuda. It's the same with directx and vulkan; thank god dxvk and vkd3d exist to use them on other OSes.


