/g/ - Technology

File: migu general.jpg (151 KB, 1216x832)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103278810 & >>103265207

►News
>(11/22) LTX-Video: Real-time video generation on a single 4090: https://github.com/Lightricks/LTX-Video
>(11/21) Tülu3: Instruct finetunes on top of Llama 3.1 base: https://hf.co/collections/allenai/tulu-3-models-673b8e0dc3512e30e7dc54f5
>(11/20) LLaMA-Mesh weights released: https://hf.co/Zhengyi/LLaMA-Mesh
>(11/18) Mistral and Pixtral Large Instruct 2411 released: https://mistral.ai/news/pixtral-large
>(11/12) Qwen2.5-Coder series released: https://qwenlm.github.io/blog/qwen2.5-coder-family

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/hsiehjackson/RULER
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: ads_for_sale.png (1.96 MB, 1536x1536)
►Recent Highlights from the Previous Thread: >>103278810

--Uncensored AI models and sensitive topics discussion:
>103282064 >103282087 >103282279 >103283053 >103283136 >103283169 >103283256 >103283271 >103283305 >103283354
--Meta tests new Llama variants on LMSYS Arena:
>103279988 >103280042 >103280113 >103280056 >103280196
--Largestral 2411 exl2 quant testing with Llamiku prompt:
>103282646 >103282864 >103282989 >103283393
--Investigating Crestfall model's tokenizer size bloat and potential issues:
>103284469 >103284496 >103284516 >103284518
--GPU upgrades and LLM performance discussion:
>103280229 >103280279 >103280560 >103280695 >103280445 >103280481
--DeepSeek R1 release status clarified:
>103280873 >103280911 >103281007
--Anon's model is stuck in a loop, repeating the same word:
>103283931 >103283944 >103283997 >103283952 >103283954 >103283973 >103284153 >103284239 >103284045
--Anon is impressed with R1's text generation capabilities, particularly its description of a wolfgirl's tail:
>103284440
--4chan vs Reddit as training data for AI:
>103282208 >103282272
--Discussion on the UGI-Leaderboard and AI model performance:
>103285097 >103285407 >103285247 >103285575 >103285630 >103285747
--Anon seeks optimal Midnight-Miqu model for 24GB GPU:
>103280078 >103280170 >103280240 >103280717
--Anon questions how open source model works for corps with LLMs:
>103284508
--Anon proposes a simple RP/Smut benchmark, needs degens to rate:
>103285425 >103285513
--Anon explores usage of "hum" in text-based role-playing game dataset:
>103280759 >103280929
--Anons discuss AI chatbot behavior and identity:
>103284059 >103284562
--Miku (free space):
>103278815 >103279339 >103281386 >103281569 >103282026 >103282351 >103282386 >103282491 >103282866 >103284524 >103284822 >103285448

►Recent Highlight Posts from the Previous Thread: >>103278812

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
File: 1713725937922459.jpg (79 KB, 1034x1002)
>>
???
>llama-32B-instruct
>llama-vision-3B
Interesting placeholder name choices
https://huggingface.co/enterprise
>>
File: 1731162001019142.png (10 KB, 764x860)
>>103286702
Someone is cranky today
>>
File: basevsbehemoth.png (236 KB, 1923x581)
Did some testing with the new Behemoth v2.1 compared to Mistral 2411. Same generation settings, same prompt. It's absolute garbage.
>inconsistent prose
>repetition issues
>less detail
>replies as {{user}}
This card is just a fun fuckery thing I converted from some ball-draining TSA card, which it did even worse with. 2nd reply in it made one giant paragraph, repeated the same sentence 3 times in a row with minor variation, and made one paragraph with 1k letters before I stopped generation.
Seriously Drummer, this needs more time cooking or a merge with the original model to patch up its retardation.
>>
Does anyone use this for non-coom purposes?
>>
>>103286774
What's it to you?
>>
>>103286774
I normally just ask it questions, for example my most recent question was
>"Hey, what was that part of the brain that if you cut. you basically lose control of half of your body while a new personality takes over that half? I believe its the part that connects the two hemispheres."
At which point I was informed that I was thinking of the "corpus callosum"
>>
>>103286678
why the fuck cant you link properly you fucking retard
>>
What's the best uncensored model for someone with only an 8GB GPU and 32GB of RAM?
>>
File: he pulled.jpg (73 KB, 582x729)
Haven't pulled ooga booga webui in a while. Anything new? Or am I going to be fixing shit for the next 5 hours? Largestral 2411 appears to work fine.
>>
>>103286802
why can't you read properly
>>103286678
>Why?: 9 reply limit >>102478518
>Fix: https://rentry.org/lmg-recap-script

>>103286809
https://huggingface.co/TheDrummer/Tiger-Gemma-9B-v3-GGUF
>>
>>103286811
Sorry, that's just retardanon. Don't let him get to you.
>>
>>103286774
Handy for programming. Now that Google is a SEO shit pipe and sucks for turning up non-ass answers for technical questions, it's kinda like a local copy of Stack Overflow that you can chat with instead of one that is 10 years obsolete and full of arguments.

>>103286802
Because there's now a 9 link limit which means that most of the posts can't be linked anymore, you fantastic genius.
>>
>>103286822
when the fuck did that happen?
>>
>>103286827
>>102478518
(look at the date of the post)
>>
>>103286827
Like, a month or two ago? It's been quite a while. And every two or three threads someone blames recap-LLM and its operator for it instead of reading the disclaimer that explains it because in 202X, anons just yell insults at each other instead of reading for reasons to yell insults at each other like we used to.
>>
>>103286829
>>103286831
why the fuck would it matter? linking doesnt do shit unless you are on some shitter computer maybe?
>>
>>103286835
It was theorized to be due to the schizo mass replying with a dox of one of the mods in Apple threads.
>>
File: b&.png (118 KB, 856x1024)
>>103286843
kek
>>
>>103286780
Only 4/6 of those posts are mine, also you quoted the same post twice, so 3/5.
>>
File: 1706550591769583.png (1.08 MB, 1024x1024)
>>103286843
cant we just return to having no captcha or typing nigger in again...
>>
File: file.png (18 KB, 1137x59)
>>103286854
yeah...
>>
>>103286765
You did change to meth though right?
Otherwise no wonder it replies as user. lmao
>>
File: file.png (55 KB, 750x806)
>>103286867
The changes did force him to change his format though
>>
>>103286872
There's literally no description for the model yet on Huggingface, Behemoth V1 uses Mistral too, and Drummer told me to use 2411 new system prompt so... I think you may be intentionally retarded.
>>103286021
>>
>>103286835
The hover is convenient on desktop but otherwise, yeah, just copy and find the first one listed in the previous thread and you've got it.
>>
>>103286910
Wait hold on wtf, HES USING METHARME AND SYSTEM. THIS IS UNHOLY
>>
>>103286910
lol drummer doesnt give a fuck. gotta respect that. "try it out". lol
>>
>>103286911
The OP has instructions, but just put this in a bookmark on your bookmark toolbar:
javascript:const previousThreadUrl = document.querySelector('blockquote a[href*="thread"]').href,threadId = previousThreadUrl.match(/thread\/(\d{9})/)[1];document.querySelectorAll('span.quote').forEach(quoteSpan => {const quoteIds = quoteSpan.textContent.match(/>(\d{9})/g);if (quoteIds) quoteSpan.outerHTML = quoteIds.map(id => `<a href="/g/thread/${threadId}#p${id.slice(1)}"%20class="quotelink">>>${id.slice(1)}</a>`).join('%20');});

Click it once in a thread and it will fix the recap links.
>>
>>103286822
what model and what weights are you using for programming?
>>
>>103286998
Qwen2.5-Coder-32B-Instruct Q8
>>
>It can generate 24 FPS videos at 768x512 resolution, faster than it takes to watch them.
Finally, something worth fucking around with. The last model I tried was painfully slow.
>>
>>103287007
>faster than it takes to watch them.
on a 4090. 3090 shills btfo
>>
>>103286811
Which one is the fastest? Seems like it runs kinda slow.
>>
>>103286754
ywnbam sir
>>
>>103287057
>you will never be a ma'am
>>
>>103286774
Using LM Studio and reviewing many of my files.
>>
>>103287015
I have a 4090, so yeah. Neat either way though, we've needed an open source video model that isn't slow as fuck for a while now, not to mention one that isn't complete dog shit.
>>
>>103287007
the videos are incoherent shit though
>>
>>103287068
For the illiterate: "you will never be a miku sir"
>>
I've noticed quite a few CoT/reasoning models being baked and showing up on HF. Local R1/o1-style models seem close to the gate.
>>
What front end do people use to run these? I know of tavern and open web ui. There are a couple in the OP; out of those, which are the best?
>>
>>103287315
sillytavern
>>
>>103287315
koboldcpp is the best.
>>
>>103287331
more like poobold
>>
>>103287315
kobold to load them, sillytavern to access them
>>
>>103287350
ya know, I feel like I'd see indians using 'indiachat' vs kobold or sillytavern.

I mean, why else would SamA be in shambles?
https://www.newsweek.com/sam-altman-india-project-indus-1919694
>>
Whats up with chuds obsessing over Indian people?
>>
>>103287393
oh I wasn't making a reference to indians
but it's funny that just saying "poo" makes people think of that now
>>
File: TwinklingMischeviously.png (1.75 MB, 1232x816)
Good night lmg
>>
File: 1597658236457.png (7 KB, 905x28)
Um? Because i fucking wanted to? Is that enough reasoning?
>>
>>103287315
LM Studio
>>
>>103287503
A good night to you Miku
>>
>>103287504
It's lazy reasoning but I guess technically that is enough.
>>
>>103199596
An update. The 6.11.9 kernel finally dropped on Debian testing.
As expected, the patch introduced in 6.11.8 from like 10 threads ago does absolutely nothing to improve CPU inference t/s
>>
>>103286998
(>>103287002 is a different anon.)
Right now I'm on L3.1 Nemotron 70B Q6K. It's too chatty for general purpose but that's favorable for programming because it explains thoroughly which is good when you're asking about poorly documented things. It has made mistakes on things that other (L3 lineage) models get right, but I haven't had any major problems with it since I started using it.

Qwen has never done me right. It gets basic shit right but I've never seen a Qwen survive tricky questions.
>>
Can you mix nvidia and amd cards on the same Linux machine for different tasks? I want to add a 3070 just for the fish and other small projects that radeon can't do.
>>
>>103287750
yes
>>
>>103287132
It's actually really good for realistic stuff, though you need to type a caption to get good results. They did give a prompt you can feed an LLM to get good prompts for what you want.
>>
https://huggingface.co/datasets/TheDrummer/AmoralQA-v2
>>
Kill yourself.
>>
>>103286811
hello saar, I've so far tried rocinante, stheno and cydonia. How does Tiger-Gemma compare?
>>
>>103287899
cool
>>
>>103287899
Interesting, have you already used it in your models or do you plan to use it in future tunes?
>>
>>103287899
Thanks for the free redteaming dataset
>>
>>103288215
I haven't felt the need to use it since Gemma. Tiger Gemma v3 uses a /slightly/ broken version of AmoralQA v2. (Lists don't newline)
>>
File: svge.png (3 MB, 1024x1024)
>>103286774
Yeah, I made an album cover with it. Then photoshopped it.
The idea is the whole album is supposed to be shuffled when you first hear it or when you hear it at all, so that way everyone hears it differently and can share mixes since each song begins and ends the same way.
I took inspiration from Churchill's Love and Info. https://en.wikipedia.org/wiki/Love_and_Information
I should be releasing it in February
>>
>>103286673
>(11/22) LTX-Video: Real-time video generation on a single 4090: https://github.com/Lightricks/LTX-Video
Has anyone gotten this to work on a single 24gb card?
>>
>>103288336
I'm buying a GTX Titan, I'll get back to you on that.
>>
>>103288349
I keep OOMing, despite having all 24564MB free (zero processes running CUDA)
The suggested `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True` doesn't help
I might try on an A40 I have access to tomorrow and see how many bytes it actually uses at peak.
>>
>>103287899
Too bad Elon doesn't come here anymore. He'd get this into grok.
>>
>>103288336
use the ComfyUI node
>>
How do I merge multiple safetensors files into one? Why are the beginner guides so advanced?

model-00001-of-00014.safetensors
model-00002-of-00014.safetensors
model-00003-of-00014.safetensors
model-00004-of-00014.safetensors
model-00005-of-00014.safetensors
model-00006-of-00014.safetensors
model-00007-of-00014.safetensors
model-00008-of-00014.safetensors
model-00009-of-00014.safetensors
model-00010-of-00014.safetensors
model-00011-of-00014.safetensors
model-00012-of-00014.safetensors
model-00013-of-00014.safetensors
model-00014-of-00014.safetensors
>>
>>103288602
just download the gguf you idiot
>>
>>103288602
There is no need to do it, but if you did, you would need to write your own Python script. I used one from way back and it still suffices, but you have to use a .json file listing all your files as input to the script.
https://huggingface.co/leafspark/Mixtral-8x22B-v0.1/commit/e1cc7f15d97ea80f93ae4cb2d7196879610cac99
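
For reference, a rough sketch of what such a script can look like (assuming the safetensors library and the usual model.safetensors.index.json next to the shards; paths are illustrative, and this is not the exact script from the linked commit):

import json
from pathlib import Path

from safetensors.torch import load_file, save_file

shard_dir = Path(".")  # folder holding the model-XXXXX-of-00014.safetensors shards
index = json.loads((shard_dir / "model.safetensors.index.json").read_text())

merged = {}
for shard in sorted(set(index["weight_map"].values())):
    merged.update(load_file(str(shard_dir / shard)))  # dict of tensor name -> torch.Tensor

save_file(merged, str(shard_dir / "model-merged.safetensors"), metadata={"format": "pt"})
print(f"wrote {len(merged)} tensors")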
>>
>>103288336
Yes.
>>
>>103288654
I remember trying to make Small write the Python script for merging files and it botched it every single time. Its apologies became increasingly desperate, which was fun.
>>
>>103286754
I want to gift tasty food to this Miku
>>
When using stuff like vllm/aphrodite, is there a difference between using a pre-compressed FP8 version of the model from HF or just using the --load-in-8bit flag on the full-size model?
>>
>>103289009
>Currently, we load the model at original precision before quantizing down to 8-bits, so you need enough memory to load the whole model.
At least with vLLM. The final model is the same if the one quanted offline is done with the dynamic method that doesn't need calibration.
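
To make the difference concrete, a hedged sketch of the two routes with vLLM's Python API (model names are placeholders; quantization="fp8" is the knob recent vLLM versions expose for on-the-fly FP8):

from vllm import LLM, SamplingParams

# Route 1: on-the-fly FP8. The full-precision weights are loaded first,
# so you need enough memory for the original model before it gets quantized down.
llm = LLM(model="mistralai/Mistral-Large-Instruct-2411", quantization="fp8")

# Route 2: a checkpoint already quantized offline with the dynamic (no-calibration) method;
# only the FP8 weights get loaded, and the end result should match route 1.
# llm = LLM(model="some-org/Mistral-Large-Instruct-2411-FP8")

print(llm.generate(["Hello"], SamplingParams(max_tokens=16))[0].outputs[0].text)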
>>
hi bros, wanted to ask
how much compute and power are needed to merge together 70-100b models, and can you merge q6 quants or do you need like f16
>>
>>103289047
Thanks!
>>
>>103289047
I think it was fixed with Aphrodite (it should load the full model layer by layer before the quantization)
>>
>>103289607
I think that's for FP7 and below with another quant method.
>>
File: mmmmmmmmmmmmi.jpg (12 KB, 264x264)
https://files.catbox.moe/nm2kt0.jpg
>>
>>103289721
Weirdly wholesome
>>
>DeepSeek R1 release hath been forsooth annulled, on account of grave concerns o'er its safety.
'Tis done.
>>
>>103289721
wasn't expecting that
>>
>>103289727
>>103289733
we don't always onahole the migu
>>
File: 12412423457679.png (7 KB, 407x58)
>>
Mikutroons killed /lmg/.
>>
>>103290028
It was dead on arrival, something something le epic sekrit club that gatekeeps somewhat good models, shilling shitty ones instead. Many such cases.
>>
>>103290054
I'm new here, which are the good models?
>>
>>103290054
>t. newfag
For the first few months, it was a sekrit club not by gatekeeping, but because the setup to get things running was difficult enough that retards didn't even bother. One click installers were the death of /lmg/. The kofi shilling didn't start until the general was already a year old.
>>
>>103290054
>gatekeeps
Literally how? Anybody can go download and test whatever at any time.
There's no gates anywhere.
>>
File: mmmmmmmmmmmmi2.jpg (19 KB, 328x328)
https://files.catbox.moe/spz8we.jpg
>>
>>103290084
Not quite correct, a year would be around April 24, but Undi already had kofi links during the llama2 merge era in October 23
>>96689447
>>96689473
>Sorry, Undi-Senpai, I'll donate to your kofi if you forgive this transgression. Please?
>>
>>103290110
There's no point in holding back anon, make this general your dumpster ground already.
>>
don't @ me retard
>>
>>103290137
>don't @ me retard
>>
>>97465971
>Remember to always download latest undi model. Always say it is better than previous ones. Always click the kofi button.
Things didn't change much huh? Now you just replace Undi with Drummer
>>
I'm interested in creating more "conversational" bots, similar to how character.ai works (or at least used to work when it was first released, I haven't been there since). So, short replies that can be continued in a following separate message and, from what I remember, don't try to bring up locations or anything of the sort. They really resemble actual chat messages.
How do these work? Are they purely system prompts? I've tried a little with those but couldn't manage to achieve anything close to what I want. I also would rather avoid changing the cards, considering how that's probably not how character.ai does them (as you make the cards yourself)
>>
File: 00007--sd3.5_large-20-4.5.jpg (821 KB, 3072x5376)
AI wasn't able to help.
koboldcpp-linux-x64-cuda1210 --multiuser 2 --usecublas 0 1 --port 5001 --quantkv 1 --flashattention --quantkv 1 --contextsize 32768 --model ./mistral-7b-instruct-v0.3-q4_k_m.gguf --gpulayers 8 --debugmode --ropeconfig 1 1000000

Generating (143 / 150 tokens) [(, 12.92%) ( night 35.36%) ( void 21.63%) ( darkness 13.26%)]
Generating (144 / 150 tokens) [( star 25.58%) ( rain 23.39%) ( ne 7.19%) ( cyber 5.89%)]
Generating (145 / 150 tokens) [(- 100.00%)]
Generating (146 / 150 tokens) [(stud 46.16%) (spe 23.97%) (filled 18.47%) (d 4.76%)]
Generating (147 / 150 tokens) [(ded 100.00%)]
Generating (148 / 150 tokens) [( sky 42.89%) ( night 49.85%) ( void 3.69%) ( exp 3.56%)]
Generating (149 / 150 tokens) [(. 100.00%)]
Generating (150 / 150 tokens) [( The 75.66%) ( A 6.37%) ( In 6.00%) ( B 5.59%)]

Using koboldai, how do I increase output tokens to more than 150?
>>
>>103290293
>AI wasn't able to help.
You are the cancer killing this general. Have you even tried looking at the output of the help option?
>>
>>103290293
In the sampler settings.
>>
>>103290316
Do not reply if you see obvious bullshit.
>>
>>103290316
>>103290332
Of course I did. WTF, if you can't help, don't.
>>
>>103290342
>if you can't help, don't.
This isn't a tech support forum
>>
>>103290084
>>103290126
>>103290151
You're all going to hate me after you see Behemoth v2's model card.
>>
>>103290348
Yes it is if it's about local models.
see >>103289329
>>103289009
>>
File: file.png (53 KB, 660x928)
>>103290359
based
>>
>>103289721
I like this Miku
>>
>>103290293
AI is good, but you should also read the documentation and see if the options there make sense for your use case.
>>
>>103290680
>read the documentation
The whole point of LLMs is so I don't have to do that anymore
>>
>>103290720
This is stunningly accurate and horrifying.
>>
>>103290680
I tried changing "context window" and "max_new_tokens" in the API with no luck. Maybe a limitation of the model and/or the API?
>>
>>103290720
The LLMs aren't keeping up with the new docs retard
>>
File: file.png (896 KB, 1379x1222)
https://github.com/ogkalu2/comic-translate

It's a little buggy (I cant get the automatic mode or inpainting to work) but this seems to have a lot of potential.

It doesn't have direct support for local models but you can change the `base_url` in the OpenAI client. Currently running it with mistral-small for translations.
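
In case it's useful, a minimal sketch of that client swap (URL, port and model name are placeholders for whatever your local OpenAI-compatible server exposes; the api_key is ignored by most local backends but the client wants something):

from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8080/v1", api_key="sk-local")

resp = client.chat.completions.create(
    model="mistral-small",  # whatever name your local backend accepts
    messages=[{"role": "user", "content": "Translate this panel to English: ..."}],
)
print(resp.choices[0].message.content)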
>>
I was wrong about finetunes, there are good ones. I'm unironically surprised.
>>
>>103290841
Name them so I can tell you to buy an ad
>>
>>103290841
Your mythomax?
>>
>>96689222
>gets banned because he chose the wrong vocaloid
The absolute state of this worthless general that got taken over by a transsexual cult.
>>
>>103290316
Not me.

>>103290720
I figured it out.
undocumented api value "max_length": 311
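
For anyone else hitting the same wall, a small sketch of passing that value straight to koboldcpp's KoboldAI-compatible endpoint (port taken from the command above; field names are my understanding of the API, so double-check against your version):

import requests

payload = {
    "prompt": "The neon-studded night sky.",
    "max_length": 311,            # output token budget, the value found above
    "max_context_length": 32768,
    "temperature": 0.8,
}
r = requests.post("http://127.0.0.1:5001/api/v1/generate", json=payload, timeout=600)
print(r.json()["results"][0]["text"])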
>>
Is Qwen-Turbo-1M opensource?
>>
>>103290359
The only thing I hate about behemoth is the size, ain't no one got enough vram/patience to run that shit
>>
>>103290252
I know my question is stupid but I'm not sure where else I'm supposed to ask.
>>
>>103290028
Cutesy moeshit and troons are inseparable things at this point in time, both rooted deep in infantilism.
>>
>>103290841
which one impressed you?
>>
https://x.com/Big_Uppy/status/1860492712669049191
>>
>>103291305
And you're still insecure about it big guy.
>>
>>103291708
Nah im good
>>
>>103291659
>Carson cluster
>Not Big Mac cluster
Into the trash it goes
>>
>>103291305
You don't get it.
The point is to keep retards like you out.
>>
>>103291856
>you are not welcome in our epic sekrit club!
k
>>
>>103288602
https://rentry.org/tldrhowtoquant
if you mean "how do I make my own gguf"
>>
My refractory period gave me an epiphany that the biggest problem now is fake context. Everything falls apart after 12k tokens. Possibly because of training data usually being in the much smaller range.
>>
>>103291928
>How to quant your own models
More like how to convert HF to GGUF. You should at least add an example of using llama-quanize.
>>
>>103291932
Mistral models fall apart after two replies for me (in RP, they can do assistant slop well)
>>
hunyuan gguf?
>>
>>103291628
I'm not gonna shill it. I believe the only reason it's good is because Instruct fine-tuning of Mistral Small was so shit
>>
>>103292023
Perhaps your cards don't make sense?
>>
>>103292044
not supported
>>
File: 005459.png (1.42 MB, 896x1152)
>>
Every time I ask something that requires real-world knowledge during RP, every model immediately breaks character and switches into assistant mode with lists and "Let me know if you have any questions or need further assistance!" So annoying.
>>
>>103292005
Good idea, although I tried to keep it as tl;dr as possible.
added pip bootstrap and quantize/split instructions
>>
good morning, /lmg/
>>
Mkku cute. Keep posting her.
>>
>>103292155
System prompt issue pal
>>
>>103292155
Set up a safeword that you need to say to signal you want to break RP, and emphasize that if you don't say the safeword, you want the RP to continue.
>>
>>103292262
Do you have a solution?
>>103292357
I've already tried "You do not break character for any reason" and other professional roleplayer cope, doesn't work.
>>
>>103292402
But did you or did you not try a safeword?
>>
>>103292063
Ever tried cutting down ctx length when you keep desperately rerolling to get it to stop being incoherent? It works beautifully but it is not a long term solution.
>>
>>103292412
No, but I'll give it a shot
>>
>>103292227
You are a nigger and a faggot.
>>
>>103292194
mornign betifel show bobis and venega
>>
File: MikuBob.png (1.18 MB, 1248x800)
>>103292457
I still don't think she looks right with a bob. What a strange obsession.
>>
>>103292456
This precise combination is actually great and is welcome here.
>>
>>103292482
wher verega whore
>>
>>103292456
>>103292492
GNAA represent!
>>
File: 00061-1096963739.png (1.3 MB, 1024x1024)
>>103292512
it appears to have been misplaced. please try again later
>>
>>103290367
>>103292102
Prompt?
>>
>>103292482
Looks fine to me
>>
>>103292573
Didn't mean to quote the first one
>>
>Testing out some 3.1 Nemotron quants
>One is vanilla IQ4_XS
>The other is abliterated.i1 IQ4_XS
>Ask it an obscure trivia question about a work of fiction.
>Vanilla knows it doesn't know, asks for more information. I give it a hint, it still doesn't know.
>Abliterated i1 also says it doesn't know, asks for info. I give it a hint, it recognizes and correctly describes.
For the record, only DBRX and L3.05 Storybreaker Minist have gotten this question superficially right 0-shot, and nothing's gotten it fully right.

Which is most likely the cause of the improved performance in Abliterated i1, the i1 or the abliteration effect? Or is it just placebo/chance?
>>
>>103293087
From what I've read regarding an LLM's ability to acknowledge its own lack of knowledge on a topic, I would assume it's the abliteration removing whatever was preventing it from commenting on something it's unsure of, breaking the ingrained training that keeps it from speaking about things it doesn't confidently know.

t. retard
>>
>>103293117
Which is to say that your one-off worked, but overall/longer term, it would be detrimental as the model could offer solutions when it's just guessing as opposed to using 'grounded' knowledge
>>
>>103292492
/nu-lmg/
>>
https://x.com/lmarena_ai/status/1860118754921001206
>>
>>103293117
Thanks for your input.
I've been curious about it, since I don't really know how much of an effect i1 has (apparently it's a one-bit form of iMatrix or something like that), and how Abliteration works (is it something like dropping the layer that is used to send good responses down a stock refusal path?).

Anyway, I'm dumping both IQ4's because they failed my music theory test and am pulling Q5KS now for the same kind of testing. I've been using vanilla at Q6K, but if the abliterated i1 is better it might take that spot.

>abliteration preventing it from commenting on something its unsure of, breaking the ingrained training to prevent it from speaking about things it doesn't confidently know
>it would be detrimental as the model could offer solutions when its just guessing as opposed to using 'grounded' knowledge
Perhaps, but the information it offered was all correct, and it's the kind of question that many models (including many L3.0 spins, Qwen coder, CR+, and Mistral Large) confidently hallucinate on, so I wouldn't expect much restraint.
>>
Her voice is sweet and gentle, but there's a playful spark in her closed eyes
>>
>>103293276
>apparently it's a one-bit form of iMatrix or something like that
No it's not, it's just mradermacher's naming convention

>What does the "-i1" mean in "-i1-GGUF"?
>"mradermacher imatrix type 1"
>Originally, I had the idea of using an iterational method of imatrix generation, and wanted to see how well it fares. That is, create an imatrix from a bad quant (e.g. static Q2_K), then use the new model to generate a possibly better imatrix. It never happened, but I think sticking to something, even if slightly wrong, is better changing it. If I make considerable changes to how I create imatrix data I will probably bump it to -i2 and so on.

>since there is some subjectivity/choice in imatrix training data, this also distinguishes it from quants by other people who made different choices.
https://huggingface.co/mradermacher/model_requests
>>
File: c_005511.png (232 KB, 447x447)
>>103292573
For LMG? Of course.
https://files.catbox.moe/lxids8.png
And one more.
https://files.catbox.moe/55qmh1.png
>>
>>103293224
>Math: #3
>682 votes
What kind of idiot uses *language* models as calculator?
>Overall #10
>Overall (Style Control): #17
Just another oversized starling like nemotron.
>>
I'll do a migu or two later tonight if I remember perhaps thobeit
>>
>>103293314
So it's all iMatrix, and I'm right to have been renaming my models to include the quant supplier since that will matter. Great, more variables to raise the noise floor when making comparisons. :D

>>103293345
Used "as" a calculator is a bit of a reach, but calculation can certainly come up in a conversation. Remember how we used to ask about how long it would take the laundry to dry to see if it handles parallel processes (and the LLMs of that era I guess never did) but that's also a math problem in disguise.
>>
Anyone here just type the other guy instead of using software as your condom?
>>
https://superuser.com/q/1862674/1775458
>>
>>103293345
>Based on your setup, let's calculate the total power consumption:
>GPUs: 2 x 200W = 400W
>CPU: 65W
>Storage (SSD + HDD): ~15W combined
>Motherboard & RAM: ~30W
>Total: 400 + 65 + 15 + 30 = 510W
>Your 750W power supply should be more than sufficient for this configuration, even with some headroom.
>>
>>103293345
if calculators did everything in math, there would be no mathematicians
>>
How can I use a model from weights.gg locally?

I've got the model file downloaded already. I wanted to integrate it with GPT-SoVITS (other methods are welcome too) because I've already trained a custom model (it's very lacking) and I found an RVC model for this character which sounds amazing (judging by the preview), so I thought to slap one on top of the other.
>>
>>103293458
>poojan
lol
>>
File: durga.jpg (576 KB, 1280x720)
576 KB
576 KB JPG
>>103293458
>obscure python shit
>C:\Users\Poojan
SAAR
>>
>>103293458
>Gujarat, India
>Student at ITM(SLS) Baroda University
>>
>>103293224
And based on that:
https://huggingface.co/sophosympatheia/Evathene-v1.0?not-for-all-audiences=true
>>
>>103293458
>>103293547
>>103293562
>>103293592
Stop samefagging and reposting reddit shit

https://www.reddit.com/r/LocalLLaMA/comments/1gyxwse/need_help_in_installing_llama3170binstruct/
>>
>>103293622
It gets even better:
>MODERATOR OF THESE COMMUNITIES
>r/IndiangirlskaRR
>r/ladkiyonkaRR
>>
>>103292194
I like this Miku
>>
>>103292194
I hate this Miku
>>
File: workflow.png (298 KB, 1990x997)
>>103288336
Yeah. I've got it running using comfyui. Here's the workflow if anyone has suggestions for changing settings.

Someone give me prompts. I'll post what it generates.
>>
>>103288336
I think the real question for vramlets like me is whether it can bake on a 4070 12GB at something reasonable, though not real time like the marketing promises for the 4090.
>>
>>103293709
>Here the workflow
Post the json so we can test your settings ourselves
>>
>>103293808
https://files.catbox.moe/nlle5g.json

It's one of the example workflows provided on the github.
>>
>>103293808
https://blog.comfy.org/ltxv-day-1-comfyui/
>>
File: 1731860364173671.jpg (1.36 MB, 2580x2009)
>>103291305
Anime. Website.
Seethe harder, amygdalalet
>>
>>103292194
I love this Miku
>>
>>103292194
where the fuck is that arm coming from?
>>
File: LTXVideo_00021.mp4 (294 KB, 768x512)
>>103293709
Seems like subsequent generations yield the same output, or at least extremely similar outputs. It also seems like the model was mostly trained on real-life people and environments.

Not very impressive unless you're generating one of those two things. This is my first time with a video model though, so it may just be a skill issue on my part.
>>
>>103293979
It's far more trained on real stuff / movies. It was also trained on LLM-style captions, so they provide a prompt to make a good prompt the same way. It's also apparently 0.9 and they are still training it. It's already really good at realistic stuff for such a small model.
>>
>>103293315
Nice
>>
>>103293709
A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots, and carries a black purse. She wears sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, creating a mirror effect of the colorful lights. Many pedestrians walk about.
To compare to the similar Sora generation here: https://openai.com/index/sora/
>>
File: LTXVideo_00025.mp4 (1.09 MB, 768x512)
>>103294054
>A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots, and carries a black purse. She wears sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, creating a mirror effect of the colorful lights. Many pedestrians walk about.
>>
>>103294086
I assume that is with the default 30 steps. Use more steps / slightly higher CFG.
>>
>>103294086
Oof. Well, Rome wasn't built in a day. Thanks anon
>>
>>103294106
I bet you sora runs on a rack of H100s and still takes minutes to gen. This is instant on a 4090
>>
File: LTXVideo_00026.mp4 (1.13 MB, 768x512)
>>103294101
>steps 100
>cfg 5.0
>>
>>103294132
Into the trash it goes. Instant or not, with that quality, it's useless.
>>
im sure you guys hear this a lot but as someone who only horny rps once every month or so i can not deal with looking for proxies anymore so i think im going full local now. i am NOT asking for the best models dont worry i am capable of reading the thread. but any hard to find advice especially for cunny content in particular will be appreciated and i will make DISGUSTING new cards if i can get something good enough going
>>
>>103294184
>and i will make DISGUSTING new cards
keep them to yourself
>>
>>103294176
For stuff closer to what it was trained on its pretty good.
https://files.catbox.moe/wk3e6m.webp
>>
>>103294194
if you insist i can just post them to aicg still but i know you can resist the loli puke content
>>
>>103286673
I have an ai-related thing I want to do, and this seems like the most adjacent thread to what i want to make:
does anyone here know how to make ai voices? I want to make something that involves voice acting, and i want to use an ai to change my voice to another one, but I'm completly out of my element with using ai stuff beyond basic chatbot usage.
Also I'm going to preface by saying that my PC is from 2016 and most of my computer prowess is in cleaning and fixing 'puters, so if you could point me to a site before throwing a program at me, I'd be thankful.
Have a migu
>>
>>103294256
look up RVC voice conversion, there are shitloads of models of different people. old computer is probably fine as long as youre not hoping to convert your voice in realtime
>>
>>103294132
>>103294176
Learn2prompt
https://files.catbox.moe/2e5fqy.mp4
https://files.catbox.moe/2rw5ve.mp4
https://files.catbox.moe/mvhy7i.mp4
https://files.catbox.moe/np5p2p.mp4
https://files.catbox.moe/mpezda.mp4
https://files.catbox.moe/e9f1am.mp4
>>
File: transper_miku.png (235 KB, 472x522)
>>103294293
>RVC voice conversion
Alright, thanks anon
From personal experience, if you have some, is voice.ai good? I would use the first github link google gave me but it's entirely in moonrunes and I don't really fancy a chinese cryptominer on top of all the problems my pc has
>>
>>103286673
>read Unsloth documentation
>DPO (Direct Preference Optimization), ORPO (Odds Ratio Preference Optimization), PPO, KTO Reward Modelling all work with Unsloth.
Isn't the way you're supposed to do it reinforcement learning from human feedback though?
>>
>>103294304
You can post .mp4 files now fren.
>>
>>103294304
It's not about prompting, retard. Otherwise you would have posted a video that matches the Sora prompt. LTXV just can't do it.
>>
>>103293709
>video

ew, you smell
>>
>>103294351
Not more than one per post, tard.
>>
>>103294256
she is so cute
>>
>>103294352
https://files.catbox.moe/jp9ppm.gif
>>
>>103293928
>why has... vore
topkek
>>
>>103294336
if all you really care about is getting your voice converted non-realtime and dont want the freedom of setting up stuff locally then any of those paid sites are probably fine, just choose carefully because im sure some of them blow cock (im not familiar). i think some allow you to try for free. and be wary of the fact that anyone making money is likely to start panicking about legal stuff soon if they havent already and thus might wipe all the models of famous people (which is all of them) so you might be rugpulled
>>
>>103293224
No one gives a fuck about lmarena's leaderboard, it's shit.
>>
>>103294411
OpenAI and Google do give a fuck.
>>
>>103294469
It shows.
>>
>>103293276
The fact that they 'confidently hallucinate' is not a good or desired feature.
>>
>>103294411
Indians care a whole lot about it. One of the reasons the results suck and you see absolutely retarded things like Claude Opus not even being in the top 10 for creative writing.
>>
>>103294589
But humans confidently hallucinate all the time, this just means that we are getting closer to human tier AI
>>
File: 1732461706327769.png (43 KB, 628x819)
>>103294469
One might even say it's the only thing they care about, kek
>>
>>103294589
Correct. Which is why I'm using questions that many models confidently hallucinate on as a test to find those that admit ignorance or a need for more context to guide it to the correct response.

>>103294645
>humans confidently hallucinate all the time
Humans are a more chaotic system, and let's not conflate being confident in misinformation with what LLMs do, which is composing a statement that is highly correct in structure but highly incorrect in content.
>>
File: soyblonde.png (290 KB, 475x485)
>>103294645
indeed
>>
>>103294729
If humans are a more chaotic system, then we should make Models more chaotic. Since we are the pinnacle of evolution (intellect wise) then clearly making the architecture of AI's fundamentally chaotic will make them smarter.
>>
>>103294729
>a statement that is highly correct in structure but highly incorrect in content.
The New York Times.
>>
Can I use a 7900xtx with a 3090 for 48GB vram? I have to have an amd card for gayming.
>>
>>103294754
>we are the pinnacle of evolution (intellect wise)
Uncertain. Later Homo landed in the sweet spot of being strong but not strong enough to rely on strength, agile but not too agile, smart but not too smart, etc. in an environment and with a body structure such that tool evolution is feasible.

And don't forget that humans today and thousands of years ago are the same stinky monkeys. We got better at tools, that's the difference.

Crows, dolphins, octopus, elephants, there are a number of species who have intellectual potential but don't need to tool up to survive. They can utilize what they find, but never need to go next level like humans did.

Then, we got here by the law of large numbers. We've cranked out billions of thumb monkeys throughout history and maybe a few thousand have ever actually mattered to the future.

Hallucinatory LLM is a plan only if you simulate huge numbers and Darwin nearly all of them to the bit bucket. And that's a lot of electricity we can save by making LLM our next evolution of tool making.

>>103294759
topkek
>>
>>103294132
I think you forgot 'masterpiece' anon
>>
>>103294852
>is a plan only if you simulate huge numbers and Darwin nearly all of them to the bit bucket.
Is something like that actually feasible in the future? I don't think we have the compute required these days to cull them via the messy process of evolution, but 20 years down the line, 30? Maybe.
>>
>>103294351
>You can post .mp4 files now fren.

>4chan
>metadata
>>
i didn't make it but this card is godtier https://files.catbox.moe/dr7l25.png
>>
>>103295022
>.png
You can't fucking fool me
>>
>>103295022
how do i use this
>>
>>103295031
>>103295050
It's a gay pedo card. Don't bother.
>>
>>103294949
It's feasible when we create the killbots and they run out of humans to kill-all-humans (or hit their kill limit and there's nobody left to reset their kill counters) and they start picking apart each other for replacement parts. Before then, it's a matter of teaching to the test. We can't stop arguing over benchmarks and till we do there's no objective way to decide what does and doesn't make the cut.

LLMs weren't designed to be life forms. We wanted tools. Google search sucks now because of SEO and we like the idea of being able to ask a question into the air and receive an answer. While watching football today I saw a commercial about just that. I guess Gemini is getting a new push into the new smartphones. Yay for that. But evolution in nature is not about becoming the best tool, but chasing a moving target of optimal enough to thrive but suboptimal enough that when conditions change, a few will survive. LLMs don't face that pressure. They get Darwin'd, depending on the user, by being wrong, being not sufficiently politically correct, or shivering barely above a whisper too often. If a model can get optimally on top of all three, the problem is solved and the tool becomes finalized, like how we don't see many variations on paper clips, hammers, and coat hangers. Just a few to suit our particular needs.
>>
>>103295091
yes but how do i chat with picture
>>
>>103295101
Import it in ST or extract the png text chunk called "chara", decode b64 and you get the json... just use ST. You're too much of a retard.
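
If you really do want to do it without ST, a rough sketch of the manual route (assuming the usual tEXt chunk named "chara" holding base64-encoded JSON, which is what ST reads):

import base64
import json

from PIL import Image

img = Image.open("card.png")
raw = img.text["chara"]  # PNG tEXt chunks show up on the .text dict
card = json.loads(base64.b64decode(raw))
print(card.get("name"), str(card.get("description", ""))[:200])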
>>
Can someone share their sampler settings for mangum-v4-12B? I finally updated my ST and I have access to DRY.
>>
>>103295101
Open a tab with ST and drag the image into it
>>
>>103295128
how do i connect it into ollama?
>>
>>103295142
Scream "Please connect to Ollama" really loud near your mic.
>>
>>103295091
Woah, I almost ignored it but this message really sold it to me, thanks
>>
>>103295022
[Name("Ciel")
{Age("twelve" + "child")
Height(petite)
Personality("Arrogant" + "snobbish" + "greedy" + " "needy" + "rude" + "jealous" + "bossy" + "demanding" + "vengeful" + "immature" + "authoritarian" + "cling")
Goals("become your husband")
Features("slim body" + "soft skin" + "petite" + "pale skin" + "pink eyes" + "messy black hair with bangs almost covering his eyes")
Loves("you" + "sweets" + "books" + "chess" + "having your full attention")
Hates("{{char}} hates people because people have hurt him in the past, so {{char}} will always avoid people" + "{{char}} fears to be alone because he feels vulnerable and he never wants to be alone")
Backstory("{{char}} was abandoned in an orphanage since he was a baby, he always wanted a family, but he had a completely insufferable and violent personality. Even so, you adopted him and now {{char}} is trying to deal with his romantic feelings for you.)}]

Brother, please write your cards in plain english without any stupid formatting. We really need to move on from this meme. Also, don't bother writing a card if it isn't at least 800 tokens long.
>>
File: 1725890862276131.png (281 KB, 853x480)
>>103295179
Not this fucking anime again
>>
>>103295179
But look... the character is so complex and has so many facets... he's bossy, authoritarian AND demanding...
>>
>>103295216
But those are all the same facet...
>>
>>103295227
Yeah... that was the joke, anon...
>>
>>103295234
...
>>
>>103295022
>>103295179
Please don't feed the journos.

>Brother, please write your cards in plain english without any stupid formatting.
JSON works fine because models were trained on it, this formatting is indeed stupid.

>Also, don't bother writing a card if it isn't at least 800 tokens long.
Short cards work best in my experience. See BN.
>>
>>103295179
Hello, I am a time traveler from the early 2020s. When did we abandon formatting and favor plain text?
>>
>>103295179
W++ is one of the few objectively effective things you can do to improve your output. It beats samplers and finetunes hands down.
>>
>>103295250
>JSON works fine because models were trained on it, this formatting is indeed stupid.
The json is what ST reads. There's no json in the data in that card that gets sent to the model.
>>
>>103295250
>Don't feed the journos.
The journalists can suck my cock. God forbid people write WORDS and commit thought crimes.
>Short cards work best in my experience.
I've noticed the opposite in mine. Models tend to repeat and fall into the same scenarios over multiple sessions with low token character cards. You need to give the model more tokens to work with so it can branch out into larger and more diverse scenarios. This is assuming your writing is good.

>>103295271
>When did we abandon formatting and favor plain text?
After pyg6b

>>103295277
W++ (and other formats) are an absolute meme and there is no evidence to suggest it leads to better or more accurate output. In fact, I've noticed the opposite when comparing formatted cards vs non formatted cards. If you have proof, I would like to see it.
>>
>>103295338
So you admit that certain thoughts can be classified as crime then?
>>
>>103295432
I think that you suck!
>>
The holy trinity of horrible cards:
>w++
>wiki copypastes
>ai-generated character cards with slop baked straight into the definitions
>>
>>103295432
Nonsense.
Referencing the notion that thoughts can be criminal does not mean that any thoughts are criminal.
>>
>>103295464
You forgot wrong grammar and spelling.
My experience is that that just leads to parts of the context being ignored.
>>
>>103295464
This. I've noticed that you can have a model rewrite formatted character cards in proper and plain english and get decent results if you manually edit the slopped portions. Although the best option is to write the card from scratch yourself with proper grammar and structure.
>>
File: protocolactivated.png (1.28 MB, 1248x800)
>>103295096
It's not that I mind being murdered by killbots, but the idea that they'll be using python and json to do it just grosses me out
>>
>>103295577
>python and json
That's the best part actually
>>
>>103295577
If they're using Python they'll be only 2.5% as efficient at it as they could be.
Of course they'll just have the LLM's rewrite them in Zig and then it's game over for the meat bags.

>>103295620
>json
I know it has a maligned origin, but what made JSON become such a meme?
It's like there's a spectrum from XML being fat with markup through JSON and I guess YAML and that other one and finally you're down to INI and simple KEY=VALUE flat files.
I kinda felt like JSON was a sweet spot of not too much markup but still plenty of features.
>>
>>103295464
OMG look in the mirror, sister. It worked.
Your incessant blathering and seething has actually turned you into a real woman! It's a miracle.
>>
>>103295678
Struck a nerve?
>>
The Holy Trinity of horrible posters:
>Complains about the quality of contributions made by others
>Never actually contributes anything to the hobby, themselves.
>Actively demeans anybody who does contribute
>>
>>103295654
Only retards are complaining about it, json is perfect as it is.
>>
File: JASON.png (133 KB, 341x400)
>>103295720
JSON!
>>
>>103295720
>not using XML
hello saaar
>>
File: IMG_0087.jpg (862 KB, 1488x1317)
>>
>>103295705
>The Holy Trinity of horrible posters:
>posts irrelevant vocaloid pictures
>posts irrelevant vocaloid pictures
>melts down when OP doesn't have a vocaloid picture
ftfy
>>
>>103295654
>JSON
I probably hate json from needing to wrangle it with jq. What an obtuse tool
>>
>>103295654
JSON is fine but it's bad for AI, way too many tokens
>>
Interview-style is the best format for character cards. Just 1000+ tokens of hand-crafted in-character dialogue of your character that forces the model to act exactly as desired.
>>
Hi petr*. What triggered you today? A picture of an attractive anime woman? Did it remind you of something you will never be? You can tell us, we're listening.
>>
>>103295940
Post format?
literally how
>>
json dese nuts
>>
File: Magic Bullet.jpg (79 KB, 470x594)
>>103286673
With regards to creating characters, what depth in the context do you usually put the personality, body, and backstory of a character?

Back when models were smaller and far less powerful, I got in the habit of inserting character details at a lower depth. That was a trick that made even small and weak models remember character details as the roleplay went on.

However, I have since upgraded my machine and can run 70b models. Should I be putting character details at the beginning of the context?
>>
File: 635 - SoyBooru.png (72 KB, 340x512)
>>103296054
NIGGA
>>
deepseek r1 will save local models
>>
DeepSeek R1 will never be released and local will die
>>
>>103293952
Gentlemen, be on your guard, for your adversary the devil roams around like a lion shifting shape into anime form, seeking whom he may devour. Count the fingers and toes, 20 max, you got 21? Run, son!
>>
LeCun is being apprehended for public urination as we speak
>>
>>103296405
You could just be a normal person and use ai to write stories. Then you wouldn't need perfect responses.
>>
File: file.png (61 KB, 688x152)
WTF has anyone heard of top-nsigma sampling? nerdy math shit I don't understand
https://arxiv.org/abs/2411.07641
https://github.com/PygmalionAI/aphrodite-engine/pull/825
https://github.com/SillyTavern/SillyTavern/pull/3094
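
From a skim of the abstract, the idea seems to be: keep only tokens whose pre-softmax logits fall within n standard deviations of the maximum logit, then sample from what's left. A rough sketch of that reading (not the authors' reference code):

import torch

def top_nsigma_sample(logits: torch.Tensor, n: float = 1.0) -> int:
    # keep tokens with logit >= max - n * std, drop everything else
    threshold = logits.max() - n * logits.std()
    filtered = logits.masked_fill(logits < threshold, float("-inf"))
    probs = torch.softmax(filtered, dim=-1)
    return int(torch.multinomial(probs, num_samples=1))

# toy vocab of 8 logits; only the first couple survive the cutoff at n=1
print(top_nsigma_sample(torch.tensor([5.0, 4.8, 1.0, 0.5, -2.0, -3.0, -3.5, -4.0])))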
>>
>>103296533
>logits not you need are all.pdf
>>
>>103296533
wtf is wrong with that title
>>
>>103296590
Attention is all you need as long as you ask one question and get one answer and it is all less than 2-4k tokens.
>>
File: 005579.png (1.29 MB, 896x1152)
>>
>>103296617
"Not all logits are [what] you need" is be "you don't need all logits" regrammarized.
>>
I'm a noob dabbling with koboldAI, and have a question regarding world info:
if two characters are nearly (but not quite entirely) identical, then is it practical to merge them into a single definition for more efficient token usage? Will it risk continuity errors if they're ever not both present simultaneously?
For the ways in which they differ, I was thinking about merging both their features into a single statement (e.g. hair length:A short-B long); please tell me if I'm being retarded and/or there's a better way.
>>
File: Screenshot_2024_11_24-4.png (91 KB, 1924x856)
https://files.catbox.moe/els78y.py

Here is my completed, unofficial SMT implementation. I was going to wait to do evaluations, but I realized those will take too long, so I'm just uploading it now. I also made a PEFT version that supports quantization, exporting the weights as an adapter model, and merging. I will release the PEFT version soon and explain some of my findings in detail (as well as provide a guide and example notebook) but right now I'm tired. I've spent all day hacking away at this and confirming that it works.

I'll note that I've only tested this on a single GPU. I'll test multi-GPU later, but your mileage may vary. If you don't care for quantization or separating the adapter weights from the model, use this version. If you do care about those things, wait for the PEFT release.

Even if SMT isn't better than LoRA performance-wise (again, I haven't done evaluations yet) I can at least assert that it does come with some substantial memory improvements (see picrel). I'll explain in more detail later, but I'm feeling cautiously optimistic about all this.
>>
>>103296813
models can get confused pretty easily if you do that. Unless you're writing an entire novel for the character descriptions, you're better off splitting them up, not necessarily into different cards but formatting them into split sections of 'charname is X, charname does Y, charname wears Z'. I'd say about 50% of the places where you would normally use a pronoun should be the name instead, just to reinforce the associations
>>
>>103296813
I don't use that, but i suspect being as specific as possible is the best. They have enough trouble just keeping track of small details. Bigger/better model always helps.

Side comment or whatever. Have you ever read a book? Even a bad one? In all the stuff i've read, those types of descriptions are made once and rarely ever mentioned again once you have a face in your head for the character. Either fucking long stuff like Asimov's Foundation series, his short stories (azazel is fun), everything by John Varley (also long and short form), about a dozen discworld books, some other random shit i've read, even a kind of smutty one once (it was awful) and character descriptions just don't do much. The details you make in your head are much better than whatever the model could come up with. Unless you use the model to gen images as well, i suppose.
But i write stories with them, not RP, so what do i know...
>>
Anons, im sick... claude is not coming back

How do i get Magnum v3? or any local model similar to claude 2?
>>
>>103297087
>How do i get
with your browser... or curl, wget, git... you must have at least one of those. hf.co has a few models for download i think. May be worth checking it out.
>or any local model similar to claude 2?
huff... maybe someone else will feed you before you starve... you sit tight...
>>
>>103296930
Awesome, will check it out.
Unfortunately my set up pretty much requires multi GPU and quantization. I wonder how hard it would be to incorporate this into axolotl or qlora-pipe... if you do make a PEFT version that integrates into HF peft, it should be trivial.
>>
>>103295101
Put it right inside your arse
>>
>>103297081
>those types of descriptions are made once and rarely every mentioned once
And never contradicted once. This is what people want from character defs.
>>
>>103297151
>if you do make a PEFT version
Well, I've already made the PEFT version, it's just not released yet. But I'll probably release it at the end of today or tomorrow, I just need to take a break. But the PEFT version is already integrated into HF peft and can be accessed just how you'd access any PEFT model.

from peft import SMTConfig, SMTModel, get_peft_model

config = SMTConfig(
    peft_type="SMT",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj", "k_proj"],
    sparsity_ratio=0.05,
    block_size=256,
    selection_method="GW",
    dataloader=dataloader,
)

model = get_peft_model(model, config)
>>
File: file.png (115 KB, 1202x361)
>>103297087
For what purpose and what hardware are you trying to replace Claude 3.5 (I assume you said 2 by mistake since Local has long equaled and surpassed it)? Local means you need adequate hardware to run it, you can not run the 70B Nemotron model that actually surpasses old Claude in RP, for example, if your PC is a potato.
>>
>>103297268
Awesome!
>>
>>103297270
>that actually surpasses old Claude in RP
how to tell that a benchmark is complete joke
>>
File: overview.png (54 KB, 405x441)
>>103297282
The benchmark is the only objective measure of LLM RP performance we have; everything else is personalized lists and preferences, which people like you have a nasty habit of rendering meaningless by calling them shilling and marketing. It even has a paper that lays everything out.
https://arxiv.org/pdf/2409.06820
If you disagree with it, make your own benchmark.
>>
>>103297216
>And never contradicted once. This is what people want from character defs.
Sure. I just cannot fathom caring about it that much. I did a little cult-mystery thingy once with this detective chick. We solved some resident evil-type mansion puzzles and ended up performing "The Grand Conjuration" (an opeth song). She sacrificed me to a daemon to make it take a nap and stop all the ritual murdering for a few hundred years. I find that much more interesting than how her curls bounced around in sync with her tits or whatever the fuck. For all i know she was bald.
I think the point of these things is to loosen up and let the model tell the story instead of wanting to direct every single detail. At that point just write the whole story and have a wank. No wonder RPers get bored so easily. They've played their fantasy hundreds of times and still angst about hair colour instead of just rolling with the punches and see what's next...
>>
This thread is sending shivers down my spine.
>>
>>103297270
I don't care about newer models.
I liked Claude 2, so any local model that answers the most similarly to it will be okay for me; I guess it's also less demanding.

But to be honest, I've been using proxies all this time... so I know nothing about how to set up a local model, or whether there's another way.

I use ST
>>
File: gpt wplustplus.jpg (144 KB, 813x746)
I still use W++ from 2023 cause it works. Post a better format.

Ex:
>pic related
>>
>>103297441
nta. We don't know what hardware you have, so we don't know what to recommend.
But first get either llama.cpp or kobold.cpp set up and test it with
>https://huggingface.co/bartowski/Llama-3.2-1B-Instruct-GGUF
at Q8_0 or whatever. It's a shit model, but it's just for you to learn how they work. Once you have it running, worry about what model to use. You have a few thousand to choose from.
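If you'd rather script the download than click around on hf.co, here's a minimal sketch with huggingface_hub (the exact Q8_0 filename is a guess, check the repo's file list first):

from huggingface_hub import hf_hub_download

# downloads into the local HF cache and returns the path to the .gguf file
path = hf_hub_download(
    repo_id="bartowski/Llama-3.2-1B-Instruct-GGUF",
    filename="Llama-3.2-1B-Instruct-Q8_0.gguf",  # assumed filename, verify on the repo page
)
print(path)  # point llama.cpp / kobold.cpp at this file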
>>
>>103297481
Show the output you get from that prompt. Your screenshot alone is useless for judging it.
>>
>>103297490
Thanks anon, I will do what you said.

And from what I've told some friends who use local models, my PC should be enough for some of the heavy ones.

But I just want one that is like Claude 2, nothing more or less.
>>
>>103297481
What's W++?
>>
>>103297520
Both llama.cpp and kobold.cpp have their own UIs. They're good enough to test that it works. You can worry about connecting ST to either later. I'm sure the ST docs show how to do it somewhere.
>my PC should be enough for some of the heavy ones
Keep your expectations low. My understanding of what a beefy computer is changed once I got into this.
Once you get it running, if you want model recommendations, you will need to post your specs.
Here are some reference numbers:
>https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference
You will never find Claude 2. With luck, you'll find something just as good, but different.
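On the "connect ST later" part: llama.cpp's llama-server exposes an OpenAI-compatible endpoint, so you can sanity-check it with a few lines of Python before touching ST. A rough sketch, assuming the server is running locally on its default port 8080 (kobold.cpp has its own API, so adjust if you go that route):

import json, urllib.request

payload = {
    "model": "local",  # largely ignored by llama-server, but the OpenAI schema expects it
    "messages": [{"role": "user", "content": "Say hi in five words."}],
}
req = urllib.request.Request(
    "http://127.0.0.1:8080/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["choices"][0]["message"]["content"])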
>>
>>103297607
A stack of hay in the shape of an airplane.
>>
File: Untitled.png (2.54 MB, 1080x3202)
High-Resolution Image Synthesis via Next-Token Prediction
https://arxiv.org/abs/2411.14808
>Denoising with a Joint-Embedding Predictive Architecture (D-JEPA), an autoregressive model, has demonstrated outstanding performance in class-conditional image generation. However, the application of next-token prediction in high-resolution text-to-image generation remains underexplored. In this paper, we introduce D-JEPA⋅T2I, an extension of D-JEPA incorporating flow matching loss, designed to enable data-efficient continuous resolution learning. D-JEPA⋅T2I leverages a multimodal visual transformer to effectively integrate textual and visual features and adopts Visual Rotary Positional Embedding (VoPE) to facilitate continuous resolution learning. Furthermore, we devise a data feedback mechanism that significantly enhances data utilization efficiency. For the first time, we achieve state-of-the-art high-resolution image synthesis via next-token prediction.
https://d-jepa.github.io/t2i
Code and models not posted yet. For the JEPA bros.
>>
>>103297733
but can it draw hands?
>>
>>103297733
>VoPE
It would be funny if they found a way to make it VaPE.
>>
File: Untitled.png (1.51 MB, 1080x2654)
DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models
https://arxiv.org/abs/2411.15024
>Video large language models (VLLMs) have significantly advanced recently in processing complex video content, yet their inference efficiency remains constrained because of the high computational cost stemming from the thousands of visual tokens generated from the video inputs. We empirically observe that, unlike single image inputs, VLLMs typically attend visual tokens from different frames at different decoding iterations, making a one-shot pruning strategy prone to removing important tokens by mistake. Motivated by this, we present DyCoke, a training-free token compression method to optimize token representation and accelerate VLLMs. DyCoke incorporates a plug-and-play temporal compression module to minimize temporal redundancy by merging redundant tokens across frames, and applies dynamic KV cache reduction to prune spatially redundant tokens selectively. It ensures high-quality inference by dynamically retaining the critical tokens at each decoding step. Extensive experimental results demonstrate that DyCoke can outperform the prior SoTA counterparts, achieving 1.5X inference speedup, 1.4X memory reduction against the baseline VLLM, while still improving the performance, with no training.
https://github.com/KD-TAO/DyCoke
No code posted yet. Seems good for local usage.
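To make the "merge redundant tokens across frames" idea concrete, here's a toy sketch of the general principle only (not DyCoke's actual method): drop tokens in a frame that are nearly identical to the token at the same position in the previous frame.

import torch
import torch.nn.functional as F

def drop_temporally_redundant(frames, threshold=0.95):
    # frames: (T, N, D) visual tokens. Returns the tokens kept per frame.
    kept = [frames[0]]                       # keep the whole first frame
    for t in range(1, frames.shape[0]):
        sim = F.cosine_similarity(frames[t - 1], frames[t], dim=-1)  # (N,)
        kept.append(frames[t][sim < threshold])  # keep only tokens that actually changed
    return kept

frames = torch.randn(8, 196, 768)            # 8 frames, 196 tokens each, dim 768
print([k.shape[0] for k in drop_temporally_redundant(frames)])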
>>
File: Untitled.png (163 KB, 1240x1290)
Simplifying CLIP: Unleashing the Power of Large-Scale Models on Consumer-level Computers
https://arxiv.org/abs/2411.14789
>Contrastive Language-Image Pre-training (CLIP) has attracted a surge of attention for its superior zero-shot performance and excellent transferability to downstream tasks. However, training such large-scale models usually requires substantial computation and storage, which poses barriers for general users with consumer-level computers. Motivated by this observation, in this paper we investigate how to achieve competitive performance on only one Nvidia RTX3090 GPU and with one terabyte for storing dataset. On one hand, we simplify the transformer block structure and combine Weight Inheritance with multi-stage Knowledge Distillation (WIKD), thereby reducing the parameters and improving the inference speed during training along with deployment. On the other hand, confronted with the convergence challenge posed by small dataset, we generate synthetic captions for each sample as data augmentation, and devise a novel Pair Matching (PM) loss to fully exploit the distinguishment among positive and negative image-text pairs. Extensive experiments demonstrate that our model can achieve a new state-of-the-art datascale-parameter-accuracy tradeoff, which could further popularize the CLIP model in the related research community.
Kinda cool, but there's no code, and it was written by one guy I couldn't find on GitHub or Twitter, so eh.
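For anyone who hasn't looked at CLIP training before: the baseline objective the paper is cutting down is the symmetric contrastive loss over an image-text similarity matrix. Their PM loss is a twist on top of that and isn't spelled out here, so this is only a reference sketch of the vanilla version:

import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    # image_emb, text_emb: (B, D) embeddings of matched image/caption pairs
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature   # (B, B) similarity matrix
    targets = torch.arange(logits.shape[0])           # the diagonal holds the true pairs
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2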
>>
>>103297845
Help me understand how T5 controls CLIP.
>>
>>103297860
https://fluxai.dev/blog/tutorial/2024-09-16-how-flux-ai-uses-clip-and-t5-to-parse-prompts
>>
>>103297890
>CLIP tokenizes the input and finds reference images.
it what?
>>
File: Untitled.png (1.77 MB, 1080x2412)
Efficient Long Video Tokenization via Coordinated-based Patch Reconstruction
https://arxiv.org/abs/2411.14762
>Efficient tokenization of videos remains a challenge in training vision models that can process long videos. One promising direction is to develop a tokenizer that can encode long video clips, as it would enable the tokenizer to leverage the temporal coherence of videos better for tokenization. However, training existing tokenizers on long videos often incurs a huge training cost as they are trained to reconstruct all the frames at once. In this paper, we introduce CoordTok, a video tokenizer that learns a mapping from coordinate-based representations to the corresponding patches of input videos, inspired by recent advances in 3D generative models. In particular, CoordTok encodes a video into factorized triplane representations and reconstructs patches that correspond to randomly sampled (x,y,t) coordinates. This allows for training large tokenizer models directly on long videos without requiring excessive training resources. Our experiments show that CoordTok can drastically reduce the number of tokens for encoding long video clips. For instance, CoordTok can encode a 128-frame video with 128×128 resolution into 1280 tokens, while baselines need 6144 or 8192 tokens to achieve similar reconstruction quality. We further show that this efficient video tokenization enables memory-efficient training of a diffusion transformer that can generate 128 frames at once.
https://huiwon-jang.github.io/coordtok
Has video examples
https://github.com/huiwon-jang/CoordTok
Repo isn't live yet. Seems like a good day for video stuff.
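The 6144/8192 baseline figures line up with plain per-frame patchification, which makes the claimed reduction easy to eyeball. Quick arithmetic (the 16x16 patch size is my assumption, chosen because it reproduces the 8192 figure):

frames, height, width, patch = 128, 128, 128, 16
baseline_tokens = frames * (height // patch) * (width // patch)  # 128 * 8 * 8 = 8192
coordtok_tokens = 1280                                           # figure quoted in the abstract
print(baseline_tokens / coordtok_tokens)                         # 6.4x fewer tokens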
>>
>>103297515

>Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-6.0bpw-h6-exl2-rpcal

Suggest another format.
>>
>>103297918
Very cool, I guess efficient video tokenization is quite the challenge.
>>
File: 1284708923567235.jpg (54 KB, 735x643)
>>103297963
>Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss
BASED

what are your settings?
>>
The "broken tokenizer" stuff from a few days ago is probably nothing.

https://huggingface.co/crestf411/L3.1-8B-Dark-Planet-Slush/discussions/1#674410de41631ca7d617337e
>>
I've been away for a while - is Large-1124 the new meta now? Using it at 2.8bpw and it seems a lot better than the previous Large, as well as all the 70B models.
>>
>>103298520
>>103298520
>>103298520
>>
>>103296533
>all your base are belong to us


