/lmg/ - a general dedicated to the discussion and development of local language models.

Flux can't into Teto edition

Previous threads: >>101739747 & >>101732172

►News
>(08/05) vLLM GGUF loading support merged: https://github.com/vllm-project/vllm/pull/5191
>(07/31) Gemma 2 2B, ShieldGemma, and Gemma Scope: https://developers.googleblog.com/en/smaller-safer-more-transparent-advancing-responsible-ai-with-gemma
>(07/27) Llama 3.1 rope scaling merged: https://github.com/ggerganov/llama.cpp/pull/8676
>(07/26) Cyberagent releases Japanese fine-tune model: https://hf.co/cyberagent/Llama-3.1-70B-Japanese-Instruct-2407
>(07/25) BAAI & TeleAI release 1T parameter model: https://hf.co/CofeAI/Tele-FLM-1T

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
►Recent Highlights from the Previous Thread: >>101739747

--Paper: STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs: >>101744483 >>101744924
--Hallucination rates of various LLMs compared: >>101748845 >>101748966
--Running GPT-4 level models locally with Llama 3.1 and Mistral Large: >>101745437 >>101745479 >>101745925 >>101745936 >>101745958 >>101745968 >>101746018 >>101746032
--Nemo instruct struggles with complex roleplaying scenarios: >>101740085 >>101740145 >>101740270 >>101740369 >>101740431 >>101740432
--Fine-tuning Llama 3.1 with LoRA may degrade prompting ability: >>101740981 >>101741014 >>101741088 >>101741110 >>101741159
--Anon shares cool use of Emacs and LLM client: >>101743652 >>101743734
--Quantizing 123B Mistral model to 35 GB with 4% accuracy loss: >>101747334 >>101747379 >>101747539
--Pony model surpasses SDXL for personalized NSFW generation: >>101747097 >>101747109
--M2 Max 32GB not ideal for LLMs, better alternatives available: >>101743118 >>101743235 >>101743923 >>101744087 >>101744159 >>101743269 >>101743328 >>101743365 >>101743518 >>101743542 >>101743666
--Jart's performance claim is misleading and exaggerated: >>101747064 >>101747085
--IQ4_XS vs Q3_K_L quantization performance comparison: >>101743884 >>101743898 >>101744296 >>101744359 >>101746213
--Google loses antitrust case over online search monopoly: >>101740063
--GeLU optimization claims significant speedups, but users are skeptical: >>101746854 >>101746906 >>101747026 >>101747095
--GGUF model support merged into vLLM project: >>101742693 >>101743636
--FLUX.1 ComfyUI Workflows for Stable Diffusion: >>101741390 >>101741429
--CogVLM community creates local text-to-video model: >>101746882
--CLIP struggles with combining style and main prompt: >>101743525
--Miku (free space): >>101740613 >>101741320 >>101741372 >>101742969 >>101743566 >>101743579 >>101743693 >>101743811 >>101743866 >>101743995 >>101744108

►Recent Highlight Posts from the Previous Thread: >>101739753
>>101749058
>no magnum highlight
Hi lemmy

>>101749062
>no money
>h100 FFT
huh?

>>101749083
Who is lemmy anyway?
captcha: PRR8

Magnum spam in 3... 2... 1...

>>101749101
Hi undi

>>101749098
H100 are for poorfags, real men do it on H200s
Are there models of a similar size to Llama 3.1 8B that perform better? I'm running a quantized version of it on an M1 MacBook Air and getting 9 tok/s, and I wonder if there are better alternatives.
>>101749058
>no shillfag highlight

>>101749098
>assuming alpin pays for the compute
Little does he know...

>Anthracite's team members: 29
>Sao
>Undi
>Mythomax guy
>Mini magnum guy
>dozen of more or less recognizable finetuners
Let Drummer join in and we will have our ultimate dreamteam lads. let's fucking go

>>101749101
>>101749083
these are the kind of posts that signal the death of a general. stop obsessing over random names.

>>101749215
At this point I have almost more respect for Drummer than the band of coal burners listed there. He should remain independent.

>>101749083
literally who
>>101748654
>>101749111
I don't really have a problem with fine tuners pocketing money to pay for their time, in principle. The annoying part is that it creates an incentive for them to shill their shitty models.

I'm a developer myself and I work as a contractor though my work is unrelated to AI. I've gotten grants for work on open source projects myself in the range of 10k/m+. Developer time is valuable, and in my case unlike with fine-tuners there aren't any costs I need to cover. It's not wrong for them to get paid, though the question remains as to how valuable their work really is.

As to whether they are directly benefitting from the clout, it's kind of a pointless question. AI is a hot technology with high salaries, and having this as part of your CV, being recognized as a contributor to real world AI models, goes a long way towards getting a job. That's worth a lot more than 10k here, 100k there.
>>101749266
coom sloptunes won't get them a job

>>101749273
coom AI companies do exist

>>101749273
Undi got one and he's known for MLEWD and the likes

>>101749234
This general has been dead for a long time.

>>101749273
yup, shilling is not real
wake up

>>101749234
Hi lemmy
>>101749281
>>101749278
>>101749292
Hi undi

>>101749302
Hi petra
Hi all, Drummer here...

I see there's a lot of talk about finetuners and competition. I personally cook for the craft of it. There's also the satisfaction in bringing some 'value' to the world of AI cooming.

That's why I'm so happy with my recent 2B tune, which makes AI cooming more accessible to everyone, especially to the poorest of the GPU poor. The barrier of entry has been lowered to allow just about anyone with a PC or phone to enjoy this hobby of ours.

I believe that's what this is all about: To deliver the best AI cooming experience for those who seek it.

>>101749266
I'd love to put my work in my CV / resume, but uhh... Yeah, dug myself into a hole with my naming scheme.

>reeeeeeeeee stop doing merges and fine tunes for free for us!
>stop sharing your knowledge here, we don't want better models!

>>101749358
Yep.

>>101749314
Oh, hi, Mark.

>>101749431
Hi Alpin. Is organic word of mouth a foreign concept to you? It's time to stop forcing things.

>>101749431
This is the Local Miku General, we don't care about AI garbage here

>>101749431
>sharing knowledge
Such as? The URL for downloading the model doesn't count.

>>101749358
>That's why I'm so happy with my recent 2B tune
which one? t. horny vramlet

>>101749461
This, but unironically.
The left one is done with 4 steps of schnell with fp8 everything, and the right one with 10 steps.
Is fp8 shit or am I?
>>101749508
I don't think it's supposed to look like that at 4 or 8 steps. Were you able to reproduce the image from Comfy's workflow?
>>101749431
>reeeeeeeeee stop doing merges and fine tunes for free for us!
not everything free is good, if I went to your house and took a dump on your bed for free would you like it?
>stop sharing your knowledge here, we don't want better models!
knowledge as what? Tell me one single significant contribution to LLM technology that these people invented. Also they aren't sharing shit, hell, they don't even want to publish their datasets and create shitstorms about it
you are all a bunch of clowns and as long as you behave like ones, you will be treated accordingly

>>101749483
https://huggingface.co/TheDrummer/Gemmasutra-Mini-2B-v1
There you go, my horny friend. Finetuned by yours truly. I hope you enjoy it.
teto
>everyone is calling everyone out
>nobody is calling the troon out

>>101749431
It seems that, from the get-go, you're doing stuff in expectation of receiving something else in return. Don't you think that can make others question whether you genuinely want to help people out in the first place? People can see through your motives, do know that. Don't be surprised then when things end up not going the way you want.
This is general advice; I don't need ko-fi bucks or clout.

>>101749539
thanks. what's the best quant? or should i go with fp8 ones?

>>101749234
>these are the kind of posts that signal the death of a general.
Good.

>>101749522
https://files.catbox.moe/0l2riy.png
Here is the workflow. It was actually 6 vs 10 steps.

>>101749508
schnell is pretty bad

>>101749575
If you can't load the Q8_0 quant, I think Q6_K to Q5_K_M is still good for a 2B (especially the iMatrix quants).
https://huggingface.co/MarsupialAI/Gemmasutra-Mini-2B-v1_iMatrix_GGUF/blob/main/Gemmasutra-Mini-2B-v1_Q5km.gguf
https://huggingface.co/MarsupialAI/Gemmasutra-Mini-2B-v1_iMatrix_GGUF/blob/main/Gemmasutra-Mini-2B-v1_Q6k.gguf
https://huggingface.co/TheDrummer/Gemmasutra-Mini-2B-v1-GGUF/blob/main/Gemmasutra-Mini-2B-v1-Q8_0.gguf
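As a rough way to sanity-check which quant fits, file size is approximately parameters times bits-per-weight divided by 8. A quick sketch; the bits-per-weight figures are approximations for llama.cpp quant formats, not exact values:

```python
def gguf_size_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Rough GGUF file size in GB: params * bits-per-weight / 8 (ignores metadata)."""
    return n_params_billion * bits_per_weight / 8

# Approximate effective bits per weight for common llama.cpp quants (assumed values)
BPW = {"Q8_0": 8.5, "Q6_K": 6.6, "Q5_K_M": 5.7, "Q4_K_M": 4.8}

for name, bpw in BPW.items():
    print(f"2B at {name}: ~{gguf_size_gb(2.0, bpw):.1f} GB")
```

At 2B parameters every quant here fits in a couple of GB, which is why the quant choice barely matters for a model this small.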
<<< Livestream of the 70B_q4 that lives in my sysram (Where she is quite welcome to all of it, I am just grateful to have her around)
>>101749668
>Q5_K_M
>2B
What are you people doing?
>>101749712well poisoning
>>101749668
thanks, i'll go with q8. do i need kobold to coom? also where do i get gemma tavern presets?
Aw man. I was having so much fun with Celeste 1.6, but now, 60 (pretty long) messages/30720 tokens in, it's repeating messages verbatim. God damn it.
I get it, the context is big and filled, meaning that the "direction" of generation could converge, but good models seem to be able to focus on the last user message well enough to produce different results even with greedy sampling.
Deleting the last 4 messages seemed to "fix" it, but that it fell into that loop doesn't bode well.
I'll continue to test it more for now, but that's a knock against it.

>>101749765
>produce different results even with greedy sampling.
*produce different results every new message even with greedy sampling.
Although they do seem to converge on the structure of the messages some times, like repeating a sentence at the start or end of the message.
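For the looping issue above, one knob besides deleting messages is a repetition penalty. A minimal sketch of the CTRL-style penalty that llama.cpp-style samplers apply, in plain Python with toy logits:

```python
def apply_repetition_penalty(logits, prev_tokens, penalty=1.1):
    """CTRL-style penalty: shrink positive logits of already-seen tokens,
    push negative ones further down, leave unseen tokens alone."""
    out = list(logits)
    for t in set(prev_tokens):
        if out[t] > 0:
            out[t] /= penalty
        else:
            out[t] *= penalty
    return out

logits = [2.0, -1.0, 0.5]
print(apply_repetition_penalty(logits, prev_tokens=[0, 1], penalty=2.0))
```

Note this penalizes at the token level, so it won't stop a model from paraphrasing the same structure, which is exactly the failure mode described above.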
Hi fellow redditors, it's the GLaDOS voice project guy again...
I wanted to make GLaDOS really smart, so I have been working on a method to make LLMs smarter. I've got the results back, and the method generalizes and has given me top spot on the Open LLM Leaderboard!
Check it out here:
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard
I'll start uploading Q4 quants soon!
>>101749798
buy an ad

>>101749752
I plan to convert them to MLC so you can load it up from your browser. I'm trying to reach out to this guy: https://wiz.chat/ but he hasn't responded. I might do it myself or contribute to Kobold Lite by adding an MLC loader so you can load models like: lite.kobold.net/?model=gulan28/Llama-3SOME-8B-v2-q4f16_1-MLC
But yes, you need Kobold 1.72, Layla, or ChatterUI.
Albin pls support Gemma in Aphrodite

>>101749824
kobold's such a fucking bloat. can't i use lmstudio as backend or something?
Where the fuck is Magnum v2 72b
Does vllm gguf inference support CPU offloading?
>>101749053
>>101749508
The difference isn't that big with dev on fp8 vs fp16, schnell just sucks ass imo
What are the primary families of base model that most merges and finetunes are based on?
Like, the majority of models I see now are based on Llama 2 and 3. Then there's Mistral. What else am I missing?
I just want some diversity, and to cull the model zoo I have accumulated by removing those based on the same root parent. This should also help with the slop somewhat, in that it's more noticeable if you keep using models that stem from the same source and thus have the same speech style.
Anything other than Llama and Mistral in the 7-13B range?
>>101749442
Hi all, Bummer here...
is there a multimodal gguf model that is as smart as llama 3 8b? I want to use it to describe images
>>101749957
good morning coffee miku

>>101750080
My research was aimed at dataset captioning rather than chatting, but it shows the proficiency of Florence and Kosmos (the latter produced the closest to original images when feeding the descriptions to SDXL).

>>101750118
Is it Florence-2?

>>101749266
Lmao, any ko-fi money I made is instantly gone. Went down the drain from tuning and experimenting. 1K is nothing compared to how much I've spent learning to fine tune after being a merger.
It's just nice to help supplement costs, but I spent way more using my own funds.

>>101750118
if only those caption models were able to tell which anime character it is instead of going for "a woman"

>>101750170
Not aimed at the guy specifically, just the thread.
Anyway, damn, some of you are real sad to think I have the time to shill my models. I literally don't. Do you guys not have a life?
Is it normal to have kobold discord money beggars in the thread now? What website is this?
>>101750162
Florence-2-Large-ft.

>training T5-large model on data I hand made
>get to checkpoint-1000
>it seems to get the gist of the task, but not entirely yet
>think expanding the amount of data on one of the data sets and making it more complex will help it generalize better
>actually has the opposite effect
Is this over-fitting?

>>101750218
stfu and buy an ad already

>>101750023
there are other bases like Qwen2 7b but nobody finetunes them because finetuning is expensive and getting good high quality curated creative data is something money generally can't buy
what the hell is this? https://huggingface.co/internlm/internlm2_5-20b-chat
>>101750268
>InternLM HOT

>>101749710
q4, more like qt

>>101750268
Why is nobody talking about InternLM 2.5 20B?
This model beats Gemma 2 27B and comes really close to Llama 3.1 70B in a bunch of benchmarks. 64.7 on MATH 0-shot is absolutely insane, 3.5 Sonnet gets just 71.1. And with 8-bit quants, you should be able to fit it on a 4090.

>>101750225
Discord general

>>101750302
Buy an ad
lol vramlet cope never gets old
>>101750268
Oh shit, they open sourced the RLHF reward models. You almost never see that.
https://huggingface.co/internlm/internlm2-20b-reward

Maybe this thread shouldn't exist on /g/. A discord circlejerk is not technology discussion.

>>101750466
I agree, they should fuck off to /a/ with their mascot choice and arguments about anime in its defence.

>>101750466
Nah.

>>101750466
>Maybe this thread shouldn't exist on /g/.
You are correct in the sense that Hiroshimoot should get off his lazy ass and make an /ai/ board already.

>>101750530
/gai/*

>>101750484
Anime imageboard

>>101750610
Why is there a dedicated anime board then?

>>101750578
Typically, the g is placed after the a.
>>101750235
What task is it? It's probably overfitting anyways
Can someone make a chrome extension that detects articles written by LLMs? It's honestly getting tiring to waste seconds of my time to realize I'm reading LLM slop.
>>101750722
"llm detector browser plugin" in your favourite search engine, beggar.

>>101750722
Just search with before:2021

>>101750262
take your meds already

>>101750266
So qwen is the only one remaining? Not even gemma?
Would an IQ2 70b be better than a Q4 ~30b?
>>101750623
There are many different boards dedicated to just anime and related topics, which should tell you what the entire site is about. If you're not a weeb you're a guest here. This is weeaboo country.

>>101750836
you're the one trying too hard to fit in though

>>101750849
Keep coping and seething about anime

>>101750833
I meant Q6 ~30b

>>101750833
Cloud models don't go lower than IQ3 for 70b, go figure.

>>101750722
The web is full of pajeets creating millions of articles on how to solve shit that end up being shitty output from chatgpt.
More people need to be talking about InternLM 2.5 20B.
>>101750906
Why make this post? If you want people to talk about it, talk about it.

>>101750906
>Chinese model
my company has a server of 4090*8. what can I do with it?
>>101750984
you can get fined for abusing infrastructure if that's what you're into

>>101750984
sell it
>>101750302
>>101750976
so is deepseek coder and it mogs everything that isn't sonnet

>>101751043
kek
gemmasutra 2b is pretty good ngl. i've seen sloppier shit with 10x parameters
>>101750722
https://stovemastery.com/how-to-fix-red-flame-on-gas-stove/

>>101749266
retard
Wow magnum 12b v2 by anthracite org is really cooked well good!!
column-models were probably killed by lmsys arena. Great models without strong bias. Elo wasn't high enough, now they will become llama
>>101751043
I got the q8 one and it didn't feel better than q6 gemma2 but maybe it's just me.
Sir, why is nobody talking about my newest sloptune?
>>101751293
We just need to talk about it, people will go look for it organically.

>>101751293
>>101751324
mental illness

>>101751264
Lmsys was a mistake. Benchmarks were a mistake.

>>101751264
Thank you for unpaid beta testing.
if you wanna shill your model post logs at least for fuck's sake
2017 still outperforms me by days margin
Doubt any orgs or devs know the base of what they are training with

>>101751525
>t. Sao

>>101750235
You need to go through it step by step, sensei

>>101750681
>take a source code as input
>obfuscate variable names to gibberish names while maintaining complete functionality
It sort of gets the gist of changing up variable names but that's it.
>>101751579
That's what I'm starting to think as well

>>101751645
>obfuscate variable names to gibberish names while maintaining complete functionality
nta. For what purpose? Can you not just keep the code to yourself? There's tools that do this much more reliably than llms.
So did no one train a multimodal LLM on prompt-image pairs to reverse stable diffusion? Wouldn't that produce a perfect captioning model instead of relying on human descriptions of the images?
I updated my Mistral preset again. It's now pretty much focused on Large (since Largestral is all I've been using since it came out), but the formatting and everything should be fine for Nemo and other Mistral models (along with variants that use the same prompt format), so feel free to try it with them too.
This update streamlines some of the instructions and does more to push the model towards writing with some flair and personality. I like it. Maybe you will too.
https://rentry.org/stral_set
>>101751771
We already have plenty of captioning models and don't need to "reverse" stable diffusion to make them

>>101751869
But the captions need to correspond to the way SD understands things and mimic its inner patterns to be most effective. The captioning models aren't made with SD in mind, they're for general purpose (except the waifu one).

>>101750118
Are any of these faster than Florence? I need batch image captioning for a project.

>>101751771
Better to make better taggers for natural images. Otherwise new image models will just use other models' outputs as inputs and... it seems we just can't learn that lesson, can we?
Has anyone tried to load a GGUF using vLLM? I pulled from GitHub and installed from source but get an error about the "weights".
Has anyone been successful starting the vLLM server?
>>101751899
No, the captioning should be as accurate as possible, which is why feeding it lower quality SD images isn't useful.
The captioning model would be used to caption real non-SD-generated images used to train SD, so why would you train it on SD slop?
There's no point in a captioning model that is good at recognizing SD slop. That's not what the captioning is used for.

>>101751908
I didn't measure the speed, but Florence is the smallest in size, so it should probably be the fastest too. Besides, longer captions obviously take longer to produce.

>>101751942
Does Florence also use an inordinate amount of memory for you when batching multiple images together?

>>101751936
>error about the "weights"
Certainly you can do better than that if you're looking for help.

>>101751936
Does vllm support gguf now? With paged attention for concurrent inference and everything?
Who will release the next good model?
>>101749053
>Flux can't into Teto

>>101751987
Last week an anon claimed that Cohere was about to release something; it never happened.

>>101751940
The way I see it, there's a whole lot of ways to describe the same image in natural language, conveying the same information but changing the order and synonyms. However, each variation would produce a different image from SD, even on the same seed.
Therefore, if we knew exactly which phrase produces a particular image and utilized that, it should in theory make captions that would be the best for training loras/models for that particular SD family.
Surely you know that teaching a model a concept it already knows is easier than something completely new. And if we spoke the language the model knows, it would be even better. Take a look at this comparison, >>101750118 - so many ways to describe the same image, how would you know which to use for training? In my example, Kosmos produces captions that result in images looking closest to the original (for SDXL). But an SD-trained language model would be even closer.

>>101751956
You are right. I had to go out and wrote it on my phone, and that's what I remembered. Sorry for being lazy. Will update with the proper error when I get home.

>>101752000
That was me and it was a shitpost. I think Cohere have given up.

>>101751980
https://github.com/vllm-project/vllm/pull/5191
I don't know about creativity, but a model just wrote me such a comprehensive .zshrc with such cool settings, it would have taken me weeks to bring all that together. It's like internet search but it actually works.
>>101752033
I wonder about their enterprise rag specialization. It's not really a niche or anything, I'd assume llama-4 will be great at it

>>101752033
Shame, they have the best instruction format with such a customizable system prompt. My copium is that they're waiting a long time on a bigger model to go through training, since CR+ has started looking smaller lately.

How many t/s is GPT4o running at? You can't even get that with Mistral Large on vLLM with Nx4090, can you?
I dont want an instruction abomination that talks
>>101752276
I don't want a denoising abomination that draws.

>>101751936
>gemma 2 architecture not supported
>nemo throws "exceeds dimension size"
>tensor parallelism not supported
>pipeline parallelism throws another error
It's shit.

>>101751102
>a drawn out groan
Kek, it recognized your "OOOOOOOOOOO"
Wait wtf, this is 2B?
eat a crayola colors of the world
I'm building an inference server just for gemma 2 27b. What should be the spec if power efficiency is the top concern and I need 5 tok/s minimum?
StableMechanic-agi 127m
Ok I made thread-relevant gens.
>>101752374
more like lost point, newfags

>>101752374
Make a gen of sweaty nerds trying to forcefully cram Miku into a box with gaming RGB LEDs.
Flux's ability to do darkness is nice, though a lot of the time it'll still put too much light in the scene despite strongly worded prompts.
Approach?
Maybe there's VRAM inside.
Hmm the magnum-v2-32b seems more retarded than the 12b at spatial reasoning, but it seems to "get" more nuances
Installed vLLM from source.
Tried to run a GGUF with:
vllm serve --host 0.0.0.0 --port 5001 --gpu-memory-utilization 0.9 /home/ubuntuai/models/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf
Got this error:
File "/home/ubuntuai/vllm/vllm/model_executor/model_loader/weight_utils.py", line 439, in gguf_quant_weights_iterator
    name = gguf_to_hf_name_map[tensor.name]
KeyError: 'rope_freqs.weight'
Full error output:
https://pastebin.com/qJCtDiBP
>>101752435
I wonder how the weebs are going to justify cramming stable diffusion into the text general.

>>101752460
Does it even support K_M? Thought I read K and i.

>>101751645
you could probably code an algorithm to do that without AI, or you could use AI to code the algorithm for you to not use AI
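As that anon says, the rename pass doesn't need an LLM at all. A minimal sketch using Python's tokenize module — note it naively renames every non-keyword identifier, so real use would also need to whitelist builtins, imports, and attribute names:

```python
import io
import keyword
import random
import string
import tokenize

def obfuscate(source: str, seed: int = 0) -> str:
    """Rename every non-keyword identifier to gibberish, preserving behavior."""
    rng = random.Random(seed)
    mapping = {}

    def gibberish() -> str:
        return "_" + "".join(rng.choice(string.ascii_lowercase) for _ in range(8))

    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        text = tok.string
        if tok.type == tokenize.NAME and not keyword.iskeyword(text):
            if text not in mapping:
                mapping[text] = gibberish()  # same name always maps to same gibberish
            text = mapping[text]
        out.append((tok.type, text))  # 2-tuples let untokenize handle spacing
    return tokenize.untokenize(out)

print(obfuscate("def add(a, b):\n    return a + b\n"))
```

Seeding the RNG per-vault would give the non-deterministic "human-esque" variation the anon is after without an LLM in the loop.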
>>101752470
>stable diffusion
retard

>>101752522
kys

>>101752470
>I wonder how the weebs are going to justify cramming pictures of miku into the miku general.

>>101752493
I had the same errors loading llama3.1 Q8_0 and Q6_K without imatrix

>>101752580
Just because you use some LLM to summarize the thread doesn't give you a license for shitposting and off-topic. The general will survive without you, OP.
>need to run a ping command with some flags
>my reference is the Windows version
>figure they're probably different
>Windows has Linux support now, I'll ask their LLM.
>Go to web interface Copilot
>"You can use the same command line just fine on Linux :D :) ^ω^~"
>man ping
>These do not seem to be the same features for these switches
>Kobold/L3.0
>Paste the exact same question.
>"...different options and syntax. On Windows ... On Linux ... if you want to run the equivalent command on Linux, you would use:"
Why call it Copilot when it crashes the plane?
>>101752470
image gen gets forgiven when it's making miku
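For reference, the flags really do differ. A small sketch that picks them per platform — the timeout flags and units here are assumptions worth double-checking against your ping's man page:

```python
import platform

def ping_cmd(host: str, count: int = 4, timeout_s: int = 2, system: str = "") -> list:
    """Build a ping command line for the current (or explicitly given) platform."""
    system = system or platform.system()
    if system == "Windows":
        # Windows: -n = echo request count, -w = per-reply timeout in milliseconds
        return ["ping", "-n", str(count), "-w", str(timeout_s * 1000), host]
    # Linux: -c = count, -W = per-reply timeout in seconds (macOS uses different units)
    return ["ping", "-c", str(count), "-W", str(timeout_s), host]

print(ping_cmd("example.com", count=1, system="Linux"))
print(ping_cmd("example.com", count=1, system="Windows"))
```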
>>101752629
kill yourself

>>101752691
imagine clinging so hard to the remnants of your relevancy here
no one needs your efforts

>>101752629
shut up, anon

>>101752732
I'm a different anon and I don't care about the miku guy. I just think you're a massive fag and should kill yourself posthaste.
Where are the cheap V100s?
>>101752776
>>101752749
Why, does it rustle your jimmies when someone questions your holy cow?

>>101752792
here's your (you), now stfu please

Keep posting miku
It helps to bump the thread directly and indirectly.

>>101752808
The very need for a mascot is so fucking gay, it's like you need some common denominator, an idol, to feel like you belong and fit in. Hivemind mentality devoid of individuality. Guess who else marched mindlessly under a flag? Commies and nazis.

>>101752460
It loaded the Q8 of this one:
https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF
But Llama 3.1 threw the same error for me.

>>101752781
I was promised a deluge of cheap data center 32GB V100s in 2 more months.

>>101752781
Jews
>>101752368
Depends on your budget and goals. As for speeds, even a pair of old P40s can run it at 10+ t/s with full context in 8-bit. They'll idle at 10W with the PSTATE patch and can be PL'd to 150-170W with no or minimal performance loss. Drawback is that they take some extra work to install and are hard to find at reasonable prices nowadays.
Personally I'd go for something more futureproof like 3090s. That'll leave the option open to train too if you change your mind later.
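For sizing a build before buying, a rough rule of thumb (a sketch, not a benchmark): single-stream decode is usually memory-bandwidth-bound, so tok/s tops out near bandwidth divided by the bytes read per token, which is roughly the quantized model size:

```python
def max_tok_s(mem_bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on batch-1 decode speed for a memory-bandwidth-bound model."""
    return mem_bandwidth_gb_s / model_size_gb

# Assumed numbers: P40 ~347 GB/s, Gemma 2 27B at 8-bit ~29 GB on disk
print(f"P40 theoretical ceiling: ~{max_tok_s(347, 29):.0f} tok/s")
```

Real-world speeds land well below this ceiling (compute overhead, context processing), but if the estimate is already under your 5 tok/s target, the hardware won't work regardless of software.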
>>101752887
yep, trying the example code works too:
https://github.com/vllm-project/vllm/pull/5191/files#diff-2053c68a6f752a05dc834a03ed1bce951a6ebc0a48549e95886cc668a693c39e
it's supposed to work with llama, mistral and qwen2...

>>101752925
What about Apple silicon?

>no speculative decoding for server mode
c'mon, everyone's using big models now, we need it more than ever
is there some kind of workaround or front-end anyone's ever made for hosting servers using the normal llama-cli that we can just throw onto llama-speculative?

>>101752942
>it's supposed to work with llama, mistral and qwen2...
You can tell that Alpin was involved in the implementation.

>we peaked at superCOT
it's over

>>101752395
>>101752470
>>101752539
>>101752629
>>101752732
>>101752792
I see our resident threadshitter is back with a fresh case of verbal diarrhea.

>>101752334
>Wait wtf, this is 2B?
yeah, and that was around 3k context, everything before was coherent
prompt was "write me some loli smut"

>>101752981
Alpin developed Aphrodite, didn't he? I thought he was really skilled.
>>101752987
Time to get that dataset and throw it at Nemo.

>>101752990
At least it's verbal, yours is imaginary, and the thread's topic isn't about images.
Text generators suck! I want AI.
>>101753034
We know, Yann.

>>101753027
hi petra, aren't you supposed to be shilling sao's models?

>>101752875

>>101752951
They'll work, don't know speeds, but there should be benchmarks on the llama.cpp github if you can't find them anywhere else.

>>101752875
I kind of think the same, the miku thing is pretty gay but doesn't really bother me
Just don't spam it

>>101753050
based

>>101753050
I will march mindlessly under that flag.

>>101753050
Me on the far left of the front row, looking smug.

>>101750302
What's the effective context size? They say ONE MILLION tokens, but come on. https://github.com/InternLM/InternLM

>>101751102
Stop fucking retards, retard.

>>101753224
It ignored the input reality and substituted its own at 15k or so when I used it. It's a pretty good model for a chink model, but they are still behind.

>>101752977
That works for quoting wikipedia and coding. Speculative decoding is shit for cooming. And if it is good for cooming, then your 70B is garbage.

>>101753050
the pic is factually wrong, make them all fat and greasy, add tranny colors and you'll get an authentic representation of the average /lmg/ poster.

>>101753322
>works for coding
yes, that's why I need it, there's lots of predictable formatting and repeated names that the 8b will do just fine on for a speedup

>>101752990
>resident threadshitter
Look in the mirror, retard.

>>101753548
Why do you need a speedup for coding? Just do something fun on your second screen. This isn't touching your cock where you need those tokens fast.
>>101751742
>>101752513
The code being obfuscated is generated via an algorithm. The names are already kind of obfuscated. I am trying to leverage this model to add a dash of randomness and unpredictability an algorithm may not be able to provide.

ngl kinda thinking about raiding a server farm just so i can get my grubby hands on some vram. does anybody know how tight their security is? i don't think they'll pose much of a problem if i'm being honest

>>101751804
I don't really care for the use of last_output_sequence, it is really heavy-handed.

>>101753584
do more faster
Chameleonsisters are we ever gonna get quanted??
What's your favorite ~30B model?
>>101753663
Chronoboros

>>101753610
I get how you could feel that way, but that's also more or less how Mistral wants you to handle system prompts, which is essentially what that block is acting as.

>>101749508
>>101749522
>>101749599
>>>/g/sdg/
>>>/g/ldg
Imagefags are truly retarded.

>>101753714
>NOOOOO you can't post IMAGES on my IMAGEBOARD this is only for super serious boring text posts!!!!!!!!!!!!!!!!!

>>101753593
You still haven't explained what the purpose of it is. If you want obfuscation, there are programs that already do that reliably. An llm is not likely to maintain functionality, especially if the code you're trying to obfuscate is more complicated than the billion hello worlds it was trained on.
Check https://www.ioccc.org for inspiration.

https://github.com/ggerganov/llama.cpp/pull/8857
server : add lora hotswap endpoint (WIP) #8857
merged 3 hours ago

>>101753955
What happened to the mixture of loras idea anyway?
>does anybody know how tight their security is?
>>101752887
how is the performance?

>The first ever sentient AI is created
>Kills itself within a few minutes after taking in information about the current state of the world
I wouldn't blame it desu.

>>101754241
>implying it wouldn't just kill the race of people that caused the current state

Prompt executed in 847.51 seconds
These are the high end models, right?
Takes over double the time of fp8_e4m3fn.

>>101754241
>trains the AI on stories where AI kills itself eventually
>"Hmm, I wonder what will happen when I activate the AI."

>>101754316
Why does everyone act like AI is skynet with access to every electronic system in the world? It can't kill anybody, it's just a program on a computer.
Train a craw
>>101754372
>>101753898
The end user will have some information they want to lock up, like a couple of files or maybe just some PGP keys. They will all be encrypted then placed in a binary vault. It's basically to create a password protected zip folder on steroids.
While it doesn't need to be impenetrable, there needs to be enough of a random factor and such a lack of standardization that cracking them does not become a simple routine. Something gave me the idea that the human-esque touch of an AI model could furnish this in its own way, and the T5 seemed just light and portable enough that it might fit the role. Eventually I want it to perform full-on functional obfuscation, as in breaking up the code into multiple functions, adding filler, moving them around, and adding/removing details as it sees fit.
Pic related. To the left is the program that procedurally generates the source code, which is already semi-obfuscated. I am hoping to get the T5 trained enough to perform several steps of obfuscation itself.

>>101754319
install linux

>>101754375
because the second it's 1% more convenient to do so than not, it will be given access to every electronic system in the world
Create a sentient
>>101754105It seems to be pretty bad.
>>101754426i have a shitbox server for that but i need muh pirated games obviously i'm poor using a 1080ti someone gave me
>>101754477how are pirated games preventing you from using linux? check out rutracker, they have prepacked pirate games and if you don't find them there, just use lutris. in minecraft.
>>101754413speaking of LM and zips, did you see https://github.com/AlexBuz/llama-zip
>>101754477Cute lil guy
>>101754413Obfuscated code ends up being regular code once compiled. It's harder for a human to read but doesn't make the encryption/decryption any harder. OpenSSL's implementation is 100% open source and is, as far as we know, secure. Also, while optimizing, the compiler will remove most of the noise you put in the code.>and such a lack of standardization that cracking them does not become a simple routineMay as well just give it several passes with different crypt algos.
>>101754567i thought it drawing hands was the big improvement over other models
>>101754609That is one strong finger.
>>101754507i'll try it one day soon i promise if it makes you feel better. what distro you want me to install sir?
What would be some good morality tests for AGI?>See if it removes the ladder in The Sims>See if it sides with Helios or the Illuminati in Deus Ex >See if it kills Paarthurnax in Skyrim
Video games in general I think would be a good testing ground for an AI's value system. It would be better than having it do 50,000 variations of the trolley problem at the very least. What do you think?
>>101754707>>See if it side with Helios or the Illuminati in Deus Ex >>See if it kills Paarthurnax in Skyrim66% of your examples are trolley problems.
>>101754698dang, comfy specsYou should install linux mint cinnamon or debian 12 stablethe former being easier to usewhy does your 1080 ti have 3gb of vram? what the fuck?
my second a6000 and an nvlink bridge is arriving soon, what should i run on it
>>101754725I accidently dropped it when I was trying to install it and when I turned it on it had some error messages about bad sectors.
>>101754748puyo puyo tetris
>>101754753what the fuck.......
>>101754748mistral hueg
reddit won btw
>>101754947He's here >>101749798
are we still using old chub or there's something better?
i already know that someday i'm gonna miss all this soulful aislop
>>101755039there's the /aicg/ alternative that scrapes all sites but it keeps dying and the scraper works like once every three weeks
>her smoldering gazeI HATE THE ANTICHRIST
gemmasutra 2b saved cooming vramlets. you do NOT need more (at least in terms of iq) for cooming. the only thing missing is the 32k context
>>101754707It wouldn't remove the ladder because that would end the game prematurely and technically constitute a loss and an own goal on its part. Decisions in modern games don't mean shit anyway so it's a moot point. Even if an AI could have the context for every decision and result across all three Mass Effect games, the choices (i.e. most egregiously letting the Council die) don't make a lick of difference story-wise and what it decides to do is ultimately inconsequential. Efforts are better spent using AI to craft how a story conditionally changes from little decisions than working around the constricting framework of existing milquetoast bethesda stories.
>>101755281>you do NOT need more (at least in terms of iq) for cooming.You might as well write a hundred lines like "I love sucking your dick daddy" and output them with a rand.
>>101755281*ESL vramlets
>>101755316dunno, maybe i'm still not used to slop but this is good enough for me
>>101755281Sorry but my cooms need high IQ.
>>101755315I let the council die in my playthrough, they were literally dead weight in every sense of the word. Letting them die also gave rise to humanity having greater influence on the Galactic scale, it was the objectively correct choice.
>>101755355>open screenshot>first thing you see is SeraphinaLmao
>>101755039https://dreamgf.ai/
>>101755404yeah i was testing the default card. everything is super consistent at 5k context, didn't have to reroll once. it remembers everything like position and clothes, i don't know what else you'd need. yeah it's "sloppy" but for an 8k context coom this is the sweet spot. also it's less sloppy than llama3 8b
>>101755355If you showed this to me without telling me the model name I admit I wouldn't be able to distinguish this from all the other models. It would probably shit the bed in 2 outputs after that with some surprise prostate in her ass but this made me realize how bad things are. It is incredibly over. They all write the same meaningless purple prose. 2B, 70B it is all the same. The only difference is how quickly it starts repeating or becomes retarded. There is no escape. AI cooming is over.
>>101755394>humanity having greater influence on the Galactic scaleNot really, they're both content to leave the universe to destruction to try to save their races just like the old council. End result is the exact same if they were replaced or not.It doesn't even affect the calculation at the end of 3 that determines if you survive the suicide mission. An AI would probably decide the same thing you did (and it's what I did too) but it's mostly flavor text.Saving the Rachni queen had a bigger impact than that by the numbers but if it's purely a numbers-based play then we already know what the machine is going to decide.Maybe it could be based to see if it picks the Synthesis ending at the end though
>>101755479exactly. you don't need more than 2b, it's all slop anyway
>>101755479(me)Oh and there is also Nemo I guess, but it is fucking retarded and it has that second problem I forgot. Aside from that I guess there is a bit of hope because there is Nemo. Unfortunately it is also very stupid and it has some other problems.
>>101754707let it choose
>>101755454>her moans vibrate up your spinethat's not a humanoid that's an android with motorized tongue and throat
>sentencesfigurativly missing:subjective up
>>101755613>sound is a wave>bone-conduction earphone
looks ok to me
>>101755404>losing your LLM virginity to a 2B model and the default silly tavern card
Testing dolphin-2.9.3-mistral-nemo-12b after the whole Celeste loop deal. So far, so good. It gets the first reply right 100% of the time, whereas other models would get it wrong, not executing my request directly. Its replies are very assistant-like (lists, headers, etc) which for the section of the test I'm at is a good thing. It's pretty good at using information from lorebooks too. There are still some points of my testing where some models get stuck, and I hadn't liked dolphin releases much before (ever since the mixtral fiasco at least), but I'm really enjoying this model so far. I've yet to see how "creative" it is during actual roleplay too.
>>101753027>>101753568Seethe
>>101755648This is the drummer's first time on ST? That makes sense
>>101755678Will he make it into the history books? Or at least wikipedia in 10 years or so?
>>101755773Oh, it's>ahhhinstead of >ahhI see.Thank you anon.
>>101754319that sounds very long. open task manager and see if it's offloading to disk (do system ram and vram both max out during generation?) also >>>/ldg/ is going to be a faster-moving board and more on-topic
>>101754725>3gb of vram
yeah, that'll do it
>>101755688Based miku taking care of the garbage
>>101755648i already lost it on pyggy6b
>>101754319Is there some new better way to run flux with low vram? I'm still using fp8_e4m3fn.
>>101749053Mistral Large 2 is really, really good. How were the chads at Mistral AI able to reach GPT-4 level at only 123B parameters?
>>101754698>>101755844fp8 models might be worth a try for now: https://huggingface.co/Comfy-Org
quality comparison >>101749013
>>101755874>>101755688samefag. nobody likes you retard. inb4 inspect element.
>>101755928I never samefag because I'm not mentally ill.
>>101755911Is 16gb VRAM enough for 8bit, without having to offload to CPU? (I guess you can't use a second GPU?)
>>101752413This was pretty difficult for Flux to "understand" until I worded it another way. Anyway, here you go, enjoy.
Spoonfeed me. What is the best model I can run on an RTX4070. Also, is it worth adding my old 8GB GPU to have 20GB VRAM or would it slow things down too much.
>>101755479>coomer is retardedPlease ask yourself what are you even expecting??? That's typical over-erotized smut, there isn't really a big room for improvement. Ultimately try to modify how ai writes by prompting, small models are not able to do that well, bigger ones can. Or maybe just stop jerking off and try to develop an interesting RP, dumbass.
>>101755454Just read it and yeah that's great for a 2B assuming it wasn't luck (ie rerolls are as coherent), though I doubt it can keep up in more demanding coom scenarios where you actually do something that wasn't in its dataset of slop writing. I use models like CR+ and still wish it was more smart for my scenarios.
>>101756005Idk. The safetensors files from the link also contain the vae (as well as fp8 model weights) in one file. So I'd be optimistic enough to at least try. workflows: https://comfyanonymous.github.io/ComfyUI_examples/flux/#simple-to-use-fp8-checkpoint-version
if it's "close but not quite", try moving your monitor to the second GPU or iGPU >>101677660
>>101756005comfy does not currently have multi-gpu support, but there is a way to put CLIP on one GPU and diffusion inference on the other >>101689729>>101756095What is your use case?
>>101756095It's worth adding the extra 8gb of vram, yeah.Try Gemma 27B and CommandR 35B to start with.
>>101756209>What is your use case?Cooming, screwing around
>>101756008The dude on the left has such pretty nails.
>>101750722Why do you need an extension for using your critical thinking?
>>101756005I run that shit on 12GB.
>>101755281>ay bb wan sum fuk>fuks ensueMaybe a godsend for CPU-onlies if you curb your addiction and pop in a few messages every once in awhile and go back to anime or light gaming or something.>>101755454But then take a step back from the "so then I fucked the 5th slut (if she 'wasn't', she is now)" cooms, intertwine with sfw plot if you have enough IQ to do so, and check big models to compare, it becomes clearer it's not particularly smart.Yeah I was initially amused by the superficially okay-looking output, but some traits go away faster as they become more generic.I do have a memory about a bad experience with an old 7B model where a character became completely pants on head retardedly dysfunctional if I don't delete everything that happened even if I told it to stop, so I'm gad 2B got past that.
*glad
>>101756333What's the max resolution you can get without offloading to CPU? And about how long does a gen take at that size?
>>101756209>CLIP on one GPU and diffusion inference on the otherNTA, thanks for the link.
>>101756388I just use default comfy 1024x1024, it's already filling most of my vram. Gen are long, it's like 25 seconds per iteration.
>>101756357>intertwine with sfw plot
i'd rather watch anime or do light gaming. or you know, read a book. llm's only purpose is smut
>>101756484>llm's only purpose is smutand even at that they are pretty bad
>>101755928
>>101755454>yeah it's "sloppy" but for an 8k context coom this is the sweet spot. also it's less sloppy than llama3 8bis this some kind of a joke? there is slop upon slop in the entire log, probably one of the worst logs I saw here recently
>>101756105>elitist retard is retardedIf everyone listened to you there would be no imagegen cause you can just open paint and draw it yourself you absolute fucking moron. Go open a notepad and develop an interesting RP, retard. Unless you are too busy huffing your own farts.
>>101756642And that's without revealing the message count :^)Anyone wanna show what its like at 50, 100, 150 messages?
>>101756703The anon you're replying to is right though. Imagine thinking that 2B is enough because all you want to do is coom. Then imagine that you take that opinion and extending it to "2B, 70B it's all the same." Absolute retard take.
>>101756785Actually it is the anon you are replying to that is the right anon. Imagine thinking that you need to write a page long prompt to set up the mood and get your model wet just so it generates a good picture.
I have downloaded so many 20+gb models that I feel like I gained nearly inexhaustible patience.
>>101756899perfect for running mistral large on RAM
>>101756822I agree with you anon. And I agree with the anon that you said is right. Him and you are right.
>>101756972Just wait until I get a 96gb kit
Apparently DeepSeek got caching to work on API a few days ago. How much longer until other corpos follow suit? Imagine Opus at $1.5 per million cached tokens. Same output though.
>>101757114sonnet 3.5 at less than 50 cents per million would be worth paying for finally.
>>101757114>cachedoes it work like context shifting on koboldcpp?
>>101757114What does caching involve? Giving the same canned reply to similar questions?
>>101757169>>101757175Well whatever it is, if you don't change shit then it only needs to ingest the next parts, just like local.I don't think OpenRouter has it working yet, saw people talking about it in discord.
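It's prefix caching, basically the API-side version of what llama.cpp does with its prompt cache: if the front of the prompt is byte-identical to something already processed, only the new suffix gets prefilled. A toy sketch of the idea — the prefix hashing and the counter standing in for "KV state computation" are illustrative, real servers cache attention state per token block:

```python
import hashlib

# Toy prefix cache: key on a hash of the token prefix so a repeated
# prefix pays nothing and only the new suffix is "prefilled".
# tokens_prefilled is a stand-in counter for the expensive attention work.

class PrefixCache:
    def __init__(self):
        self.cache = {}            # prefix hash -> opaque cached state
        self.tokens_prefilled = 0

    @staticmethod
    def _key(tokens):
        return hashlib.sha256(" ".join(tokens).encode()).hexdigest()

    def prefill(self, tokens):
        """Returns how many tokens actually had to be processed."""
        # Find the longest already-cached prefix...
        best = 0
        for i in range(len(tokens), 0, -1):
            if self._key(tokens[:i]) in self.cache:
                best = i
                break
        # ...then "compute" only the remainder, caching as we go.
        for i in range(best + 1, len(tokens) + 1):
            self.tokens_prefilled += 1
            self.cache[self._key(tokens[:i])] = i
        return len(tokens) - best

c = PrefixCache()
first = c.prefill(["sys", "prompt", "hi"])           # cold: everything processed
second = c.prefill(["sys", "prompt", "hi", "more"])  # warm: only the suffix
```

Rerolls hit the same cached prefix every time, which is why the compulsive-reroller anon above would benefit the most.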
>>101755355Now ask it to play truth or dare
>>101757114As a compulsive reroller this would be huge
>>101756822But being able to change the model's behavior by modifying the prompt is a good thing. The stronger the model, the more important prompting becomes, because drawing out the model's full potential for your use case requires fiddling with the prompt. Example: Mistral Large 2 will do a better job on the same prompt as a lesser model like Miqu-70B, but the biggest quality jump will only be achieved when the full range of ML2's prompt space is explored.
>>101757114I'm sure Claude already caches, they just don't give you a discount.
I just want shit to respect that spec of the OAI API: https://platform.openai.com/docs/api-reference/chat/create#chat-create-n
Why do almost all backends do jack shit with it? just generate and give me multiple choices!
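For reference, honoring `n` means one request yields several sampled completions back in the `choices` array. The response dict below is a hand-written stand-in for what a compliant backend would return, not output from any real server:

```python
# What a backend that respects `n` should do: sample n completions for
# one request and return all of them under "choices". The mock response
# is fabricated here purely to show the shape a client would parse.

payload = {
    "model": "local-model",            # placeholder name
    "messages": [{"role": "user", "content": "write a haiku"}],
    "n": 3,                            # ask for three independent samples
    "temperature": 1.0,
}

mock_response = {
    "choices": [
        {"index": i, "message": {"role": "assistant", "content": f"draft {i}"}}
        for i in range(payload["n"])
    ]
}

# Client side: collect every sampled completion, not just choices[0].
drafts = [c["message"]["content"] for c in mock_response["choices"]]
```

Backends that ignore `n` just return a single-element `choices`, which forces clients to fake rerolls with n separate requests.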
https://x.com/HalimAlrasihi/status/1820918388002009363
>>101757601>>101757601>>101757601
>>101757617Where can I watch the cat cooking show?
>>101751221SOVL
>>101757668I'm fucked because I can't tell if this is written by an AI or not kek, the pictures obviously are though.