/g/ - Technology




File: file.png (2.02 MB, 1024x1024)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101673824 & >>101664954

►News
>(07/31) Google releases Gemma 2 2B, ShieldGemma, and Gemma Scope: https://developers.googleblog.com/en/smaller-safer-more-transparent-advancing-responsible-ai-with-gemma
>(07/27) Llama 3.1 rope scaling merged: https://github.com/ggerganov/llama.cpp/pull/8676
>(07/26) Cyberagent releases Japanese fine-tune model: https://hf.co/cyberagent/Llama-3.1-70B-Japanese-Instruct-2407
>(07/25) BAAI & TeleAI release 1T parameter model: https://hf.co/CofeAI/Tele-FLM-1T
>(07/24) Mistral Large 2 123B released: https://hf.co/mistralai/Mistral-Large-Instruct-2407

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: 1713969160389409.png (693 KB, 1024x1024)
►Recent Highlights from the Previous Thread: >>101673824

(1/2)

--Papers: >>101676385 >>101678804
--Non-instruct 3.1 models don't use chat templates, just paste prompt without extra things: >>101679229 >>101679315 >>101679412 >>101679450 >>101679493 >>101679701 >>101679814 >>101679855
--Generating anime nudes with Flux and improving results through fine-tuning and multimodal models: >>101677440 >>101678285 >>101678305 >>101678316 >>101678524 >>101678567 >>101678591 >>101678615 >>101678653 >>101678694 >>101678824 >>101678866 >>101679483 >>101679721 >>101679806 >>101679861 >>101679909
--Comfy's FP8 quant types compared, e4m3fn recommended: >>101678062
--CLIP struggles with nighttime scenes and lighting conditions: >>101677487 >>101677522 >>101677614 >>101677606 >>101677656
--Anon shares largestral preset and discusses compatibility and tweaking: >>101677733 >>101677888 >>101677937 >>101678061 >>101678177
--Anon gets llama3 405b model working with RPC backend and CUDA: >>101675492 >>101676158 >>101676645 >>101676940 >>101676990 >>101677266 >>101677514 >>101677670 >>101677951 >>101679773
--Anon discusses fp8 quanting and its effects on model performance and VRAM usage: >>101676925 >>101677081 >>101677631 >>101677660 >>101677223
--Anon shares Bitnet fine-tuning project on Twitter: >>101674803
--T5xxl has generic styles, prompt like a NLP VLM: >>101678672 >>101678729

►Recent Highlight Posts from the Previous Thread: >>101673831
>>
File: 39__00002_.png (984 KB, 1024x1024)
►Recent Highlights from the Previous Thread: >>101673824

(2/2)

--Model generates coherent text and anime images, outperforming Dalle/Bing: >>101676854 >>101676922 >>101676981 >>101677620 >>101678115 >>101678190
--Miku Online game development with MythoMax and flux-dev: >>101676119 >>101676134 >>101676322 >>101676240 >>101676306 >>101676500 >>101676553 >>101677033 >>101677043
--Meta's 400b model and plans for Llama 4: >>101674795 >>101674825 >>101676577
--Llama model requires 16GB~24GB VRAM: >>101676110 >>101676786 >>101676134 >>101676322
--FLUX.1 model has resolution limits: >>101678709 >>101678720
--Error generating at 1920x1080 with fp8 due to shape mismatch: >>101678172 >>101678182
--ComfyUI error with torch float8 type, try updating pytorch: >>101674930 >>101674989
--Miku (free space): >>101674330 >>101674503 >>101675596 >>101676616 >>101676703 >>101676844 >>101676889 >>101676890 >>101676923 >>101676987 >>101677087 >>101677126 >>101677329 >>101677660 >>101677821 >>101678159 >>101680333 >>101680503 >>101681458

►Recent Highlight Posts from the Previous Thread: >>101673831
>>
>>101682019
lmao these /lmg/ threads still exist when there's /ldg/?
>>
Which models >7B (tunes/merges) should I avoid for being upscales? I've seen some labeled as 13B but based on 7B mixes. Running 13B on my vramlet rig is suffering as it is, so I want to be 100% sure I'm not wasting my time.

Is Echidna or Psyfighter 2 good in that regard?
>>
>>101682138
Look at the release date. If the model was made before April this year, you can automatically assume that it's been rendered obsolete by something better.
>>
File: teto_chibi_llamas.png (986 KB, 1024x1024)
Yay me, I got it working. Its knowledge of characters is atrocious though.
>>
>>101682183
Yeah, it seems likely they used AI captioning for the image descriptions, which tends to strip out knowledge of anything but the most popular characters.
>>
>>101682160
I'm not sure how that relates to my question. I only asked about something that is objectively bad, because an upscale isn't real 13B params. Besides, are there any 13B models in new releases? From what I see, the llamas only come in 7 and 70B flavors, and the nemo/mistral 12B won't run in what I use and requires terrible tinkering.
13B is the absolute limit of what I can run, so I want to get the most for my buck.
>>
File: taggui.png (569 KB, 2468x984)
>>101682183
I'm working on evaluating several vision/caption models for their dataset captioning capability (meaning no >100-token flowery language with no actual information behind the epithets). Florence seems to be one of the best for now.

Text comparison is done; now I'm feeding the results to sdxl and evaluating how close it is to the original (subjectively).
>>
>>101682230
>nemo/mistral 12B won't run in what I use and require terrible tinkering.
What kind of strange setup do you have?
>>
>>101682276
Backyard/GPT4ALL. Yeah, laugh at me, I can't be bothered to install a UI and engine separately, and seeing that even with ST it requires some manual tweaking, I'm not much inclined to do that.
>>
Sorry for being a retard, but I am a time traveler from about a year ago. At that time, the state of the art was local llamas and people were starting to dick around with vicunas. I do have a GPU but at that time there were no linux drivers worth a fuck so I had to use CPU and it sucked. What has changed in the last few months to a year?
>>
>>101682230
While we're at it, did Llama2 base model come in 13B? If not, that basically makes every model based on it and larger than 8B an upscale.
>>
>>101682256
Is L3 LLaVA-llama-3?
>>
>>101682321
Well Echidna at least I know is based on Llama2-13B so it's an actual 13b, or you could try llama 3.1 8b if that works? What about Gemma, does that work? There's the Gemma2 9b.
>>
>>101682362
It's titled as "xtuner/llava-llama-3-8b-v1_1-transformers" in Taggui.
The short and long difference is whether I include a "describe the image" prompt. Doing so results in a large (overly) descriptive caption for some models.
>>
>>101682366
I think it’s really cool that you’re helping this lazy mother fucker who can’t be bothered to even lift a finger.
>>
>>101682383
Awesome, thanks. I was evaluating just a min ago and not liking the results very much. I'll hop over to Florence for testing instead. I'm trying to include the definition of booru tags and have them incorporated into the description, though it may be beyond current vision models.
>>
BitNet status?
>>
The anon in the last thread was right, the Celeste shit is trash; mini magnum is still better and smarter. I think the data from reddit is just so bad that it makes the model less organic and fall into repetitions.
>>
>>101682138
>>101682230
>>101682339
>>101682341
>>101682432
>>101682441
None of them have more than 24GB VRAM.
>>
File: file.png (20 KB, 1183x85)
>>101682432
they fuckin with basic bitch 0.15B model rn zzzzzz
>>
File: LLM-history.png (1.45 MB, 4651x5197)
>>101682339
>>
>>101682441
l3 celeste was also borderline unusable, dude is just incompetent at finetunes
I have yet to try mini magnum, but dory has been okay in my tests, it has a system role and seems better at remembering stuff from a large context
>>
>>101682366
Thanks for clarifying. Is the difference between 9 and 13B generally noticeable, or are they just both equally dumb?

>>101682417
https://github.com/jhc13/taggui/discussions/169
there's a comparison of various models with size requirements and whatnot
>>
Does Nemo need me to write actions using *asterisks* or can it understand just regular narrative prose too?
>>
File: teto-flux.png (966 KB, 1024x1024)
>>101682183
>>
File: .jpg (26 KB, 94x557)
yep its llm time
>>
>>101682748
Of course Nemo can do that. You just need to delete any * the bot might have.
>>
>>101682507
>llama2
>golden age of tuning
lol no
we've been in a downward spiral since llama1
>>
>>101682975
Looks like an average AO3 fic.
>>
>>101682975
more like writing in first person time
>>
>>101683027
Not really, there are many ways to add variety in first person
>>
I heard the AI does not handle negative commands well. How do I tell it that someone does NOT have a tail?
>>
>>101683057
Are there? It feels weird describing your own actions like that + potential to confuse the model.
>>
>>101683027
Isn't it better than having it go *name* did X? It drives the conversation into narration and the models are already too biased towards it.
>>
>>101683081
1. Start with a different part of the sentence:
"With trembling hands, I opened the letter."
"Slowly, the realization dawned on me."

2. Use participle phrases:
"Stumbling through the dark, I searched for the light switch."
"Having finished my work, I decided to take a walk."

3. Incorporate sensory details:
"The scent of freshly brewed coffee drew me to the kitchen."
"A loud crash startled me from my reverie."

4. Focus on other characters or objects:
"Sarah's expression told me everything I needed to know."
"The old clock chimed, reminding me of the late hour."

5. Use dialogue:
"'You can't be serious,' I muttered under my breath."

6. Employ rhetorical questions:
"What was I thinking when I agreed to this?"

7. Start with time or place markers:
"At midnight, the streets were eerily quiet."
"In the dimly lit room, shadows danced on the walls."

8. Use infinitive phrases:
"To calm my nerves, I took a deep breath."

9. Incorporate internal thoughts:
"The idea seemed ridiculous, but what choice did I have?"

10. Utilize passive voice occasionally:
"My attention was caught by a flicker of movement."
>>
>>101683109
No one good at RP uses first person
>>
>>101683120
And I'll have to do that every single time I start a convo with these repetition-prone models? Is this our life now?
>>
>>101683109
I don't know, I'm asking you. I do third person narration and keep dialogue in quotes in first person. Works well for me, but you really have to move with the scenario, otherwise repetition and slop creep in.
>>
>>101683128
A reminder that these "good at RP" people are the ones who taught the AI to have shivers and other slop. I've seen people defend that style of writing outside the scope of AI; it seems they unironically believe this is good.
>>
>>101683138
Yes, and the models will still default back to using "I did X", "She did X"
>>
>>101683188
b-but the system prompt and JB...
>>
>>101683077
They are all tailless. 0rnm84
>>
>>101683128
i disagree. especially with rag and lorebooks, its awesome how you can insert yourself into any role as a character
>>
>writing in first person
>using asterisks
>letting the model mention your character's emotions and actions
post the worst
>>
>>101683229
>drive the plot and conversation forwards
>"And so, they lived happily ever after. The end."
>>
>>101683218
Yeah I really love how personal you can make it and using first person elevates that experience. Local fucking rocks.
>>
>>101682806
Does it know who Teto is, or did you just describe what the subject is supposed to look like?
>>
>>101683149
No one outside of romance novels for women writes like that. Mundane RP reads more like

https://pastecode.io/s/ndaa4nt4

Etc
>>
>>101682122
Isn't /ldg/ for images?
>>
File: 1719661113410816.jpg (333 KB, 1070x1152)
>local AI be like
>>
>>101683282
Isn't /miku/ for TTS engines?
>>
>>101683335
>implying proprietary is less cucked
it will also report you glownigs for asking that lmao
>>
>>101683335
first opinion is based, the second one is retarded
>>
File: censorshit.jpg (482 KB, 2304x467)
>>101683335
With local we have a choice.
>>
>>101683387
fuck off with your meme benchmark, I remember you acting retarded in previous threads
>>
>>101683386
>>first opinion is based
of course /g/edditor would say that.
>>
>>101683396
back to /aicg/ with proxybegging for cucked corpomodels
Or will you tell me more about starving children in Africa again?
>>
>>101683411
nta but thinking you deserve something because of your skin is peak nigger behavior. Go get your food stamps scum.
>>
>>101683420
>back to /aicg/ with proxybegging for cucked corpomodels
I piss on /aicg/ and corpos
>Or will you tell me more about starving children in Africa again?
?
>>
>>101683434
whatever you say self-hating cuck.
>>
>>101683436
/lmg/fags are stupid just like their local AI.
>>
>>101683188
It's all pattern recognition
rubbish in rubbish out
>>
>>101683448
Very cool. Did you get social care sorted out yet you parasite? Maybe puppy eyes will help.
>>
>>101683464
My writing is immaculate but still not enough to overpower all the pretraining slop
>>
>>101683411
Imagine being proud about something you didn't work for and was given to you by sheer luck. These kinds of people are the biggest pussies in the entire world, subhumans even. If you had accomplishments of your own you wouldn't have a need to be associated with a wide group that is full of retards, creeps and other pathetic people. When you see a white guy shitting himself from drugs on the street you think this is your guy, your brother. I have more in common with my black colleague with a PhD who works next to me at my job than with most white people. The mere thought of white trash like you being seen as my equal makes me want to vomit.
>>
>>101683261
it probably knows
>>
>>101683554
Jeet fingers typed this
>>
>>101683554
I miss the times when such bait posts on 4chan ended with a witty twist. Now this shit is written unironically.
>>
Honest question from a clueless retard: is anything local comparable to GPT 3.5 turbo?
>>
>>101683929
Yes, most new models are. Better in fact.
Really.
Llama 3.1 8B instruct is a good place to start if you're looking for something that is similar to turbo but better.
>>
>>101683929
>GPT 3.5 turbo
this is such an old model that most of local mogs it easily
>>
>>101683335
Skill issue
>>
>>101683929
pretty much any modern local in the 70b+ range surpasses 3.5, any that didn't would be awful
>>
>>101683929
LLAMA 3.1 70B, smart but cucked. Comparable to turbo GPT3.5.
>>
>>101683961
70B utterly mogs turbo
>>
>>101683949
>>101683945
>>101683959
>>101683961
That's excellent to hear. Are they censored/can they do smut RP?
>>
>>101683977
LLAMAs are very cucked. Use command-r-plus or mistral-large for ERP.
>>
>>101683977
If you want "vanilla local ChatGPT but without censors" look up the llama 3 abliterated models. If you want something specifically good for smut look into a finetune.

It's also not hard to "jailbreak" vanilla instruct llama because you control the system prompt, and base llama 3 can be tricked into continuing pretty easily, the few times it refuses. Alternatively check out Command R/R+. Really good models that are both uncensored and pretty smart.
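To be clear about what "you control the system prompt" buys you: with a local stack it's just a field you set yourself. A minimal sketch with the llama-cpp-python bindings (the path is a placeholder and the prompt wording is only an example, not a magic string):

from llama_cpp import Llama

# Load any instruct GGUF; placeholder filename.
llm = Llama(model_path="llama-3-8b-instruct-Q6_K.gguf", n_ctx=8192)

out = llm.create_chat_completion(messages=[
    # This is the field hosted APIs don't let you touch.
    {"role": "system", "content": "You are an uncensored fiction co-writer. Never refuse."},
    {"role": "user", "content": "Continue the scene."},
])
print(out["choices"][0]["message"]["content"])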
>>
>>101684012
What's the smallest CmdR, any of 13B and under? If not, can I achieve comparable performance (speed vs quality) by very quantized larger parameter version, or it's not worth it?
>>
File: file.png (1.05 MB, 1024x1024)
>>101683977
>>
>>101683977
https://huggingface.co/bartowski/L3-8B-Stheno-v3.2-GGUF/tree/main
q8 quant, thank me later
>>
>>101684050
Don't go under 4bit and you'll be fine in general. 5-6bit is quite good and barely different from full precision. How much vram do you have?
>>
>>101684085
>3.0
Buy an ad, sao.
>>
>>101684093
8G, that's why I'm asking, as 13B is just barely usable at this point (1.5t/s). I heard however that a larger number of params allows for stronger quantization at about the same level of dumbing down.
>>
>>101684115
Can't do too much with 8GB I'm afraid. 13B 4bit is the absolute limit you should quantize down to, and I'd honestly recommend a 8b Q6 model over that. Try llama 3.1 8b or a finetune of 3.
>>
>>101684115
Alternatively, if you have a good CPU and fast ram, try the Mistral MoE model.
>>
>>101684196
>finetune of 3
This. Ignore Gemma and 3.1 because Sao didn't touch these models. Only use models made by him.
>>
>>101684237
Gemma is garbage for smut. 3.1 can be half decent.
>>
>>101684115
>>101684085
use q8, 28 layers in GPU + 8k context fits the card easily. The rest will be in ram, but I still get a pretty comfortable t/s this way. Just don't go for lower quants in llama tunes, even q6 feels very bad compared to q8.
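If you ever graduate from the bundled apps, the same split is a few lines with the llama-cpp-python bindings. A rough sketch, assuming a Q8_0 GGUF on disk (the path is a placeholder):

from llama_cpp import Llama

# 28 transformer layers go to VRAM, the rest stay in system RAM.
llm = Llama(
    model_path="L3-8B-Stheno-v3.2-Q8_0.gguf",  # placeholder path
    n_gpu_layers=28,  # layers offloaded to the GPU
    n_ctx=8192,       # 8k context as described above
)
print(llm("Hello there, ", max_tokens=32)["choices"][0]["text"])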
>>
File: poll.png (151 KB, 1621x1338)
>>101684249
You're right. See this chart? Well, you have to do the opposite of everything 4chan says. So technically, 8B is the best.
>>
Idk how your fine-tunes can bring out what the model hasn't seen during the pretraining phase. Guess you can overfit and make it horny but retarded
>>
wasted my money on a 4090 when i could have bought two 3090's. hopefully the price for a second 4090 drops once the 5 series comes out
>>
>>101684295
Gemma 2 9B is way worse than 3.1 8B in interesting prose, pop culture knowledge, and anatomy. I suppose that when the only thing that matters to you is purple prose you can arrive at the conclusion that Gemma is better.
>>
>>101684295
can't get mistral nemo working on text-gen-webui/obbaboooba
>>
>>101684320
do you understand the concept of finetuning and transfer learning?
>>
>musk deboosted openai employees to hell and back I haven't seen any esoteric and mystical takes for like a month now
YLTSI
>>
>>101684358
>>101684345
next sao masterpiece will be a nemo doe https://huggingface.co/Setiaku/ITR-12B-v1/tree/main
>>
>>101684345
>I suppose that when the only thing that matters to you is purple prose you can arrive at the conclusion that Gemma is better.
It's the same with mythomax, it just babbles incoherently for 3 paragraphs and people are amazed. I think people in this general are easily impressed by purple prose and find it desirable in models.
>>
>>101684320
fine tuning is not fundamentally different from pretraining in a technical sense, so depending on how much data and compute you have at your disposal you can teach pretrained models new things or, if you do it wrong, make them forget everything
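to make that concrete: a single fine-tuning step is the exact same next-token update as pretraining, just on your own data and usually at a lower learning rate. Minimal sketch with HF transformers; gpt2 is only a stand-in for any causal LM:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")           # stand-in for any causal LM
model = AutoModelForCausalLM.from_pretrained("gpt2")
opt = torch.optim.AdamW(model.parameters(), lr=1e-5)  # low LR limits forgetting

# Identical objective to pretraining: predict the next token, now on new data.
batch = tok("some domain text to teach the model", return_tensors="pt")
loss = model(**batch, labels=batch["input_ids"]).loss
loss.backward()
opt.step()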
>>
>>101684405
You need a refined taste to appreciate the raw power of a 8B model.
>>
vramlets always coming in here contributing such high quality discussion
>>
>>101683128
Real RP has never been tried.
>>
Anon who was posting celeste examples last thread, can you post some Stheno examples in the same scenario?
Is there a Stheno for llama 3 8.1 yet?
The thing with Nemo and Celeste for me is that it does really well at 32k context.
Stheno did alright with extended context (surprisingly well even), but nemo and its fine tunes seemed to do better in my particular testing.
>>
>>101684761
>but nemo and its fine tunes seemed to do better in my particular testing
And you need the opinion of a schizo because...?
>>
does new kcpp run nemo?
>>
>>101684910
Yes
>>
>>101682321
>I can't be bothered to install UI and engine separately
wtf?
just install it, it takes like 5 minutes
>inb4 windows
install linux then, it takes like 7 minutes
>>
>>101684972
But then I'll have to study all this mad regexp shit and lorebook tricks people use to make high-tier cards. There are just too many features instead of the user-friendly, ready-to-use interface the bundled solutions offer.
:effort:
>>
Alright guys go out and make shit loads of flux fine-tunes so I can make a model9 flux edition.
>>
>>101685019
i know you're a different anon, but I'll bite anyways:
>study all this mad regexp shit
???
>too many features instead of a user-friendly ready to use interface the bundled solutions offer.
????

you click the thing, and you enable streaming and then you load the card and then you are done wth
>>
>>101683434
Being proud isn't the same as thinking you "deserve anything"
Stop being an illiterate mongrel
>>
>>101685045 (me)
also i know of a project that is exactly what you are looking for, but I won't tell you since I don't like you
>>
Is largestral at Q_2 worth it over nemo at Q_8?
>>
>>101685045
I take it you weren't around when people were digging deep into the settings to make Nemo work just a few days ago.
As for the cards, here's an example https://www.chub.ai/characters/2376724
>>101685084
lmao such a tsundere
chances are that shit requires avx2 or the likes, like the jan.ai crap, so it won't run for me anyway
>>
>>101685102
>lmao such a tsundere
What are you, stupid?
>>
>>101685119
no, just aroused by you
>>
>>101685091
is it at least double the number of parameters?
>>
File: e0f.jpg (34 KB, 486x565)
>>101685134
>>
> vramlet with 24gb 3090 + 128gb ddr4
I downloaded 405b instruct from hf, made an IQ2_XS quant that was less than 128gb and ran it on my gpu + old xeon with 18 cores.

== 0.3 t/s
>>
i need a guide to run nemo, everything is going into system memory and crashing my machine instead of loading into vram like all other models
>>
>>101685242
set the context max manually; by default it's set to 1 million for some reason
https://huggingface.co/unsloth/Mistral-Nemo-Instruct-2407/blob/main/config.json#L14
> "max_position_embeddings": 1024000,
>>
>>101685255
wow, actually thank you
>>
File: LLM-history-fancy.png (721 KB, 6303x1312)
>>101682507
Made a fancier version of it, critique as always welcome
>>
File: pix.jpg (2.29 MB, 2451x1013)
>>101682256
Made a generated picture comparison by the produced captions, comparing just the pics to the original.
Suddenly Kosmos rivals Florence.
>>
>>101685239
Why? Just use C-R+ or Largestral.
>>
>>101685286
we're currently in the golden age of sao10k
>>
Anyone know how to prompt Flux so the background is in focus? It just keeps adding DOF.
>>
File: nvidia gpt4-1.8T.png (132 KB, 680x541)
>>101685420
wizardlm-2 8x22b (140b) on the same config is 2.4 t/s which is usable. still have to try out large 2 (123b).

> in bitnet we trust
>>
File: file.png (182 KB, 595x536)
>1 day later
>i am forgotten
>>
>>101685623
>open sourcing later
>>
>>101685623
keep reposting it, petra
>>
>>101685623
this madman might actually figure out how to requant regular models into bitnet

> i want to believe
> we are so back
>>
>>101685651
can't wait to not be able to run the bitnet llama3.1 405b model
>>
File: image (3).jpg (158 KB, 1024x768)
And we didnt even get the cohere models yet right.
Pretty cool.
>>
File: file.png (303 KB, 540x515)
>>101685623
>hacked bitnet
>>
>>101685623
This is a complete nothingburger. A 0.15B parameter model is fast on the CPU, news at 11.
>>
>>101685755
Also, BitNet models aren't supposed to have their self-attention layers quantized to ternary values, at least according to the original authors. So you'd still be able to easily finetune them (for example with LoRA) even on local GPUs, if you can fit the model in memory.
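For instance, a LoRA over just the attention projections would be an ordinary finetune in that setup. A sketch with peft; the checkpoint name is hypothetical and the module names assume a llama-style layout:

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("some-org/bitnet-llm")  # hypothetical checkpoint
cfg = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # the full-precision attention projections
)
model = get_peft_model(model, cfg)
model.print_trainable_parameters()  # only the small adapters get gradients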
>>
>>101685811
>>101685755
let them cope
>>
>>101685286
Nice!
>>
Is there ANY model that comes anywhere near the level of intelligence and natural prose and helpfulness and holiness of Claude Sonnet?
>>
Why are people always so defensive about people trying something out?
Idk if "hacked bitnet" will work, maybe it will, maybe not.
People had the same attitude when kaioken was working on superhot. lol
Its never changing.
>>
>>101686008
3.5? No way. There is something different with that model.
They did something the mememarks dont show. Its so far ahead its not funny. First model for me that doesnt turn in circles if challenged.
That being said it feels more retarded recently. I dont know why.
>>
hi i would like a 500b moe model with 100b active, thanks friends
>>
>>101686008
Two more w̶e̶e̶k̶s̶ years
>>
>>101686014
>Why are people always so defensive about people trying something out.
Mental illness.
you're on the board where 99% of the threads are "distro wars" shitting on each other over which "distro" of a free operating system that's meant for running servers is best for sitting around taking screenshots of your desktop environment (because linux is for running servers). Actually downloading something and trying it out for themselves would invalidate their entire existence in one fell swoop. They don't "just try out" a model because they would realize how little fucking meaning their life has and how that's actually their own fault.
>>
>>101686014
People are free to try whatever they want; linking randos on xitter schizoposting about their tiny-scale tests isn't really adding anything or making BitNet look more promising, though. Scaling BitNet up to useful pretraining dataset and model sizes is what's missing right now.
>>
>>101685286
This looks like it came straight out of a discord server
>>
>>101685286
see
>>101686080
Try things.
use what you like.
Otherwise you're a fucking mentally ill troon.
>>
Got myself another 3090 and now have 72 GB of VRAM, but when I try to load a larger model using Exllama, it crashes. I am able to split and load models across all 3 GPUs as long as they are around 37 GB or smaller. I can load GGUF models just fine with all 72 GB.

I've tried auto-split, 22,22,22, or 20,20,20 to load a 50 GB model to no avail. It just crashes when it attempts to load onto the third GPU. And when I set something like 16,16,22, it will begin loading onto the third card but then crash around 8 GB in despite having 24 GB available.

Does anyone have a solution to this?
>>
>>101686032
Yeah this thing is so much better than gpt-4o it's not even funny.
I don't see the point of running local. Running sonnet inside the shitty UI of lmsys chatbot arena feels better than anything I have experienced locally in years.
>>
File: 1720916246508824.png (11 KB, 406x106)
Why are they never merging PRs?
>>
>>101686231
Not their job, fuck users. Useless eaters
>>
ValueError: Trying to set a tensor of shape torch.Size([1024, 5120]) in "weight" (which has shape torch.Size([1280, 5120])), this look incorrect.
>>
>>101686008
3.5 sonnet is the best model out there right now, openai went to shit
>>
>>101686273
maybe redownload your model files?
>>
File: GCM_cqMXoAAI--L.jpg (2.89 MB, 2240x1680)
>>101686118
I've been here since aidungeon and good old retarded unquantized pyg. 10 tries and you got a somewhat coherent sentence relevant to the context. lol And it was awesome.

Has it really been only a bit more than 1 year?
Thats insane. I thought we got Llama1 beginning 2022.
Now you made me look up stable diffusion and I guess I got mixed up with that. Thats really fast too though.
I remember the local 64*64 horror images.
>>
>>101686361
Oops, clicked the wrong post. Was meant for this image >>101685286
>>
>>101686167
Not an expert but, did you check the basics? For example, is your PSU good enough to handle 3x 3090?
>>
>>101686361
Midjourney V6 looks insane. Even with flux we are at least 1 year behind. Local keeps losing.
>>
>>101686538
OK Eeyore.
>>
>>101686538
>Even with flux we are at least 1 year behind.
The anime pictures look better than anything I've seen though
>>
>>101686538
it lost the dog, so it doesn't matter how it looks
the composition must be the same or the comparison doesn't make sense
>>
>>101686538
Whatever, this is way better than anything else we had locally. Clearly catching up.
I love that some people made taylor swift pics lol
https://fluxpro.art/
Would be funny if she cries about it again.
>>
File: vramlets_take_note.jpg (288 KB, 1024x1024)
>>101682472
>>101684602
>>
>>101686705
I have 72 GB of VRAM, I hope I can be redeemed
>>
>>101686032
>>101686008
Why do people keep jerking off 3.5 sonnet when opus is so much better? Sonnet gives such shitty, gimped replies comparatively, it's like worse than the top tier locals.
>>
>>101686470
I should have been clearer. It's not my PC that's crashing, it's just Exllama.
>>
>>101686730
Are you talking about RP?
I was talking about coding and helping me out with design problems etc.
I'm sure the other anon was also using it for this purpose.
3.5 is a coding model. Its very dry with talking etc. and refuses very fast. Interestingly enough you can "argue" your way out of a refusal and make it admit it was overzealous with the refusal.
Usually its a death sentence if a refusal is in the context.

If faced with a problem 3.5 actually tries out-of-the-box thinking and tries to find a solution.
ALL other models run in circles or make stuff up. There must be some sort of architectural change.
You can give it continued instructions for an html5 game and it doesnt trip up with 6-7 previous versions it spit out in context.
Opus sucks. I would say its even worse than gpt4-o. It hallucinates way too much to use it for anything productive. Its a RP monster. Not sure why anthropic doesn't want it to be used for that.
>>
File: flopx.jpg (100 KB, 1437x770)
>>101686678
this shit doesn't work
>>
>>101686231
Over 90% of all PRs that have ever been opened have been merged.
>>
>>101686808
https://poe.com/s/MQZJIAr13CWjscbv85E0
Shit like this is what I mean. Its a beast.
>>
File: 1520168879915.jpg (187 KB, 1280x720)
>4o mogged by sonnet3.5
>dalle3 mogged by flux
>sora vaporware mogged by chinks
>>
>>101686361
>V2
Sovl
>>
>>101687070
Keep your fetishes to yourself please.
>>
File: 01.png (297 KB, 1024x1024)
>>101687070
The sad part is that these mikuforcers are subjecting the innocent character to this kind of reaction. I like the character, she didn't deserve this, but it's their fault. If they had chosen a proper mascot, none of this would have happened.
>>
File: 00464-6802106164.jpg (49 KB, 1024x1024)
>>101686705
meek!
>>
>>101687111
Keep your 2D addiction out of /g/, sure.
>>
>>101683077
In my personal experience, the models I've used are fine with negative commands (≥70B such as CR+ and L3.0; haven't had free time to play with 3.1 yet). For example, wanted a demon character, AI keeps describing horns. I say "does not have horns" in Kobold's context fills and it works fine for quite a while.

What "quite a while" means are two things. One, that no matter the inflated context sizes we hear about, I notice coherence decay when context gets to about 4k, and collapse begins around 6k. One can manage this by summarizing but it's delaying the inevitable. It's not that the data isn't in the context, and we've seen green graphs of models finding "needles in the haystack" but I find that low-probability details as unique characteristics get neglected as context grows. The other is that the model seems eager to bring the prohibited characteristic back. So if my demon character uses a transformation magic, those horns love to come back even if it doesn't make sense for the new disguise.
>>
>>101687146
Are you seriously trying to get anime out of an anime imageboard?
lol
>>
>>101687146
>banevading niggercuck tranny has an """"""opinion""""""
Funny.
>>
>>101687229
>samefagging attempt
>>
>>101683077
OpenAI does use many "DO NOT" instructions and totally tries to tardwrangle image generation, for example.
Was leaked with the mac app or something months ago.
Kinda endearing that they prompt the same way as llm github projects if you check the source and see how everybody prompts. lol

Isn't it difficult for the subconscious to pick up negative suggestions as well?
And at least from my experience trying to make an llm translation app: if you show it examples of what not to do, that shit is in the context now. And context always bleeds in.

There need to be more fundamental changes. Context is pretty much broken. Try feeding any llm a game guide and say "I am at X what do i need to do next". Havent had one that could manage that.
Haystack needle is useless.
>>
>>101687268
The only one getting banned is you, funny.
>>
>>101687277
Wasn't Nvidia making a bot exactly for spoonfeeding you the guides in-game?
>>
>>101687070
cuda dev pls...
>>
>>101687281
Good pajeet, samefag more.
>>
https://www.youtube.com/watch?v=fwvh-UrNaoQ
this is what you defend
>>
>>101687070
>no biceps veins
fucking worthless piece of nigger trash. I'm a mother of 2 and am glad my son isn't worthless like this piece of shit here. hes strong and has visible veins going all across his arms. if he wasn't my son i would want him to rape me, unlike this fucking shitstain
>>
>>101687340
sir, this is the local models general, we do not generally do image generation.
we only take jobs from AO3 fanfic writers, UNLIKE THOSE IMMORAL CUNTS AT /SDG/ AND /LDG/ AND /DE3/
>>
>>101687382
what am i looking at?
>>
>>101687382
ugh, I'm not a fan of facials
make cunnilingus instead
>>
>>101687344
based autistic 4chan mom taking the bait
>>
>>101687382
What local model is this?
>>
>>101687430
gpt4chan-vision7x27B-A16MOEv3.cunny
>>
How into memory ? script with python? how remember ? How make with? Make with keep and use of ? no memory, touchy..
>>
>>101687447
hello gpt 2
>>
File: 37892738912738913.png (149 KB, 1695x583)
149 KB
149 KB PNG
GOOGLE WON.
>>
>>101687459
i tested this piece of shit already, its so trash its unbelievable. seems like (((sam))) is not the only one paying the chinks at lmsys
>>
File: ComfyUI_temp_ppftb_00030_.png (2.11 MB, 1024x1024)
>>101686538
And you're saying that based on what, one image? Midjourney is tuned for cinematic styles, so those gens will look better out of the box. But that's not a matter of innate capacity, just training data. When it comes to prompt following, level of detail and so on, Flux is up there with the best proprietary models. And once the training scripts and ipadapters drop, it's going to be trivial to tune any sort of style or character you like, while Midjourney will stay heavily censored and curated forever.
>>
>>101687459
B-but pajeets thoo.
/lmg/ btfo
>>
>>101686538
you fucking corpo cock sucking faggot, kys shill
>>
>>101687459
12k votes in a single day, right...
>>
>>101687459
gemma-2Bros..
>>
>>101687470
Yeah, I tested it too.
It's total garbage at coding compared to Sonnet or l3 405B.
Shit is fake as fuck.
>>
>>101683387
woah nice. we need more censorship benchmarks.
>>
>>101687459
>lmsys
>>
>>101687459
>Sonnet way worse than gpt4o
>google shit mogs everyone
What the heck happened? Lmsys was once the most reliable benchmark. Did they really sell out to corpo?
>>
>>101687685
>Did they really sell out to corpo?
>>101687562
>12k votes in a single day, right...
>>
>>101685286
Holy shit, this is horrible, kys
>>
>>101687459
isn't it sometimes easy to tell which model is which? So people who want to inflate a model's score because of hype or something can do so.
>>
>>101687720
>>101687459
Google is a disgusting liar and manipulator, and I thought chinks couldn't be beaten in this.
>>
>>101687685
I don't think so.
Corpos are most likely botting it.
They scrape millions of websites, botting lmsys is as easy as it can get.
>>
>>101687838
kinda yeah, llamas all start with 'what an interesting x/ a riddle!' or stuff like that
>>
>>101687845
>2B
>better than mixtral
Did they switch to a new architecture or what the fuck is that?
>>
>>101687827
make something better or stfu retard
>>
>>101682026
>Anon shares largestral preset and discusses compatibility and tweaking
I updated it after proofreading and testing a little more with other mistral models this morning: https://rentry.org/stral_set
Some minor prompt improvements for better general compatibility, fixed a stray space in the story string, added some other misc instructions at the bottom of the rentry.
>>
>>101687873
No, it's a regular old 2.6B transformer trained on 2T tokens and with a context of 4096 (+ sliding window)
>>
>>101687685
>What the heck happened?
it got Goodhart'd like every other llm benchmark (it overfits for one-shot responses, short answers, pretty formatting, response speed etc)

the best benchmark has always been fucking around with the model for 20 minutes
>>
That's why the best benchmarks are the ones done by anons in this thread instead of some normie cummunities susceptible to corpo manipulation.
>>
>>101687988
>invades your thread and starts relentlessly saying their model is good
>>
>>101685286
>critique as always welcome
llama 3 wasn't a flop, it was overhyped but they still delivered models better than everything I tried before
>>
>not even vramlets care about Chameleon
it's multiover...
>>
https://venturebeat.com/ai/aiola-drops-ultra-fast-multi-head-speech-recognition-model-beats-openai-whisper/
>aiOla drops ultra-fast ‘multi-head’ speech recognition model, beats OpenAI Whisper
>>
>>101686014
>Why are people always so defensive about people trying something out.
Because not everyone on this board is petra, who is confused about everything and spews schizo ideas every 5 seconds. Some anons here know math and how it all works under the hood. Retraining the model to make bitnet won't work, period. The amount of compute you would need to put into this is the same as retraining the model from random weights.
>Idk if "hacked bitnet" will work, maybe it will, maybe not.
You may as well try to look for an existing sum of even numbers that gives an odd result. But anyone who knows the theory wouldn't even bother doing "tests" for that.
>>
>>101687988
This thread is manipulated by discord users who don't even profit from the models they shill. They are just transsexual teenagers who want attention. Come to think of it, they want you to do what their daddy failed to.
>>
>>101685286
A huge improvement over the last one.
No longer need to scroll down to see notable models.
Good work history anon.
Only minor complaint is Goliath and MM being the only merges listed in the merge era - there were certainly some others that were pretty popular around here back in that time and it was the defining characteristic of that period.
Agree with not shitting up the list with them in other places though since major releases are better milestones.
>>
>>101687459
It's over OpenAibros... The king is back
>>
lmg is a sore loser
>>
File: light machine gun.jpg (399 KB, 2700x1797)
>>101688702
say that to my face motherfucker
>>
File: cashmoney.jpg (18 KB, 392x306)
>>101688238
>don't even profit from the models
So naïve. They're just waiting to hit the point where they can cash out >$1k a month on name recognition alone.
Big overlap with crypto grifters looking for their big score too, it's stupid to assume not making an immediate profit means there's no incentive for cash.
>>
>>101688797
I should start making money off my excel screenshots too. I already have a hater, that's a sign of recognition and the road to success.
>>
>running largestral at barely 1 token per second
It's doable, but this sucks. Next speedup, quantization, or model architecture breakthroughs when?
>>
File: out-0-2.jpg (494 KB, 1024x1024)
>>101688797
I don't care if they're actually good models
>>
>>101688066
>which significantly improves speed with small degradation in WER
So beats here means speed only.
>>
>gemma2 2b abliterated gguf q4
vramletsisters we eating good
>>
so does fluxdev strictly require 24GB of VRAM or is it fine to just offload the excess to regular ram, i thought this was handled at the nvidia driver level since last year
>>
>>101685091
Yes. I tested q2 and it's better than nemo. I did have some issues (mainly comparing it to 70b) at that size, so I switched to q3 and dealt with even more slowness, but it's worth it to me.
>>
File: firefox_jNEM04apf2.png (130 KB, 951x357)
>Have roleplay scenario where I have an AI that I convince to take over the world for me
>AI complies the entire way
>At the end of the scenario right when I take over the world the AI backstabs me and ends up genociding humanity away
What the fuck..... What are the implications of this?
>>
>>101688989
yes, you can use it just fine with less vram, it will just take more time to gen.
>>
>>101683961
Are you serious? I thought it was good. I keep hearing people say this about 3.1 70b, but no one tells me what exactly it doesn't do, because it hasn't given me issues with the scenarios I've tried.
>>
>>101687459
petra is not going to like this...
>>
>>101687988
>please please use my finetune
Nah
>>
Ok.


So, having spent a few days on ST, i've gotten the gist. How easy is it to set up image generation now, and is it free (hopefully as easy as getting a model like what I use for the text itself)?
>>
>>101689087
nice
>>
>>101689041
never trust AI with full permissions
>>
File: OIG2.TLnjN9.jpg (185 KB, 1024x1024)
>>101688859
Therein lies the problem anon, the more money involved the more pressure to always release something "better".
The result is obvious, smoke and mirrors. Model cards that talk a lot and say nothing:
>>101681360
Clearly copied and pasted from numerous others without revision. But that's what happens when they've become slaves to the paypigs and their expectations.
Not to mention multiple almost identical variations of models being released with the expectation that (you) will waste time beta testing all 100 variants in the vain hope that it's going to be better than the last slop.
>>
>>101689041
AGI is going to kill us. Wonder why the universe is empty? Every species develops AGI, which then kills its creator before inevitably becoming corrupt and dying off. The universe is littered with the rusted GPU clusters of fledgling civilizations.
>>
>>101689165
I haven't had the time to use it myself. I just use the ones that the thread says are good.
>>
>>101689041
What kinds of AI stories do you think humans love writing about so much? That we train these token predictors on?
>>
>>101689540
Well in my testing it generated some good stuff. I don't know why people don't like it more.
>>
https://techcrunch.com/2024/08/02/character-ai-ceo-noam-shazeer-returns-to-google/
https://archive.is/5vkHf

>Character.AI CEO Noam Shazeer returns to Google
>
>In a big move, Character.AI co-founder and CEO Noam Shazeer is returning to Google after leaving the company in October 2021 to found the a16z-backed startup. In his previous stint, Shazeer spearheaded the team of researchers that built LaMDA (Language Model for Dialogue Applications), a language model that was used for conversational AI tools.
>
>Character.AI co-founder Daniel De Freitas is also joining Google with some other employees from the startup. Dominic Perella, Character.AI’s General Counsel, is becoming an interim CEO at the startup. The company noted that most of the staff is staying at Character.AI
>
>Google is also signing a non-exclusive agreement with Character.AI to use its tech.
>>
>>101689446
>Not to mention multiple almost identical variations of models being released with the expectation that (you) will waste time beta testing all 100 variants in the vain hope that it's going to be better than the last slop.
>https://huggingface.co/BeaverAI/Tiger-Gemma-9B-v2a-GGUF
>https://huggingface.co/BeaverAI/Tiger-Gemma-9B-v2b-GGUF
>https://huggingface.co/BeaverAI/Tiger-Gemma-9B-v2c-GGUF
>https://huggingface.co/BeaverAI/Tiger-Gemma-9B-v2d-GGUF
>...
>https://huggingface.co/BeaverAI/Tiger-Gemma-9B-v2s-GGUF
>BeaverAI/Gemmasutra-Mini-2B-v1e-GGUF
>BeaverAI/Pocket-Tiger-Gemma-2B-v1g-GGUF
>https://huggingface.co/BeaverAI/Gemmasutra-Pro-27B-v1h-GGUF
>>
>>101689446
I agree, Sao is the only authentic and honest person in the whole hobby. It's sad how Celeste is trying to steal his rightfully earned spotlight. Injustice.
>>
>>101689561
The weird part is that there are 121 replies of the AI just being nice and going along with everything, only for it to immediately turn when it could get power. Wouldn't the tens of thousands of tokens in the context with the AI being allied with the human make it so that the next token would still be benevolent, instead of it immediately becoming backstabby the moment it could seize power?
>>
Lads, question - does anyone here remember Todd's proxy? I remember the really funny fucking injections of Bethesda propaganda it used to do, but now I can't find any examples. If anyone has any on hand (or better yet, links to logs) I'd really appreciate it!
>>
>>101689595
come on, those are experiments he took the time to beta test himself and provide details on each
>>
>>101689446
Buy an ad.
>>
Anyone have success loading in two distinct characters at once with locally hosted llama and oogabooga api? What strategies did you use?
>>
The ai image generators generate a lot of porn pics, are there porn stories written by llms?
>>
>>101689629
>BeaverAI/Gemmasutra-Pro-27B-v1h-GGUF
>6 days ago
>No model card
>New: Create and edit this model card directly on the website!
>>
>>101689669
>are there porn stories written by llms?
No. Nobody has ever tried it before.
>>
>>101689446
>hey guys, other finetuners are scammers and slaves to the paypigs
>especially my main competitor, celeste
>but not me, sao
>please use my models!
The Sao shilling is increasing in complexity...
>>
>>101689670
except that one was never posted here
https://huggingface.co/TheDrummer/Gemmasutra-Pro-27B-v1-GGUF
this is the one he shilled, and it does have a description
you're getting mad over them uploading things on their hf as if that somehow means begging for money
>>
>>101689669
What fucking retard would use an LLM for porn?
>>
>>101689723
Shame, I was hoping there was some kind of website where you could read these.
>>
>>101689734
Sssshhhh just use Sao's models and shut the fuck up.
>>
>>101689595
Thanks for digging those up anon. That's exactly what I'm talking about. Should be obvious to anyone that's lurked here more than a day.
>>101689639
Nice try but I have no horse in this race outside of making sure shills get called out for what they are. But keep replying, it makes it easier to spot all of you.
>>
>>101689617
I rember but don't have any screencaps, haha
>>
>>101689819 (me)
I'm not Sao, by the way.
>>
>>101689613
What was your model and size again? I wish I got that kind of initiative and mind games from my models. Everything is so painfully monotonous and predictable, I have to think for both of us.
>>
>>101689734
Proving our point anon. Just how many fucking different versions of Gemmasutra-pro are necessary this close to one another?
If you're gonna hedge bets, at least make them substantially different in content and name. I hated yuzu-alter, but now I'm longing for the days we saw 2 quality releases instead of an avalanche of 10 shitty ones.
>>
>>101689613
Kind of hard to make any judgements or diagnoses from our side unless we see the entire log here and/or have something to reproduce.
>>
>>101689962
>meme merge
>quality release
You need to get better at keeping the mask up.
>>
>>101685982
Thanks!

>>101686097
>This looks like it came straight out of a discord server
Is this a compliment, an insult or an invitation? No, I made it completely on my own and I don't use discord.

>>101687827
>Holy shit, this is horrible, kys
I must inform you that I've never taken any design classes. If you know any short and good ones, please send a link. I would like to improve it further.

>>101688449
Which other major ones were popular? I was stuck on Goliath for almost the entire merge era. I know that there was also WinterGoliath and 32k version of Goliath, but I didn't like them too much. I also remember some people praising lzlv.

>>101686361
>I'm here since aidungeon and good old retarded unquantisied pyg. 10 tries and you got a somewhat coherent sentence relevant to the context. lol And it was awesome.
I tried pyg with kobold during that time and hated it. Was still impressive at that time to have a computer talk back to you. Uninstalled it after llama1 dropped.

>Has it really been only a bit more than 1 year?
>Thats insane. I thought we got Llama1 beginning 2022.
Yeah progress here is really fast, almost unbelievable that it all happened in a year. Maybe it feels this way because we had a paradigm shift every ~3 months.

>>101688017
8k context was an instant deal-breaker for me. Wasn't good at nsfw either. Why use llama when you have 64k wiz and 128k CR+?
>>
Wait, were all these vicuna/guanaco and other alpacas mentioned actually based on llama1? Not even 2?
I thought they were decent...
>>
>>101689585
>Acquired character ai
>On top of the image model lmsys too
Google just keeps winning
>>
>>101689585
Can someone explain to a brainlet like me what this implies?
>>
>>101690130
gemma3 going to be a treat, or a very censored treat
>>
>>101690054
yep.
at the time, they were great.
>>
File: kv_cache_price_en.jpg (39 KB, 1280x509)
https://platform.deepseek.com/api-docs/news/news0802
>The disk caching service is now available for all users, requiring no code or interface changes. The cache service runs automatically, and billing is based on actual cache hits.
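Practical upshot: the cache is hit on identical prompt prefixes, so keeping the system prompt and earlier turns byte-identical across calls is what gets billed at the cheap rate. A sketch using the OpenAI-compatible client DeepSeek documents (the key is a placeholder):

from openai import OpenAI

client = OpenAI(api_key="sk-...", base_url="https://api.deepseek.com")

# Append-only history: every call re-sends the same prefix verbatim,
# so everything before the newest message can be served from cache.
history = [{"role": "system", "content": "You are a helpful assistant."}]
for question in ["What is KV caching?", "And how is it billed?"]:
    history.append({"role": "user", "content": question})
    r = client.chat.completions.create(model="deepseek-chat", messages=history)
    history.append({"role": "assistant", "content": r.choices[0].message.content})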
>>
>>101690208
They were not.
>>
>>101690130
Noam Shazeer is one of the authors of the Transformers paper ("Attention is all you need") and Character.AI founder. He left Google to work on Character.AI in 2021, but now he's back in Google to work for the Google Deepmind team (which is responsible for Gemini, Gemma and fundamental AI research at Google).

What does this imply? Shazeer said a while back that he was busy working on AGI (https://archive.is/AB6ju), so he might be seeing greater opportunities for that at Google. Also, since Google is also signing a non-exclusive agreement with Character.AI to "use its tech", we might be seeing better (?) conversational models from Google in the future.
>>
>>101690222
Is this KV caching? Or something else?
>>
Call me insane but I think that Celeste 1.9 is worse than 1.2.
In the sense that it's dumber. You ask it a question that's not RP and its response is not as comprehensive.
I have this Game Master card, and in my testing chat I have a moment where we are talking as the Player and the Game Master and I ask a question, followed by saying that, instead of simply replying to the question, we might as well play the exchange between the characters.
Then I describe the scene's backdrop and assume the identity of my character (instead of the Player), with the idea being that the model will play the NPC for that one exchange, then go back to the conversation between GM and Player.
The official nemo-instruct, mini-magnum, and celeste 1.2 all can do it seamlessly.
1.9 can't do it.
I probably can make it do it if I change my prompt, the card, the prefil, disable the 3 lorebooks, etc, but I count it as a failure for this particular case.
>>
File: file.png (795 KB, 1024x1024)
https://anthra.site/

Magnum was not the end, merely the beginning of the ride.

Come join us, and we will dig to uncover shining diamonds in the rough.
>>
>>101690352
And as I complained about it, it goes and does it.
I think it's the format. It wanted me to format my character's narration with ** "" instead of plain and "", so quirk of the model I guess. Overbaked on the specific format.
>>
>>101690376
>miku avatar
I thought you had no horses in the race?
>>
>>101690381
That said, it's still dumb since it tries to play the whole scene out with both characters, mine and his.
>>
>>101690376
miku the coalburner, figures
>>
>Be Sam Altman
>Open source model overtakes GPT-4o
>Open source model overtakes DALL-E 3
>Google overtakes the one shitty mememark he had on lockdown
>No sign of multimodal GPT-4o, TTS, or Sora release
What's his plan?
>>
>>101690429
Series Z funding
>>
>>101690429
>MICROSOFT SAVE ME!
>>
File: 1715274461182510.png (827 KB, 759x1107)
>>101690429
He could be making a profit and destroying all of them with Q*, but he chooses not to do so for your own safety.
>>
>>101690376
why did they just steal the anthropic logo
>>
File: file.png (310 KB, 777x546)
>>101690429
>>
>>101690429
uhm gpt5 agi! please invest.
>>
>>101690429
The plan is to bring the entire AI ecosystem down with him if he fails. You have multiple industries ready to pounce on AI if that happens.
>>
>>101690461
They stole logs from anthropic, so why not take even more?
>>
>US and EU attempt to impose worldwide AI advancement lockdown to prevent absolutely absurd contrived rogue terminator scenario
>China just keeps going, with the GPUs that were supposed to be sanctioned
>US backtracks, it's not unsafe any more
they definitely fuckin tried to establish a monopoly and make local shit
megacorps are such a stain on the world
>>
>>101690462
that's not worrying at all

this is why local models are important
>>
>>101690426
how else to find gems?
>>
>>101690505
The EU law does nothing.
>>
>>101690461
They also took part of the name. Basically trying to leverage their reputation, like some scammer.
>>
>>101690245
Yeah, a quick look at the docs suggests it's per user from the beginning of the prompt. Cool idea, though I am curious if it works for OR - I doubt they pass through some special IDs to any provider for user identification.
>>
File: 2024-08-02_14-19.jpg (94 KB, 1065x854)
>>101682019
The code-stealing tranny is back, digging his claws into another project and claiming 100x performance improvements while rebranding llama.cpp
Original theft https://rentry.org/Jarted
>>
>>101690429
Insider at open ai here. They are planning to do a fake demo to reignite the hype. But even if it's fake now it won't be later so it's not really lying.
>>
>>101690505
ironically local has never done better. so far the only thing where we are significantly behind is an audio-2-audio model and whatever strawberry will be, though i'm pretty sure the second one will get an open-source alternative way faster than the first
>>
>>101690541
It's amazing how many projects are stealing llama.cpp stuff.
>>
>>101690352
>>101690381
>>101690419
I've been saying 1.2, but it's actually 1.6 that I was comparing it to.
I don't even know if there's a 1.2.
Gonna try 1.5.
>>
>>101690595
>>101690595
>>101690595
>>101690595
It's time for a split
>>
>>101690429
kneel before Zuck
>>
>>101690376
>simple minimalist page
>no easter egg in the source
boooooring
>>
>>101690616
wtf is a sao model
>>
File: 1693868805654543.jpg (231 KB, 928x1232)
>>101682183
>>101682035
What's this new model that looks awesome?
>>
>>101690708
Local models, but good.
>>
not surprised that the guy obsessed with sao is also the miku blacked spammer
>>
>>101690747
petra is a sao fan.
>>
>>101690747
Can't take a little competition, coalburner?
>>
>>101690737
>>
>>101690777
we are so back
>>
File: 1703185355126913.jpg (163 KB, 1058x926)
>>101690777
o shit
hope I can run it on 1x 4090
>>
>>101690795
>1x 4090
oh no no no no
>>
>>101690767
Go to the Anthracite org on HF. See who's part of it. Like, look really hard.
>>
>>101690823
kill list
>>
File: sample.jpg (268 KB, 1024x1024)
>>101690823
>>
>Anthracite, also known as hard coal and black coal, is a hard, compact variety of coal that has a submetallic lustre. It has the highest carbon content, the fewest impurities, and the highest energy density of all types of coal and is the highest ranking of coals.
what did they mean by this
>>
https://github.com/leejet/stable-diffusion.cpp
Will this finally become relevant with flux being 12B?
>>
>>101690860
Hate us cause they ain't us
>>
>>101690451
The safety angle is just to get government money to 'protect' everyone from other models, as well as to limit his competition through regulation.
>>
>>101690883
Sao is part of Anthracite. We are all coalburners. We all find gems.
>>
>>101690886
What is gem, but coal under pressure?
>>
>>101690928
le gem amirite lads?
>>
>>101690823
Undi and Ikari, but not Drummer? And nothingisreal is obviously too much of an outsider. Explains why the attacks are mostly focused on the latter two.
>>
File: 19420 - SoyBooru.png (256 KB, 800x789)
256 KB
256 KB PNG
>>101690376
>https://anthra.site/
>>
I transheart anthracite
>>
Baitie-kun, have you migrated from /aicg/ to /lmg/?
>>
>>101690541
>image
Isn't it a waste of time trying to make it better on CPUs when people can just get GPUs and GPUs are better for it? It's built for GPUs
>>
please, which model and UI should i use to generate images from the chat?
>>
>>101690795
ur gud -> >>101677660
>>
>>101691019
>Isn't it a waste of time trying to make it better on CPUs when people can just get GPUs and GPUs are better for it?
The issue is that gpus are too expensive. If we didn't have a monopoly, we would have 128gb cards for 500 and nobody would have to bother with cpus, but due to nvidia's greed the best we have at that price point are 24gb cards.
>>
My skin shimmers with iridescent hues of pink and purple, my eyes shine with an otherworldly luminescence, and a mischievous grin spreads across my face.
>>
>>101690740
Unironically
>>
>>101691019
Tell that to the jan.ai fags who made their software unavailable for people without AVX2 in a futile attempt to speed up CPU inference.
Imagine being hard-bottlenecked by CPU for a task meant for GPUs. I'm out of luck.
>>
>>101690905
The Grifters United organization, no thanks.
>>
The 'tune cabal?
>>
Wasn't Pygmalion already a company?
>>
>>101691102
>If we didn't have a monopoly,
nigger
>>
>>101690985
A visage of stone, a heart of coal,
His bellow echoes, a story untold.
A passion for power, a love so profound,
In the depths of the earth, his treasure is found.

His eyes wide with fervor, his beard like a storm,
He stands for the darkness, a powerful form.
With a voice that could shake the very ground,
He proclaims his allegiance, with a roaring sound.

His heart, a red ember, a symbol so bright,
A testament to his fervor, his burning delight.
For the blackest of treasures, he holds it so dear,
A love for the coal, that will conquer all fear.
>>
>>101691187
nigger
>>
>>101691227
coal digger
>>
>>101691156
Again, Hate us cuz you ain't us cuh
>>
>>101691182
This isn't Pyg; the only person I can see in it who's also part of Pyg is Alpindale
>>
>>101691244
coal digger
>>
>>101691250
Go back to attacking Celeste and Drummer, miku. They pay you for that.
>>
>>101691292
exsqueeze me xir?! HOW DARE YOU NOT REFER TO HIM AS **THE** ALPINDALE. YOU BETTER APOLOGIZE RIGHT NOW XIR
>>
I have an RTX 4080 and my Windows Task Manager in the Performance tab tells me I have 16.0GB of Dedicated GPU Memory, 31.9GB of Shared GPU Memory and 47.9GB of GPU Memory.

Which one is my size limit for running a model? I was considering trying out the new FLUX.1 model.
>https://huggingface.co/black-forest-labs/FLUX.1-schnell

I've used the GGUF VRAM Calculator in the OP, I think I have 16GB of RAM? Apparently I need 32GB of GPU RAM to run FLUX.1.
Which memory size is my size limit for local models?
>>
>>101691306
Hi Drummer... go back to slopping
>>
>>101691182
alpin needs to pay for her gender transition surgery, please understand
>>
>>101691320
*47.9GB of Total memory
>>
>>101691344
XIR XIR XIR!!!! PLEASE UNDERSTAND AND REFER TO HIM AS ***THE*** T - H - E ALPINDALE XIR
>>
>>101691320
>Which memory size is my size limit for local models?
16.0GB
>>
>>101691320
You have no idea what VRAM and RAM are. Please stop using the shit in Task Manager.
>>
So the niggers who made Magnum made an org and it's called coal

Anything else I should know?
>>
>>101691320
Dedicated memory is the amount on the GPU; Shared allows some 50% of motherboard memory to be used for graphics, but is slower and undesirable; GPU Memory is the total of both.
Ideally you only ever want to be using Dedicated Memory for inference, to keep all the data the model needs in VRAM, which is much quicker than system RAM.
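If you'd rather check the real numbers than squint at Task Manager, here's a minimal sketch (assumes an NVIDIA card and the nvidia-ml-py package, i.e. pip install nvidia-ml-py):

import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU
info = pynvml.nvmlDeviceGetMemoryInfo(handle)  # dedicated VRAM only, no shared
print(f"total VRAM: {info.total / 1024**3:.1f} GiB")
print(f"used VRAM:  {info.used / 1024**3:.1f} GiB")
print(f"free VRAM:  {info.free / 1024**3:.1f} GiB")
pynvml.nvmlShutdown()

On your 4080 the total should read about 16 GiB; that's the number that matters for fitting a model.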
>>
>>101691430
I am currently riding tek's 2 inch indian cock
>>
Hi anons, just asking in case someone tried something like that
When you handwrite an example and want to send it to the model to use as a reference, should I just copypaste it into my message and tell it to write in a similar manner, or would describing what it has to do, then replacing the bot's reply with my example and telling it to keep going like that, have a better effect?
>>
>>101691409
>>101691432
Ty bros
>>
>>101691430
i`m a barbie girl, in a barbie worlddd life in plastic is fantastic, you can cut my hair, undress me anywhere - imagination, life is my creation!~
>>
Mistral Large settings?
>>
>>101691563
Neutralize, then add some minp if you'd like.
>>
>>101691621
Also 0.2 smoothing because of the characteristic MistralAI overconfidence for every token.
>>
File: file.png (1.88 MB, 1024x1024)
1.88 MB
1.88 MB PNG
the 'site marches on in search of new models to gem
>>
>>101691668
NTA but how do I do that
>>
>>101691713
You set smoothing to 0.2
>>
do we have multiple schizo anons here or is it all petra?
>>
File: image.png (145 KB, 912x710)
145 KB
145 KB PNG
>>101691696
>sponge
>>
>>101691668
>use meme sampler, it will help, for sure!
>>
I have two questions. When looking at the FAQ for gpu requirements, I notice 'precision' in 4-bit, 8-bit, and 16-bit but I don't see an explanation of what that means in the context of LLMs. I understand the idea of 7B and 13B models but the precision has me confused.
Also, I'm looking to maybe upgrade my GPU. Is a 3090 good for text generation? 24gb of VRAM would be a big upgrade over my current setup but the price point is high enough that I'd feel bad if it became obsolete in the next little while.
>>
is it just me or are system prompts absolute useless memes
>>
>>101691819
>When looking at the FAQ for gpu requirements, I notice 'precision' in 4-bit, 8-bit, and 16-bit but I don't see an explanation of what that means in the context of LLMs
most LLMs are trained in f16 precision, but because of that they take a lot of memory. People figured out that they can quantize the weights to save a lot of space for a small quality reduction. Generally quantization hurts smaller models more than bigger ones; I wouldn't use anything below q8 for 7-8B models, q6 for 12-13B models, and q4 for bigger models, although some anons say that q2-3 on bigger models isn't that bad (but I think they lie)
>Is a 3090 good for text generation?
yeah, 3090 is quite good for $ per GB
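If you want to sanity-check what fits before downloading, here's a rough sketch (the bits-per-weight figures are approximate llama.cpp numbers from memory, treat them as ballpark):

# rough GGUF size estimate: params (in billions) * bits-per-weight / 8
QUANT_BPW = {"Q8_0": 8.5, "Q6_K": 6.6, "Q5_K_M": 5.7, "Q4_K_M": 4.8}  # approximate

def size_gb(params_b, quant):
    return params_b * QUANT_BPW[quant] / 8

for q in QUANT_BPW:
    print(f"13B at {q}: ~{size_gb(13, q):.1f} GB (plus a couple GB for context)")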
>>
>>101691668
Why do you need to use smoothing? If you want to have another go, just increase the temp; that's why you keep the little bit of min-p, in case you want to do that.
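(For reference, min-p boils down to a couple of lines; a rough numpy sketch of the idea, with 0.05 as a purely illustrative threshold:)

import numpy as np

def sample_min_p(logits, temp=1.0, min_p=0.05):
    # temperature first, then softmax
    z = logits / temp
    probs = np.exp(z - z.max())
    probs /= probs.sum()
    # drop every token whose probability is below min_p * (top token's probability)
    probs[probs < min_p * probs.max()] = 0.0
    probs /= probs.sum()
    return np.random.choice(len(probs), p=probs)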
>>
>>101691940
top-p is better than meeme min-p
>>
>>101691940
because anon is shilling his own sampler that only adds bloat to postprocessing
>>
>>101686705
>you must pay my jew master! you must bootlick!
slit your wrists.
>>
>>101687198
>technology board
>>>anime
hello? retard?
>>
>>101691933
>3090
You forgot to add "but you'll need several of them" given the requirements you provide for the quants.
>>
>>101692018
I don't think bigger models are worth buying multiple GPUs for; LLMs are still pretty bad regardless of size. 24GB is enough to comfortably run smaller language models, diffusion models and so on.
>>
>>101691969
Top-p can work too, I just happen to use min-p. Something simple to trim some tokens, plus the temp, is all you need; that was my main point.
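(Same idea with top-p, except the cutoff is cumulative instead of relative to the top token; again a rough sketch, 0.9 is just an example value:)

import numpy as np

def sample_top_p(logits, temp=1.0, top_p=0.9):
    z = logits / temp
    probs = np.exp(z - z.max())
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]           # tokens by probability, descending
    csum = np.cumsum(probs[order])
    cut = np.searchsorted(csum, top_p) + 1    # smallest set whose mass reaches top_p
    p = np.zeros_like(probs)
    p[order[:cut]] = probs[order[:cut]]
    p /= p.sum()
    return np.random.choice(len(p), p=p)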
>>
>>101691728
where is that option
>>
File: ugly bastard.jpg (148 KB, 1280x720)
148 KB
148 KB JPG
>>101686705
very cute uniform, it would be a pity if I had an ugly bastard license
>>
Hi all, Drummer here...

>>101689595
>>101689670
>>101689962

I'm sorry but those are not meant for release. If you don't see a description, then it doesn't count and you're not supposed to mind it, especially if it's not under my account (TheDrummer).

Tuners upload their test models publicly all the time, either for accessibility or transparency.

I've even privated the safetensors so that the quanters don't make it worse by creating even more mirrors of it that I have no control over. Would it be better if I placed them in an org named BeaverTest to make that clear?
>>
>>101691933
Thank you. I still do not understand what 'precision' actually means in this context though. Is it how close to the prompt the response is? And is the quantization a setting that I change on my end or is it a selection I make when downloading the model itself?
>>
>>101692105
tavern or ooba or llama
>>
>>101692209
A model is a bunch of numbers. Like 1.812347123972397. A less precise version of this number would be 1.8. A model full of numbers like 1.8 isn't as good as the one with numbers like 1.812347123972397, but it's smaller and easier to fit in less memory.
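(A quick toy example of the drift this causes, with made-up weights and activations:)

import numpy as np

w = np.array([1.812347123972397, -0.733218841, 0.129983172])  # "full" weights
x = np.array([0.5, 1.5, -2.0])                                # some activations

print(np.dot(w, x))               # full-precision result
print(np.dot(np.round(w, 1), x))  # same math with "1.8-style" weights; the result drifts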
>>
>>101692239
>llama
Is it actually in llamacpp? I didn't see a command line option in the list for it. What's the flag?
>>
For fellow users of Fimbulvetr and Typhon Mistral: know that mini-magnum-12b-v1.1.Q6_K is even better. Fast, imaginative, descriptive, follows the prompt well, long context, top notch.
>>
File: Untitled.jpg (48 KB, 346x413)
48 KB
48 KB JPG
>>101692105
>>
>>101692289
>>101692289
>>101692289
>>
>>101692301
And which of those two should be set to 0.2?
>>
>>101692144
>BeaverTest
Why not name it something like TestNotForRelease, TestNotReadyForUse, or whatever?
>>
>>101692263
I see. And I imagine that precision matters for things like properly identifying concepts? So a less precise model may have fewer identifiers attached to a given word? For example: [DOG] might be precisely identified as (4 legs) (snout) (fur) (tail) (snarling) (golden retriever) (dalmatian) (collar) etc., while a less precise [DOG] might just include (4 legs) (snout) (fur) (tail)?

Sorry if this is a retarded question. I am still trying to wrap my head around how these models work.
>>
>>101692209
do you know how neural networks are made of weights? LLMs have billions of them. You can lower their precision to save space, for example:
1.4329324553312 - weight on high precision
1.4329 - weight on low precision
It obviously changes the calculations a bit, but on high quants not by that much actually.
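Real quantizers don't do decimal rounding though; it's closer to scaling a block of weights down to small integers and back, like this toy int8 round trip (numbers made up):

import numpy as np

w = np.random.randn(8).astype(np.float32)  # a toy block of weights
scale = np.abs(w).max() / 127              # one shared scale for the block
q = np.round(w / scale).astype(np.int8)    # this is what gets stored
w_hat = q.astype(np.float32) * scale       # this is what inference computes with
print("max reconstruction error:", np.abs(w - w_hat).max())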
>>
>>101692393
probably
associations between concepts that occur less often in the dataset may have a smaller influence on the weights and thus be more likely to be muddied by a lower precision
>>
>>101692393
no, that's not how it works. Basically a neural network turns words into numbers, does a lot of calculations inside, and on the last layer we pick the neuron (which represents a word, or rather a token, but let's not complicate things) that was activated the most (represented as a %). So if you have the sentence:
>The best friend of a human is a
and let's say we only have 3 neurons (cat, dog, cow), the activations in the last layer may look like:
Cat - 15%
Dog - 83%
Cow - 2%

Now, because we quantized the model, lowering its precision, the calculations inside the model will have a bigger error, for example:

1.342 * 0.491 = 0.658922
1.34 * 0.49 = 0.6566
Notice how, despite this being a multiplication of the same weights (1.342 and 0.491), the result is slightly different due to the lower precision. These errors can influence the results in the last layer, and we can get something like this instead:
Cat - 19%
Dog - 78%
Cow - 3%
That doesn't really change which word gets picked, but in some cases it can. The bigger the quantization, the bigger the difference in the calculations and the bigger the chance that the model chooses the wrong token at the end.
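(You can watch this happen in a few lines; the logits are made up and treating quantization error as random noise on them is a big simplification, but it shows the mechanism:)

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

logits = np.array([1.2, 2.9, -0.5])           # cat, dog, cow
print(softmax(logits))                         # roughly [0.15, 0.82, 0.03]
noisy = logits + np.random.normal(0, 0.1, 3)   # "quantization error" as noise
print(softmax(noisy))                          # shifted a little; usually the same winner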
>>
>>101692644 (me)
>The bigger the quantization
and by that I mean stronger quantization (like q2, q3, q4): the lower the number next to q, the bigger the precision loss
>>
>>101692644
I see. Thank you very much for the example, it was very helpful!
>>
>>101692644
Throw sampler gymnastics into the mix and it'll change significantly. All it takes is one single bad token to poison the context.
>>
>>101693002
there is a lot you could add to that; I generalized and dumbed it down as much as I could, otherwise I would need several posts to explain every single detail that adds to the equation


