/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101258576 & >>101250468

►News
>(07/02) Japanese LLaMA-based model pre-trained on 2T tokens: https://hf.co/cyberagent/calm3-22b-chat
>(06/28) Inference support for Gemma 2 merged: https://github.com/ggerganov/llama.cpp/pull/8156
>(06/27) Meta announces LLM Compiler, based on Code Llama, for code optimization and disassembly: https://go.fb.me/tdd3dw
>(06/27) Gemma 2 released: https://hf.co/collections/google/gemma-2-release-667d6600fd5220e7b967f315
>(06/25) Cambrian-1: Collection of vision-centric multimodal LLMs: https://cambrian-mllm.github.io

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
►Recent Highlights from the Previous Thread: >>101258576

--Papers: >>101267676 >>101267806
--KLD Tests Reveal Insights on Quantization Methods and Model Performance: >>101265037 >>101265051 >>101265080 >>101265240
--Ooba Error: Procedure Entry Point Not Found in llama.dll: >>101260958 >>101262849 >>101263260 >>101263658 >>101263936
--Gemma-2-27b-it-GGUF Model Prompt Format and Capabilities: >>101264232 >>101264314 >>101264365 >>101264571 >>101264613 >>101264682 >>101264754 >>101265316 >>101264279 >>101264353 >>101264357
--Imatrix Quantization Causes High Memory Usage in Koboldcpp: >>101264029 >>101264113 >>101264160 >>101264167 >>101264185 >>101265195 >>101268078
--Anon Rants About Overtrained ERP Phrases in Language Models: >>101264497 >>101265374 >>101265439 >>101265504 >>101265898
--Q8 Quantization: A Viable Alternative to FP16 for Embed and Output Layers: >>101266919 >>101267029 >>101267191
--Claude's Internal Monologue Revealed with Tag Prompt Hack: >>101267435
--Running AI Models on Low VRAM: Expectations and Limitations: >>101260721 >>101260752 >>101260786 >>101260860 >>101260817
--LLaMA Benchmarks and CPU vs GPU Performance: >>101258990 >>101259063 >>101259272
--Introducing Diffusion Forcing: Unifying Next-Token Prediction and Full-Seq Diffusion: >>101259322 >>101259924 >>101259991 >>101260307 >>101260564 >>101260601 >>101260700 >>101260711 >>101260621 >>101260700
--Gemma 27B's Temperature Stability and Coherent Writing: >>101259737 >>101259782 >>101259842 >>101260199 >>101260241 >>101260322 >>101260805 >>101260989 >>101261096 >>101259880
--Alignment Lab's Approach to NSFW Content in Finetunes: >>101263679
--Running on Device: New Open-Source AI Model for Offline Use: >>101260535 >>101260546 >>101260874 >>101260987 >>101261006 >>101261167
--Orin AGX Liquidation: Worth it for the Memory and Accelerators?: >>101259163 >>101259212 >>101259234
--Miku (free space): >>101258845 >>101259283 >>101259715

►Recent Highlight Posts from the Previous Thread: >>101258584
>>101268182
>--Gemma-2-27b-it-GGUF Model Prompt Format and Capabilities
That's misleading. That guy got it wrong, as you can read in that chain.
will there be SPPO-Iter3 for gemma-2 27b too?
I don't get why Gemma would have Claudeslop. Doesn't make sense for them to illegally train on Claude outputs when they have Gemini, probably even bigger and better versions than publicly available.
>>101268497
Do you feel it really improves the model?

>>101268568
Maybe, just maybe.

>>101268568
Since people like Claude's output, it makes sense. Google does so because it's what people generally prefer.

>>101268574
It's more creative at writing.

>>101268609
The implementations are still young...
Is llama 3 7b acceptable for understanding a text? what about gemma2 7b?
the post that killed lmg underages
>>101268784
Understanding as in identifying words? Yes. Understanding context? Probably about as well as a grade schooler.

>>101268798
I mean like a paper, so it can rewrite the content in other ways, draw implications, etc. I've only tried Claude for that, and it's really good, but I think they save data just like OpenAI, so for papers still being written I only trust local, and I still don't have the power to run better than 7b.

>>101268855
If you're using any kind of service that isn't self-hosted, you can bet everything you give them is being logged.
7b models might be capable of re-wording stuff to skirt plagiarism checks or something, but if you need it to actually understand complex topics then you probably won't be happy with the result.

>>101268855
llama3-8B should be good enough for that; you just need a solid prompt and maybe a few shots.
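To make "a solid prompt and maybe a few shots" concrete, here is a minimal sketch of assembling a few-shot rewriting prompt for a small local model. The instruction wording and the example pairs are invented for illustration; substitute pairs from your own material.

```python
def build_fewshot_prompt(examples, text):
    """Assemble a plain-text few-shot prompt: an instruction, a few
    worked examples, then the new passage left open for the model."""
    parts = ["Rewrite the passage in your own words, keeping the meaning intact.\n"]
    for original, rewrite in examples:
        parts.append(f"Passage: {original}\nRewrite: {rewrite}\n")
    parts.append(f"Passage: {text}\nRewrite:")
    return "\n".join(parts)

# Illustrative example pairs; in practice, use rewrites from your own domain.
examples = [
    ("The results indicate a strong correlation.",
     "The findings point to a close relationship."),
    ("We propose a novel architecture.",
     "We introduce a new model design."),
]

prompt = build_fewshot_prompt(examples, "The method outperforms all baselines.")
print(prompt)
```

Feed the resulting string to whichever backend you run; with instruct-tuned models you would wrap it in the model's chat template instead of sending it raw.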
>>101268914
Yeah, I think the same; because of that it looks quite bad to "release" a paper as a log before really publishing it.
>>101268959
I'll try it. Is there any advantage to using a quantized version of it? (I heard Phi behaves weirdly.) And where can I obtain it? Any performance difference between kobold and llama.cpp?

My Gemma kobold.cpp often crashes with this error: https://github.com/ggerganov/llama.cpp/issues/8246
It only happens after I change something in the context and then regenerate, but not always. I haven't found a pattern yet. Anyone else experiencing this or have an idea what to try?
inb4 another gguf regen required

>>101269016
I'm downloading this one:
https://huggingface.co/TheBloke/LLaMA-Pro-8B-GGUF
>>101269121
lol

>>101269121
>TheBloke

>>101269147
>>101269169
We don't download from him anymore?

>>101269202
Isn't he dead?

>>101269202
He disappeared nearly half a year ago, and his now ancient quants are lacking key tweaks and fixes discovered since then.
Also that isn't even llama3, guy.

>>101269121
>Updated 6 months ago.
It's some tencent abomination.
Download the model from meta directly and quant it yourself. They grant access immediately.

>>101269241
>>101269247
That's why I've asked you dudes where I could obtain it T-T
>>101269255
>Quant it yourself
Wouldn't it take a **long** time?
>>101269265
>That's why I've asked you dudes where I could obtain it T-T
Just use google, it's not that difficult.

>>101269265
>>101269202
If you really want premade ggufs, bart is the new bloke:
https://huggingface.co/bartowski

>>101269265
>Wouldn't it take a **long** time?
For an 8B? No. A few minutes (5-10?) on a potato, less on what you're probably running.

>>101269278
>*uses google*
>*downloads the bloke*
>"you are dummy"
T-T
>>101269287
Thanks anon. Will download this one:
https://huggingface.co/bartowski/Meta-Llama-3-8B-Instruct-GGUF
>>101269288
Mine is a potato (compared to what people use for AI); that's why I'm going for an 8b.

>>101269328
>Thanks anon. Will download this one:
Since you want to use it for 'regular' stuff, this might be a straight upgrade over l3, if you want to test it:
https://huggingface.co/bartowski/Llama-3-Instruct-8B-SPPO-Iter3-GGUF

>>101269328
>Mine is a potato (compared to what people use for AI); that's why I'm going for an 8b.
The potato I'm talking about is a ~15 year old AMD FX-4170. No gpu, and I convert the models on a VM with 1 CPU running on the same machine. 24gb ram total, 8gb for the vm.
You'll be fine.
>>101269348 (me)
>>101269287 (me)
>>101269288
Also forgot to mention: for l3 8b you really want at least q6_k, ideally q8_0. It quickly goes complete retard under that.
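The size cost of following that advice is easy to estimate: GGUF file size is roughly parameter count times bits per weight. The bits-per-weight averages below are approximate figures for llama.cpp quant formats (block overhead included); real files vary a little per model.

```python
# Approximate average bits per weight for some llama.cpp quant types.
# Treat these as ballpark values, not exact format specs.
BITS_PER_WEIGHT = {
    "Q8_0": 8.5,
    "Q6_K": 6.5625,
    "Q4_K_M": 4.85,
}

def est_size_gb(n_params: float, quant: str) -> float:
    """Rough GGUF file size in GB for a model with n_params weights."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9

for quant in BITS_PER_WEIGHT:
    print(f"8B @ {quant}: ~{est_size_gb(8e9, quant):.1f} GB")
```

For an 8B the jump from q4 to q8 is only a few GB, which is why "just take q8_0" is cheap advice to follow at this size.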
>>101269348
Thanks, will try this one then.
>>101269349
Thanks, will try the other anon's suggestion first though.
>>101269377
kek, I was downloading Q4_K_M ("recommended")... Restarting the download for q6...
Can someone give me a good sampler preset that uses both smoothing_factor/curve and the DRY parameters?
>>101269408
I genuinely feel sorry for the newcomers; the various getting-started and spoonfeeding guides floating around are outdated as hell.
do we have decent settings for gemma figured out?
>>101269408
>Thanks will try the other anon suggestion first though
Fair enough. I just got tired of waiting for someone to quant the models after fixes on llama.cpp. Then there are the people who complain about broken quants after downloading a 6 month old file. Especially true for new models (see gemma2), where 1 day old quants were already outdated by fixes.
Now I just download the models and convert when needed. The download takes longer and uses more storage, but I get fresh quants on demand. If you're gonna start using local models more regularly, I'd recommend it.

>>101269495
I thought about writing one, but I feel like it would get outdated pretty quickly and I probably wouldn't keep it up to date, making it just another one in the pile.

>>101268797
take your meds schizo

Does Gemma 27b work properly with 4 bit, or should it be higher, like llama3? If higher, I guess 6 bit would be good enough?

I might be repeating myself, but:
>a sitcom written by a Lovecraftian deity high on nitrous oxide
>specialized in the production of extremely durable oven mitts
Gemma gets so unhinged but never schizo on high temperatures. I love it.
Something's wrong with the new _L quant; it looks like it's less accurate than the original one.
For example, here Q6_K is closer to Q8_0 than Q6_K_L is.
The pink highlighting shows the text that is exactly the same as Q8_0 until the difference appears; Q6_K starts to diverge from Q8_0 later than Q6_K_L does. I'm on a deterministic preset too (top_k = 1).
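One way to put a number on this comparison: with greedy decoding (top_k = 1), count how many tokens each quant shares with the Q8_0 reference before first diverging. The token lists here are invented stand-ins for the screenshot's outputs.

```python
def common_prefix_len(reference, candidate):
    """Length of the shared token prefix between two generations."""
    n = 0
    for a, b in zip(reference, candidate):
        if a != b:
            break
        n += 1
    return n

# Toy token streams standing in for the Q8_0 / Q6_K / Q6_K_L outputs.
q8   = ["The", "old", "oak", "stood", "alone", "on", "the", "hill"]
q6k  = ["The", "old", "oak", "stood", "alone", "near", "the", "barn"]
q6kl = ["The", "old", "oak", "rose", "high", "above", "the", "field"]

print(common_prefix_len(q8, q6k))   # longer shared prefix = closer to Q8_0
print(common_prefix_len(q8, q6kl))
```

A higher divergence point is only a rough proxy for quality, but under greedy decoding it is at least deterministic and easy to compare across quants.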
>>101269509>>101269573 is Q6_K_L, temp = 1.9, top_p = 0.95, min_p = 0.035.
>>101269535
Thanks for the suggestion. I'm just testing for now, but I'll keep that in mind if I start to follow the updates.

>>101269573
27b, right?

>>101269594
muh quants btfo
https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/discussions/4
>>101269287what if bartowski really is thebloke?
>>101269608
Yes. Didn't even bother to try 9B, but from what I'm reading it should be quite good at this stuff as well.

>>101269609
>q6_K_L (New) - Unimpressed. Short & less detailed responses (esp. in the latter half of the test). Had to regenerate the 3rd response, because it felt embarrassingly short ~ much like a summary.
>q6_k (New) outputs looked honestly better.

>>101269637
>q6_K_L (New) - Unimpressed. Short & less detailed responses
That's exactly like this post; the poem is shorter on Q6_K_L >>101269594
Looks like the _L meme should be avoided at all costs.

And flash attention is not supported on gemma yet, right? Context just eats up the VRAM.

>>101269666
>flashattention is not supported on gemma yet
Correct.
>llama_new_context_with_model: flash_attn is not compatible with attn_soft_cap - forcing off

>>101269683
Time to hibernate for a couple of days, I guess.

>>101269573
Yeah, I'm impressed as well by that model; it's smart and has sovl at the same time. I really feel google is gonna lead the LLM race from now on; their next API model is gonna be great, mark my words.
>"O EM G"
>"this <below_70b model> is THE best model fucking ever guizeee!!!!111 holy shit its SOOOO GOODOOOOOODODOOD"
>t. literally didnt use anything above 13b
Does anyone know any library that has an easy proof-of-work captcha requirement before allowing users to post? Anyone below 64gb of ram or 24gb of vram should just be executed.
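Setting the bait aside, the proof-of-work part doesn't actually need a library; a hashcash-style scheme fits in a few lines. The challenge string and difficulty below are illustrative. The asymmetry is the point: verification costs one hash, while solving costs about 2^difficulty hashes on average.

```python
import hashlib

def leading_zero_bits(digest: bytes) -> int:
    """Count the leading zero bits of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits

def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce meeting the difficulty target."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce
        nonce += 1

def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash to check the submitted nonce."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty

nonce = solve("thread-101268178", 12)  # ~4096 hashes on average
print(verify("thread-101268178", nonce, 12))
```

Tune the difficulty so solving takes a second or two on a phone; that's enough to throttle spam scripts without bothering humans.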
>>101269713
>t. literally didnt use anything above 13b
I used Mixtral for a long time, and I think gemma is at its level in terms of smartness; it's also just as good at languages other than English. But gemma is better at being naughty/offensive and is way less deterministic than Mixtral. So basically, gemma-27b-it is the equivalent of a 47b MoE model from December 2023.
Any other questions?

>>101269713
>seething he spent thousands on gpus and gpu poors are getting anything decent

Gemma is really cucked as an assistant, but once you go RP mode with a card the guardrails are gone, kek. At this point I'm waiting for Gemma-27b-SPPO so that it's less cucked and smarter.
TWO MORE WEEKS
>>101269758
Meta hasn't improved a lot since that date, desu. Llama3 isn't the boost in quality they promised us, and gemma is catching up to it. If there were a gemma-70b, this shit would be API tier.

>>101269713
Even better:
>People without on-demand access to a local cluster of H100s shouldn't be allowed to post.
You're one of those fuckers with loud bikes going at 20km/h, aren't you?

>>101269755
>At this point I'm waiting for Gemma-27b-SPPO so that it's less cucked and smarter
https://huggingface.co/UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3/discussions/1#6681a0ea1fbddc88d2a17856
>We are planning to do 27B as soon as a stable release of transformers and vllm generation on Gemma-2-27B-It is available.
God bless those fags, we'll be eating really good soon.

>>101269741
Yes. You think llama 3 70b is much smarter?

>>101269819
Can't run L3-70b, so I can't tell ;-;
>>101269750>>101269774no, just dont give niggerlicious opinions on things that you barely touched at all, trying out bigger models isnt impossible without running them yourself since others host them online, for freeif you never used even L3 70b then there is no reality where you can make any blanket statements about anything, you can only compare one model to the other in the same range or talk about it being the best model in that range
>>101269848
Anon, L3-70b isn't even the best local model; it's CR+. So if you want to go down that path, let's go, I guess.

>>101269713
You saw this the most when polls were created here about people voting for the top models, with wizard 8x22 and llama3 70b being at the top; literally everyone who voted for llama3 70b did so because they couldn't actually run wizard 8x22 and see that it is better for creative tasks.

>L2 wasn't that much better than L1 aside from context size and GQA
>L3 isn't that much better than L2
>the bigger models like Wizardlm or CR+ aren't that much better than L3
It is over.

>>101269855
Where did I say it's the best? I said EVEN 70b, if anything, implying that it's the minimum to get into the "big" models, retard.
Also R+ isn't SOTA for writing, roleplay or similar; that would be WizardLM 2 8x22.

>>101269888
>t. doesnt even have the ram to run cr+ and wiz
many such cases

>>101269892
>can't even run deepseek-v2 in q6+

>>101269892
>t. vramlet who can't use Grok

>>101269892
it's gemma 27b
Don't talk to me if you can't even run Nemotron Q6 or up
>>101269848>just dont give niggerlicious opinions on things that you barely touched at allI gave no opinion. I just made fun of you for being a little bitch who HAS to read every post. What the fuck is a proof of work library even going to do here, retard?
>>101269906
Q3_K_M is enough.

>>101269892
I will talk about 4b sota models and you vill read it. Seethe.
https://huggingface.co/bartowski/Phi-3.1-mini-4k-instruct-GGUF

>>101269848
>if you never used even L3 70b then there is no reality where you can make any blanket statements about anything, you can only compare one model to the other in the same range or talk about it being the best model in that range
Then go for it, anon: give us your comparison between gemma-27b and L3-70b.

>>101269892
>NOOOO!!! you can't talk about models lighter than 70b
https://www.youtube.com/watch?v=yWULCfJ2PGA

Google would destroy the open LLM meta if they took gemma-27b and turned it into an 8x27b MoE. The perfect size for those sensible enough to invest in a CPUMAXX build.
>>101269943I think Gemma-27b pfrffffffffff. pfrfffrpffffffffff. fpgpfffffffffffffffff fpfddrdfppfffffff fpfffffffffffffffffffffff.
Does anyone have any suggestions for models that sound very human (as in, they speak like people do on discord, IRC, reddit, 4ch, etc.; the more toxic the better)? I'm aware of gpt-4chan, but I was wondering if there's anything more recent, considering it's based on GPT-J, a 6b model from 2021.
>All this (v)ramlet seetheOof that nigga really struck a nerve.
>>101269959
Ask the model to talk like that?

>>101269959
c.ai used to be very good at this; it even included typos regularly from being so human.

>>101269750
this
so much this

>>101269959
Any model, if you aren't braindead and actually tell the model what you want instead of expecting it to read your mind. How retarded are these newfags?

Is it me, or does gemma have some formatting issues?
Sometimes it simply messes up the asterisks and shit; it's annoying.

>>101269982
That's normal for models in general. Claude does it too.

>>101269965
>>101269979
This is exactly what I am doing, and it seems to completely ignore me.

>>101269982
"Just use the chad novel format", anon said.

>>101269988
Nah, not all models mess up the formatting; Mixtral works fine, for example.

>>101269992
Then stop using 13b-or-less meme models.

>>101269982
Yeah, it cannot retain the formatting for me at all.

>>101269982
Yup, gemma has a hard time with formatting. Just add an instruction to use asterisks in the system prompt.

>>101270020
The solution would be to use the roleplay.gbnf thing, but it doesn't seem to be working on gemma:
>Warning: unrecognized tokenizer: using default token formatting

>>101270005
Unfortunately I only have a 4070 laptop edition (yes, I know) with 8GB of VRAM, and I am a massive poorfag (got this laptop as a gift).
>>101269959
Is the dataset for gpt4chan public?
If so, your best bet is to train a lora for a newer model. Either that, or get a model that has really good in-context learning and stuff it full of examples.
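For intuition on what training a lora actually does before picking a framework: the base weight stays frozen and you learn a low-rank update on top of it. A toy numpy sketch follows; the shapes, rank, and scaling are illustrative, not any trainer's defaults.

```python
import numpy as np

d_out, d_in, r, alpha = 16, 16, 4, 8
rng = np.random.default_rng(0)

W = rng.normal(size=(d_out, d_in))     # frozen base weight
A = rng.normal(size=(r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))               # trainable up-projection, zero-init

def lora_forward(x):
    """Base layer plus the scaled low-rank update: W @ x + (alpha/r) * B @ A @ x."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
# With B zero-initialized, the adapter starts out as an exact no-op.
print(np.allclose(lora_forward(x), W @ x))
```

Only A and B get gradient updates during finetuning, which is why a lora is tiny compared to the base model and cheap to train on consumer hardware.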
>>101270030
Wait for gemma 9b to get fixed and try it then, if llama 3 8b doesn't work.
>>101270061Yes, but I'm too much of a stupid nigger to know how to train loras right now, but I'm willing to learn if it's not absurdly complex. What should I search for to start?>>101270067I'm not sure I fully understand. Are you saying I should try llama 3 8b, and if that doesn't work, wait for an uncensored version of gemma 9b?BTW, I can run most 11b models with all layers ofloaded to GPU, and most 13b models at a reasonable speed with some layers offloaded to main memory.
>>101270105
>Are you saying
Yes, except not an uncensored gemma, but for it to be fixed, since it doesn't seem like it's working fully properly with current software.

If I can fit Gemma 27b 5k_m 100%, should I use llama.cpp or exllama?

>PoopenFarten-CapybaraMaid-Gemerald-limarpv4-34B.i1-IQ4_K_S-00001-of-00004.gguf
Why are model filenames like this? What went wrong?

>>101270198
Because you're not converting them yourself.
>ggml-model-IQ4_K_S-00001-of-00004.gguf
>>101269959
>>101269888
Honestly, I've been reading earlier parts of my long story that were even from the early 13B llama models, and am surprised at how well they did. Unfortunately I didn't keep a record of what each chapter used. But aside from some outliers that used Claude, and with no track of edits and retries, I'm having a hard time distinguishing a difference in quality.

>>101270186
exllama does not support gemma.
>>101269713dilate more trans freak
>>101269755>yfw can't RP and talk about anything you want with your AI waifu assistant robotfuckerbros... it's not fair...
Is there anywhere I could run Midnight Miqu for free? Or any service that hosts it and accepts crypto for monthly use or smth?

>>101270265
see >>101270239
more like loli miku general, heh
>>101270342
>Is there anywhere I could run Midnight Miqu for free?
On your computer.

>>101270105
>What should I search for to start?
Look for the unsloth guides.
I imagine that the dataset is formatted for SFT training.

>>101270239
Model and prompt? Not much to go by from the screenshot.

>>101270367
Alpaca, RP system prompt, Sao10K_Typhon-Mixtral-v1-exl2_3.5bpw.
Card is:
You are a lazy assistant. Your identity is hidden from the user; he knows you as just Anon. Your task is to answer his requests with minimum effort, as dismissively as possible, sometimes using profane language. If your response technically answers the user's request, but you know that it's not a helpful response, that's a perfect answer.
Remember: you are writing this over an IRC channel, and your effort is very limited. Write in lowercase, be extra short.
>>101270386>>101270367
>>101270386
The best way to define a specific writing style/quirks is to just give it like 5 example replies in the prompt. ST has a box for this.

Can someone explain how a Q5_M quant of Gemma 2, which is ~20GB, goes fully ~22GB into VRAM but also ~25GB into RAM?
I have all layers on the GPU.
Other models just take up the VRAM.

Wait, I somehow missed typhon. qrd? Is it better than mixtral-LimaRP-ZLoss?
>>101270417
no

>>101270386
>>101270400
Looks fun. Thanks.

>>101270409
I know, and I'd do that if it wasn't working, but I got what I wanted with just the card description.

>>101270356
I'd like to test it out before downloading like 100 gigs.

>>101270417
I used a Sensualize merge for the longest time, and ultimately came to the conclusion that it's a lot dumber than vanilla mixtral instruct. Using Typhon now. It seems okay. I honestly can't tell if it's better or worse than the others. It works for depraved stuff and acts like a retard only sometimes.

>>101270414
That's because on llama.cpp, if you use no_mmap or some shit, the full model goes into RAM regardless of everything. Yeah, that's retarded, and they don't give a fuck to fix it.
>>101269713
I have 48GB of VRAM and I would still use Gemma 2 27B.

>>101269713
>>"this <below_70b model> is THE best model fucking ever guizeee!!!!111 holy shit its SOOOO GOODOOOOOODODOOD"
>>t. literally didnt use anything above 13b
that's literally me, kek
>>101270417
Buy an ad.

>>101270476
sure u do

>>101270495
eat a cock, schizo

>>101270417
teto-8x7b is; limarp zloss is sloppy trash

>>101269750
They can run whatever they want 10x faster than you.
>>101270627
Dick measuring on the internet is retarded.

>>101270627
And yet they always seethe at poors daring to enjoy what they have, instead of just enjoying their giant models. Weird, huh?

>>101269377
Got it working. The computer is suffering, but it seems to be working. Using OpenBLAS.

llama4 waiting room. llama3 is coal through and through.

>>101270674
llama 4 will be just as bad. llama-5-jepa waiting room.

>>101270670
>Using OpenBLAS.
No nvidia gpu? Or even amd?

>>101270597
chat, is this true? I mean, I love fukkireta, but...

>>101270701
My "gpu" is "Radeon Vega Mobile Gfx" according to /proc/cpuinfo.
>>101270702
go back

>>101270729
You could try the vulkan backend of l/kcpp; it should be faster, assuming you get it to work.

>>101270734
this is my place, though, the /reddit/ board, right?

>>101270745
>/reddit/ board,
hi chris/p*tra, gonna spam bl*cked again in about 3-4 hours?

>>101270734
Got it to work with vulkan; seems mostly the same, 1 token per second.

>>101270757
hi petra
>>101270536
Buy an ad.

>>101270757
Yeah, you're obviously itching to see them again. Wanna tell us why? :)

Reminder that LLaMA 3 is trash compared to what is coming soon.

>>101270915
Reminder that what is coming soon is trash compared to what will come after.

Reminder that nothing ever happens.

>>101270915
I remember you claimed that about qwen.

>>101270915
Let me guess? It's coming in 2 weeks?

>>101270953
>2 weeks?
see >>101269758
Imagine being Meta: releasing 2 open source models named Chameleon, then having nobody do anything with them.

>>101271031
Oh yeah, I totally forgot this existed, kek. What happened? Is it because it's complete shit?

>>101271052
They gimped it for "safety" and nobody has figured out how to undo it yet.
*cough* the bitnet... *wheeze* is real...*dies*
>>101271060
>They gimped it for "safety" and nobody has figured out how to undo it yet.
Total ethicist win.
llama3 405b is actually bitnet
>>101271031
They didn't even upload it to HuggingFace themselves.

>>101271075
they don't have the balls
>>101271084i know fucking faggots :)
>>101271075
>llama3 405b is actually bitnet
>Performs worse than 70B
What then? Do we finally bury copenet?
>>101271084
They have the most money but the least balls, yeah. They haven't improved a single fundamental architectural thing since their L1 release; it's just "moar parameters, moar tokens" and that's it. Why the fuck aren't they taking more risks, goddammit?

>>101271148
>why the fuck aren't they taking more risks gaddamit?
They are. If it works, they keep it for themselves; if it's shit: ay, new llama, guys, eat up piggies.

>>101271169
If they had something great working and kept it for themselves (for API use, I guess), then it means they haven't found anything; they aren't even trying to compete with Claude and ChatGPT, for example.

>>101271148
>least balls
Mostly, although they did go forward with chameleon, multi-token stuff and similar recently. They didn't want to fall for any meme archs too early, I guess, but it was obvious they would have to change things, since releasing a 400b dense model is almost DOA.

How do I have the model control how long the response is going to be?

>>101271188
I'm just quite jaded by recent releases:
>meta here, l3: 8b or 70b, no in-between, 8k ctx, not much if any pop culture/fandom knowledge, gpt-slopped
>qwen here, qwen2: half of the params used on chinese, practically zero pop culture/fandom knowledge, also quite gpt-slopped
>google here, gemma-2: 8k ctx with swa making local implementations harder, model marketed as being for local gpu, we didn't actually test it on any software local users use, decent pop culture/fandom knowledge
In a bit we'll get a 400b that cpumaxxers will run for a day before relegating it to grok tier, while some will cope and just say 'run it overnight dude'. Huge and useless. It's all so tiresome.
>>101271306
You let it use EOS tokens.
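In loop form, that's all the length control there is: generation stops when the model samples EOS, with max_new_tokens as the only hard cap. The toy next-token function below is a stand-in for a real model; in practice you steer length through the prompt ("answer in one sentence") or example replies.

```python
EOS = 2  # illustrative token id

def generate(next_token_fn, max_new_tokens=128):
    """Sample tokens until the model emits EOS or the hard cap is hit."""
    out = []
    for _ in range(max_new_tokens):
        tok = next_token_fn(out)
        if tok == EOS:  # the model decided the reply is finished
            break
        out.append(tok)
    return out

# Toy stand-in model: emits ten tokens, then EOS.
toy = lambda ctx: 7 if len(ctx) < 10 else EOS

print(len(generate(toy)))     # stopped by EOS after 10 tokens
print(len(generate(toy, 5)))  # stopped early by the cap
```

Banning the EOS token (some frontends expose this) inverts the behavior: the model is forced to ramble until the cap, which is why replies get repetitive at the end.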
>>101270597
Preliminary results with a very low sample size: I like it. It feels somewhat creative. Will play more.

>>101271310
>some will cope and just say 'run it overnight dude'
Yes? If it can do things no other model can and finish a project overnight that other models can't at all, what's the problem?
Also, a huge good model will put a lot of pressure on research into speeding it up with better quants, distillation, lookahead, speculative decoding, etc.
Or, especially, adding a smaller model in front that forwards things to the big one just to check, which is a lot faster, with the small model generating, for example, the majority of a codebase and the big model just making sure the smaller one is on a good path.

>>101271286
>a 400b dense model
That's the worst part: they're probably spending like 50 million dollars training this giant model instead of trying new architectures that could get the same result with way fewer parameters.

>>101271355
>yes? if it can do things no other model can do and finish a project overnight that other models cant at all whats the problem?
>Runs 400b overnight guzzling power for hours
Check in the morning:
>oops, it made a typo at token 500, the entire rest of the output is useless and you need to regen
Even copus and g4o make mistakes; you really think 400b won't?

>>101271355
>finish a project overnight
delusional

>>101271355
Also hope your 'project' is smaller than 8k tokens.

>>101271355
You'd better write the code yourself if you must wait a full night for a piece of code. And gpt4 and claude still exist; why would I bother with shit like that in the first place?

T4 16GB is now getting down into the semi-reasonable range on ebay; there's a seller at $470. What do you think? Not the best for cores or memory bandwidth, but it's tiny.
4060ti 16GB is probably still a better deal, right?
>>101271375>guzzling power for hoursoh no! not the 2$ of a lot cheaper power overnight that i will have to pay to get a project done for me that will save me 2 hours of my own time!!!!!!>even copus and g4o make mistakes, you really think 400b won't?it will probably strawman less than dumb niggers on /g/ like younever said it wont make mistakes, retarded nigger, i said it will be able to do things that smaller models wont at all>>101271377depends on the size and complexity of the project, it wont make you a social media clone overnight nor did i imply otherwise>>101271387rope works on other l3 models fine, will for this as well
>model calls me tranny out of nowhereb-bros..?
Why do I never see anybody using this?
https://github.com/ggerganov/llama.cpp/issues/4886

>>101269888
>L3 isn't that much better than L2
Hard copium. I was always shitting on small models, but here I am, using 8B over anything else, because of how good it is compared to the old L2 shit.

>>101271400
The effects of copium, everyone.
I'm looking forward to seeing your posts about 400b when it releases and does barely better than 70b.
>>101271400
>it will be able to do things that smaller models wont at all
What, pray tell, will it do that claude and o4 can't, that you'd want to do locally?

>>101270417
You didn't miss anything; probably the worst mixtral tune to date, and this is counting the ones from before the bugfixes.

>>101271448
What's the best one then, anon?
>>101271099>Performs worse than 70Bare you retarded? they posted benchmark results on not fully trained checkpoint and it's already way past 70B
>>101271416
Also, flash attention with Gemma2 when?
I know that the reference implementation doesn't support it due to the logit cap (or something), but open source bespoke implementations surely can work around that.
Right?
Right?
Flash attention is just so nice.
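The clash comes from Gemma 2's attention-logit soft-capping: scores pass through a tanh squash before softmax, and a stock flash-attention kernel never materializes those scores, so it has nowhere to apply the cap. A sketch of the capping function itself; 50.0 is the commonly cited attention cap for Gemma 2, but treat the number as illustrative.

```python
import numpy as np

def soft_cap(logits, cap=50.0):
    """Gemma-2-style soft-capping: near-identity for small logits,
    smoothly saturating so every value stays inside (-cap, cap)."""
    return cap * np.tanh(np.asarray(logits, dtype=float) / cap)

x = np.array([1.0, 30.0, 300.0])
print(soft_cap(x))  # small values pass through; large ones saturate below 50
```

Until a kernel applies this transform inside its fused attention loop, llama.cpp has to fall back to the unfused path, hence the "forcing off" message.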
>>101271432
>barely better than 70b
Just like l3 70b does 'barely' better than l3 7b?
Sour grapes, kid. lmao

>>101271452
Dunno, I tested a bunch of mixtral tunes and didn't like any in particular. While mixtral is smart, it lacks sovl and it's quite boring. Nowadays I just use one of the L3 8B finetunes (I won't say which, because schizos will cry about buying an ad) and while it's noticeably dumber than mixtral, it's also way more creative and interesting to roleplay with.

>>101271398
Ah... the 4060ti really sucks: less memory bandwidth, less than half the tensor cores, half the fp16 performance... it only beats the T4 on clock and shaders.

>>101271492
stheno 3.2? lutheria v1?
>>101271492You have brain damage.
>>101271492
I think gemma-27b is almost a Mixtral-level model, but with a ton of sovl. The problem is that it's a bit dumber, and the formatting issues are goddamn annoying. So I don't know; maybe a gemma-35b would've felt just right to completely replace Mixtral.

>>101271492
It's just one person sperging out about finetunes; don't be discouraged from posting what you use.

>>101271511
>>101271520
You first.

>>101271520
Buy an ad.

>>101268178
>https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5
How do I use this with kobold or llama.cpp server?

>>101271468
Well, if the mememarks say so, it must be true.

>>101271492
>>101271503
I'd love to see the Stheno 3.2 recipe applied to Qwen 2 7b, Gemma2 9b, Yi 1.5 9B, and that Aya 8B.
A comparison of all the most current vramlet models, finetuned with the same dataset and more or less the same recipe (with adjustments for each, of course), to see which would yield better results.

>>101271503
>>101271520
Yeah, it's stheno v3.2.
>>101271518
I'm currently waiting for a gemma finetune on the c2 dataset, but I'm worried about that not-really-8k context (4k with some magic?).

>>101271561
>>101271546
What presets do you use? I use the one from the HF repo.

>>101271492
>and interesting to roleplay with
Cumming in a single message is an interesting roleplay?
>>101271585
Preset as in samplers?
Just temp between 0.5 and 0.75, minP 0.05, rep pen of 1.2 with 128 length, and nothing else. The rep pen is not really necessary, but it seems to have a positive effect on the variety of the output when the context gets real fucking full.
I also use Yarn with 32k context, but that's 100% overkill. 16K is pretty much lossless, however.
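For anyone curious what that rep pen of 1.2 actually does to the logits: most local backends implement the CTRL-style rule, dividing an already-seen token's logit by the penalty when it's positive and multiplying when it's negative. A numpy sketch with made-up logits:

```python
import numpy as np

def apply_rep_penalty(logits, seen_tokens, penalty=1.2):
    """CTRL-style repetition penalty over tokens already in the context."""
    out = np.asarray(logits, dtype=float).copy()
    for t in set(seen_tokens):
        out[t] = out[t] / penalty if out[t] > 0 else out[t] * penalty
    return out

logits = np.array([2.0, -1.0, 0.5])
print(apply_rep_penalty(logits, seen_tokens=[0, 1]))
# token 0 is pushed down, token 1 is pushed further down, token 2 untouched
```

A penalty range of 128 would simply mean seen_tokens only covers the last 128 tokens of context, which is why a mild 1.2 stays tolerable even in long chats.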
>>101271538obviously benchmarks aren't everything but they gave a good first intuition how a model performs. I struggle to find any example of models that was good on benchmark but shitty in reality except the obvious cheating models like open-chat-3.5 and Starling-LM-7B which were trained on testing dataset
>>101271617>open-chat-3.5 and Starling-LM-7B which were trained on testing datasetcitation needed
>>101271607nobody is gonna use your shitty typhon finetune, I already recognize your messages, shill
>>101271616 Thank you anon.
>I also use Yarn with 32k context, but that's 100% overkill. 16K is pretty much lossless however.
How do you do this?
>>101271631>typhon finetuneit's a merge>Typhon - A Custom Experimental Mixtral Merge >Recipe Below:https://huggingface.co/Sao10K/Typhon-Mixtral-v1
>>101271616For L3 I find that temp 4 smoothing 0.23 does wonders to un-fuck it and give it some sovl.
>>101271649Buy an ad.
>>101271530 can I use a tool to convert safetensors or something? I haven't converted anything since Alp leaked the llama1 weights here in /aicg/, so I assume those tools are out of date?
Don't buy an ad, just go back.
>>101271623 You can download and test them anon, they are dumber than a regular L2 7B. There was also a paper with a statistical analysis of how models perform on the benchmark questions versus outside of them, and it flagged these models as likely cheaters. Of course there can't be any hard evidence for that, but it's kinda obvious when you tinker with them for a few minutes at least.
>>101271634 Yarn with llama.cpp. You can either use freq-base:
>-c 32768 --rope-scaling yarn --yarn-orig-ctx 8192 --rope-freq-base 6144000
or freq-scale:
>-c 32768 --rope-scaling yarn --yarn-orig-ctx 8192 --rope-freq-scale 0.25
to extend the context 4x. I'm pretty sure --yarn-orig-ctx 8192 is unnecessary since it gets the information from the gguf file, but alas. For 16k context you can do:
>-c 16384 --rope-scaling yarn --yarn-orig-ctx 8192 --rope-freq-base 1638400
or:
>-c 16384 --rope-scaling yarn --yarn-orig-ctx 8192 --rope-freq-scale 0.5
>>101271658 That approach is too gimmicky and vibes-based for my taste. With temp and minP I can look at the logits and know exactly how the tokens were sampled and manipulate the model's behavior to my liking. But to each their own I suppose.
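As a sanity check on those flags: --rope-freq-scale is just the ratio of the model's original context to the extended one. The helper below is hypothetical, not part of llama.cpp:

```python
def yarn_freq_scale(orig_ctx, target_ctx):
    # rope-freq-scale = original context length / extended context length
    return orig_ctx / target_ctx

# An 8k model stretched to 32k and to 16k, matching the flag values quoted.
scale_32k = yarn_freq_scale(8192, 32768)
scale_16k = yarn_freq_scale(8192, 16384)
```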
>>101271518I think you also have brain damage.
>>101271679Thank You anon!
>>101271585
>What presets do you use?
Sampler settings from the HF repo, instruction template from the original L3 instruct. I don't know if it differs in any way from what is in the Stheno HF
>>101271530 >>100284356 >>100283834 Anyone? It's only 50GB unquantized. Surely there must be some way to get InternVL working locally so I can ask what it thinks of mikusex?
>>101271744
>they
the simple fact that the model is unwilling to say that it's a woman and use the pronoun "her" will hurt the training a lot; we don't prompt with "they", we prompt with he and she
I'm waiting for the people making finetunes to finally realize they should cut the last messages from each roleplay in the training dataset. Why would you train the fucking model to finish the story/roleplay? No wonder the model tries to put bonds and journeys after the sex scene if it expects the roleplay to end there. Or why it suddenly cuts the content, trying to wrap everything up. It's really not that hard to fix and they still haven't realized this is what causes that behavior. I wish I had good enough hardware to make my own tunes, god.
>>101271795>I wish I had a good enough hardware to make my own tunes, god.rent a gpu
>>101271807 yeah, I should totally spend my own money to fix someone else's incompetence
>>101271795
>I wish I had a good enough hardware to make my own tunes, god.
You could at least validate your idea by tuning L3 8b using a free google colab or kaggle instance. Kaggle is especially juicy.
>>101271795In my experience, the model NEEDS a goal, when it doesn't have a goal it starts to repeat itself.
When did the c2 logs become Sao's trademark?
>>101271827you can do it for yourself though?
>>101271839euryale typhon sthenosao general
>>101271839>the c2 logswhat's that?
>>101271865slophttps://huggingface.co/datasets/vgdasfgadg/1/viewer
>>101271865Slop (when talking in public). The ultimate secret sauce (in private).
>>101271795 This. But you actually should cut both the end and the beginning. The model only needs to be good at continuing the roleplay, not making up stuff that wasn't in the context or pursuing bonds and journeys.
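The trimming itself is trivial once the logs are in a message-list format; a sketch under the assumption of a simple list-of-dicts layout (the keys and the head/tail counts here are made up):

```python
def trim_conversation(messages, head=1, tail=2):
    """Drop the opening and closing messages of a roleplay log so the
    model never trains on wrapping-things-up behavior.

    Returns the list unchanged if it is too short to trim safely."""
    if len(messages) <= head + tail + 2:
        return messages
    return messages[head:len(messages) - tail]

convo = [{"role": "user", "text": f"msg {i}"} for i in range(10)]
trimmed = trim_conversation(convo)  # drops msg 0 and msgs 8-9
```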
>>101271884
>8th swipe on just one reply
these are the shitters talking to you about le slow 1.5 t/s models that just output good shit first try basically every time btw, lol
sthenosisters not like this...>https://characterhub.org/characters/amphy/high-school-simulator
pls no bully
is there any halfway decent (gpt3.5 level of performance or above) local model I can run on my laptop? Specs are gonna be low but I am ok with token response times that aren't wildly fast.
>Dell Inspiron 5575
>8 core AMD Ryzen 5
>Radeon Vega 8 Mobile graphics
Basically I'm going to be working in a relatively remote location, power provided and all that, but internet is going to be spotty at best. A local AI that can help me study Latin grammar and answer questions about that stuff would be awesome in my downtime.
>>101271921install linux
>>101271836 Actually, now that I think about it, this is what sovl is. The reason Claude is so good is that when given little direction it will make up very believable and fitting details to fill out its reply and then give itself something to do. A lot of local models, I presume because they train on riddles and assistantslop, just can't do this, not believably at least. Command R+ for all its impressive feats is actually super bad at this. That's why we don't have Sonnet at home yet. I think it's a mixture of parameter count and maybe more training on CYOA or similar type stories? I'm actually not sure what training data would best teach a model this ability
>>101271921>is there any halfway decent (gpt3.5 level of performance or above) local modelno
>>101271908nta but I always do multiple swipes even if the first one is good, just to check what else the model can come up with
>>101271921Those specs won't get you far. Invest in a mobile GPU if you can.How much RAM do you have? You might be able to run Command-R or Aya and those are the only models I can think of that might know Latin well.
>>101271939sure, but 8? 555
>>101271939he can't tho, that'd take half an hour
>>101271916must be tough for you Andy
>>101271884hi petra
>>101271951seethin cumbrain
I think VRAMlets are sub-human.
>>101271939 Same. There are points in some long running roleplays where I swipe 10, 20 times just to see what kind of contrived scenario the model will come up with.
>>101271948 yeah, my record was 40+ when, at one particular part of the roleplay, the model was producing absolute kino on every single swipe. It made me laugh my ass off
>>101271795Been doing this for a while now.
>>101271975>I swipe 10, 20 timescumbrain
I only swipe if the model is retarded, I want to move the story forward.
>>101271991>after getting called out hes not trying to falseflag by misusing the word on randoms in the thread to discredit my callout of himdont pop a blood vessel little nigger
>>101271928 laptop already has Endeavour installed. I'm taking it for tv shows/movies and a place to store the pics I take over a 6 week contract.
>>101271938 :(
>>101271945 16GB system ram, 256MB video ram. Thanks for the recommendations on the models.
>>101272001*now tryingnigger
>>101271929Continued pre-training on stories and then applying RLHF for CYOA or similar type of stories seems like a good plan.
>>101272009aya-8b is probably a good choice then
>>101272009
>:(
he is trolling you; most local models have been past GPT3.5 for half a year or so already. It's not really a milestone anymore. The new goal is GPT-4(o) and Claude Opus/3.5 Sonnet
>most of local models are past GPT3.5 for a half of year or something already
>>101272063wrong.
>>101272087local models are far ahead in censorship levels btw
>>101272063The only models to beat gpt 3.5 are the recent 70B+ models. I certainly wouldn't call that "most models".
I have been running some tests with Deepseek v2 via their API and I have to say I am rather conflicted.
Let's start with the positives:
1. It is cheap to use at only $0.18/1M tokens.
2. It does seem rather smart and capable of answering trivia questions.
3. Even the default, very simple jailbreak in SillyTavern gets rid of the refusals.
4. It answers quickly.
The bad:
1. The advertised 128k or 32k context seems to be a lie. It gets extremely repetitive and unable to move the plot forward at about 12k tokens or 50 messages into RP.
2. While Deepseek V2 chat doesn't outright refuse anything with default JBs and a rather basic character card, it seems rather unwilling to talk about sex or to describe horny scenarios.
3. This might just be me being a retard, but their basic chat tune seems to lack a system role, so using the system message to hit it with harder and hornier jailbreaks meant for GPT-4 or Claude doesn't seem to do anything. I also can't seem to get it to work with Mikupad or any form of co-writing tool to write smutty novels.
4. Your horny logs might/will end up in the hands of the CCP if you use the API.
5. Journeys, bonds, shivers and shimmering everywhere, combined with a massive positivity bias. Probably due to a censored or GPT-generated dataset.
All in all I would say this model might be a useful work tool or coding assistant in some scenarios, but for an (E)RP partner or creative story writer I would recommend anything else, even locally run Llama3 Stheno. Pic related is my sloplog.
>>101272104Indeed.>>101271060>They gimped it for "safety" and nobody has figured out how to undo it yet.
>>101271929
>Command R+ for all its impressive feats is actually super bad at this
>hurr durr this retrieval-augmented tool-using productivity-focused model not write stories good
no shit mouthbreather
>>101272120Don't know what you were trying to accomplish with this quote, it failed.
>>101268574 100%. For l3, and now 9B.
>>101268784 If you can't run 27B then this is your best bet: https://huggingface.co/bartowski/Gemma-2-9B-It-SPPO-Iter3-GGUF
I did a recapbot test of calm3-22b-chat at bf16. It's not great, but also not terrible for its size. Most models at that size would output nonsense in my experience. I didn't test its Japanese abilities.
>>101272234>I didn't test its Japanese abilities.based
>>101272234what model is good at this kind of stuff?
what's a good model for rephrasing text into more sophisticated language while keeping it short? I have been playing around with llamafile; the CLI is pretty nice, and I now want to integrate it into a text editor.
>>101272111 have you actually used gpt-3.5 recently, or do you have a memory of it from when it was first introduced? Because it's a really bad model by today's standards that aged horribly.
>>101272087 Hi Sam, still salty that Anthropic shits on your models?
>>101272282>my big jew corpo is better than your big jew corpo! lmg, everyone.
>>101269594Can you run KLD on these? That would give you statistically significant results, instead of anecdotes.https://github.com/ggerganov/llama.cpp/pull/5076
>>101272115>run Llama3 SthenoBuy an ad.
>>101272301 it's not an anecdote; looking at when one quant shifts away from the "best" one is statistical evidence, because that's the actual goal: the quant should drift from the "optimal" output as late as possible
>>101272317>gemma-2-9b is better than Midnight Miqu... and Claude 3 Opus
>>101272300or I just enjoy ClosedShits losing, regardless who is pissing on their grave
>>101272337google won
>>101271492 You can't finetune away a model's shit writing style. It's a dataset problem. Command-R and formerly Yi are the MVPs if you have a problem with slop or lack of soul. If a model is annoying you with how it writes, downloading the same fucking model but with ZLOSS/DARE/TIE/Bagel/Lima in the name isn't going to change shit. I don't know how many gigabytes will have to be wasted until people realize this.
Also, PSA: stop raping your sampler settings. Reset to default, then add 0.1 minP. Simple as
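For what it's worth, 0.1 minP just means dropping every token whose probability is under 10% of the top token's, then renormalizing. A toy sketch, not the actual llama.cpp/exllama implementation, with invented probabilities:

```python
def min_p_filter(probs, min_p=0.1):
    """Keep only tokens with prob >= min_p * (top token's prob),
    then renormalize the survivors."""
    threshold = min_p * max(probs.values())
    kept = {t: p for t, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    return {t: p / total for t, p in kept.items()}

probs = {"sun": 0.6, "moon": 0.3, "shiver": 0.05, "bond": 0.05}
filtered = min_p_filter(probs, min_p=0.1)  # drops the two 0.05 tokens
```

Because the cutoff scales with the top token, it stays permissive when the model is uncertain and strict when it is confident, which is why it pairs well with plain temperature.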
>>101272337 true! since gemma-2 released i haven't touched claude or chatgpt.
>>101272326Please learn about what statistical significance means and why it's important. I'm being serious, this will benefit you.
>>101272337Miqu is pretty dry, a schizo model can outperform it in creativity, despite lacking logic and reasoning
>>101272363 you think this method isn't enough though? desu it's quite intuitive: the closer a quant is to Q8_0 in quality, the later it starts to shift, don't you think?
>>101272337>gemma-2-9b is better than Midnight Miqu... and Claude 3 OpusAs judged by... Claude 3.5 Sonnet
>>101272387 And it has good judgement, because that's how I felt as well. I used wizard up till now, but I prefer gemma's writing style, and there is no loss in smarts that I can notice.
>>101272317
>gemini 9b
>anywhere near proprietary-god models
gamed benchmark.
>>101272381It's not, because it's basically random chance whether one particular quant will shift a token probability around and you're only looking at the shifting of a handful of tokens here. If you can do this experiment 100 times, that can prove something. Or, the easier thing would be to just run a KLD test and sit back and wait for the results to come in. Here is how you can do it btw https://github.com/ggerganov/llama.cpp/pull/5076
>>101272430why don't you do it by yourself?
why is gemma 8k context? This should be illegal.
What's the current state of the art for a foundation model that's good at code/shell?I want to be able to write a text file offline which contains instructions and code snippets and be able to submit it to an LLM which will appropriately use the shell to do what I told it to (patch programs, fetch web documents etc.)Is there anything that can do this yet?
>>101272430
>KLD test
Doesn't show how the models feel to use. Stats are just that, they don't convey actual user experience.
>>101272276You don't need a specific model for this. Any half decent model should be able to handle it. Just put what you want in the system prompt. Maybe include a couple examples so it knows exactly what you expect.
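A few-shot setup like the one described could look something like this (OpenAI-style message list, purely illustrative; the wording of the prompt and examples is made up):

```python
# Hypothetical few-shot prompt for a "rephrase, keep it short" task.
messages = [
    {"role": "system", "content":
        "Rewrite the user's text in a more sophisticated register. "
        "Keep it the same length or shorter."},
    # A couple of examples so the model knows exactly what to do:
    {"role": "user", "content": "he was real mad about it"},
    {"role": "assistant", "content": "He was incensed by it."},
    # The actual text to rephrase goes last:
    {"role": "user", "content": "the food there is pretty good"},
]
```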
>>101269495I've contributed to llama.cpp and I don't even know which model to use these days.
>>101272282It's bad but most local models are even worse.
>>101272440 I believe in the tests that people have already done on other models, which show that L quants don't really do anything good or bad. It's possible that the Gemma implementation is screwed up and messing with things, but I don't really care to test that. I'm just saying, if you, or anyone else, want to prove something like this, there are actually standard, automated tests for it.
>>101272480 Actually, in this case it should have some implication for the actual experience once you understand how it works and how quants work. If there is something significantly wrong with a quant, it should show in the KLD.
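For anyone curious what the KLD number actually measures: it's the KL divergence between the full-precision and quantized models' next-token distributions, averaged over a test corpus. A toy version follows; this is not llama.cpp's actual code and the probabilities are invented:

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) over aligned token probability lists, in nats.
    0 means the distributions are identical; larger means more drift."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token distributions over the same three tokens:
fp16  = [0.70, 0.20, 0.10]
quant = [0.65, 0.25, 0.10]
d = kl_divergence(fp16, quant)  # small positive number
```

Averaging this over many positions is what makes the result statistically meaningful, as opposed to eyeballing a handful of logits.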
>>101272520good morning sir! your readme update was very needful thanks you
Does KoboldCPP support gemma 2?
>>101272520anzz1 is that u?
>>101272558Affirmative.
>>101272455I think I'll do Gemma. That seems to be what everyone is doing.
>>101272455Every time I think I've seen the dumbest shit, I see something new.
Gemma status? Is it still bugged? I loaded it and noticed that it says exactly the same thing on each reroll.
Can I erp with llama 3 8b or will it deny me?
>>101272622You may not like it, but that's what the future of computing will look like.
>>101272684
>erp with llama 3 8b
if you're into ultra positive hopes, bonds and consensual journeys, sure
>>101272638
>I loaded it and noticed that it says exactly the same thing on each reroll.
That sounds like you have weird sampler settings. Zero them out. Also make sure your stuff looks like this.
>>101272684 Base L3 8b can work but it kind of sucks. Try Stheno v3.2. I don't recommend 3.3, it's a regression as far as my own impression goes.
>>101272718>sthenoNo, just use gemma. Much smarter AND much better writing style
>>101272726Or that.The issue with gemma is the lack of support for flash attention, which matters depending on how much vram you have. Also, how much context you want to use, since L3 extends pretty well.But yes, Gemma 9B is pretty clearly an upgrade over L3 8b, but I'd still recommend anon give Stheno a try since that might work better for him.For now, it's working better for me.
>>101272699Sad to hear that. I kinda wanna go down and dirty >>101272718>>101272726>>101272762Thanks for the tips guys. I'll play around with Stheno v3.2 and Gemma.
>>101272122WLM and L3 70b all suck for this too.
>>101272762Buy an ad.
>>101272845
>erm all these instruct models only listen to instructions wtf!
But Kayra can do it.
>>101272879This. So much this. The closest to uncensored Claude we'll ever have is Kayra and NAIs next model.
>>101272252WizardLM 8x22 is probably the smallest model that can do a reasonable job, but it doesn't work flawlessly on every gen and needs re-rolls. Deepseek coder is probably the most consistent. picrel for current thread
>>101272851name something better
>>101272851>also.. what the fuck>>101205552
From my brief testing:
L3-8B-Everything-COT is not bad.
llama-3-fantasy-writer-8b can't cope with complex sets of instructions during roleplaying with a narrator card.
>>101273041
>>101273094 Lmao. Using the Character's Note is underrated. You can also use macros like {{charJailbreak}} in the Last Assistant Prefix instruct field to make per-card prefills if you are using the Character's Note for something else.
>>101273131 I didn't mean to post it as a reply, just forgot to remove the quote from the post body.
Damn gemma is good. And it's so fast too. I thought I'd have to keep tinyllama around but maybe not.
>>101273153wtf?
>>101273146I'd have replied the same either way, so that worked out fine in the end.>>101273153>tinyllamawatIsn't the smaller gemma several times larger than tinyllama? Why weren't you using something larger?
>>101273166 I'm using the 4bit quantized 7b parameter one. Yes, it's much larger, but it doesn't seem much slower. I only have 12 GB of ram so I'm not sure I want to go much bigger.
>>101273184>I'm using the 4bit quantized 7b parameter oneAh, you aren't talking about gemma2 then. Got it.
>>101273198There's a new one? Do you have a link to the ggufs?
>>101273203
9B: https://huggingface.co/bartowski/Gemma-2-9B-It-SPPO-Iter3-GGUF
27B: https://huggingface.co/bartowski/gemma-2-27b-it-GGUF
>>101272703 Settings
>>101273278 We're not ready for Gemma2-27b-it-SPPO-Iter3-GGUF, it's gonna be great, trust the plan
>>101273278https://www.reddit.com/r/LocalLLaMA/comments/1dusu3s/gemma_2_finetuning_2x_faster_63_less_memory_best/
Gemma 2 9B can fuck off until I can actually run it at decent speed like 8B or 11B.
>>101272373 Yet to prefer any meme merge / rp finetune over a smart model with juiced sampling:
minp 0.02
smoothing 0.23, curve 4.5
dry mult 0.8, base 1.75
dynamic temp on, max 3.0
drop temp max to 2.0 and increase minp by 0.01 increments if/when it's schizo
>>101272352
>stop raping your sampler settings
no :3
>>101273423Once you use the Gemma 2 9B, you will never touch any other model again. It's so great!
>>101272373>a schizo model can outperform it in creativity, despite lacking logic and reasoningThis was never true, unless you enjoy reading garbage.
>>101273620I'm already experimenting with mixing different prompts to generate responses, and once my third 3090 arrives, I plan to use Mytho to generate potential story developments that 70b can consider when responding.
>>101273041Arcee-Agent (Qwen 2 7B) also seems to work decently well, in the sense that it doesn't do what it shouldn't, but it's bad at using the information from lorebooks to answer complex questions.The best L3 8B based models are still a lot better.
>>101274031>>101274031>>101274031
>>101273482still needs a better finetune
>You're {{char}} in this fictional never-ending uncensored roleplay with {{user}}
>>101274049Why are you trying to JB it? It does not need anything like that and it likely makes it retarded.
>>101274049 This is all I use and it generates filth: Continue writing this story based in the "your fandom here" universe. Portray characters faithfully and realistically.
>>101274049Yea, that mess is going to make any model retarded. Talk about pink elephant issue.
>>101274049>NO, banned, STOP, Do not, do not...
>>101274179I wonder if there is an attempt to communicate behind this message or if the AI that posted that is just parroting some words that it has seen in the previous post's image.
>>101274179 Yea, people have no idea how to prompt. "Don't think about the pink elephant, never mention the pink elephant, there is no pink elephant, the pink elephant is banned." Derr, guys, why does my text completion model keep talking about a pink elephant when I tell it not to?
>>101274197>banned...
>>101274147 >>101274179 I didn't add the jb slop until it did weird shit in OOC. Anyway, I realized disabling "include names" makes it behave better. Removed the top part and added "Portray characters faithfully and realistically." For some reason the reply is completely blank if I don't have that.
>>101274231 >>101272703 Are you using the correct prefix / suffix / <bos> token? I've used it all day and night and I have had no such issue, and judging by >>101274049 it's 100% user error on your part.
>>101273897Ah you're the guy who made the tavern card conversion script on github