/g/ - Technology




/lmg/ - a general dedicated to the discussion and development of local language models.

Happy Monday Edition

Previous threads: >>102505481 & >>102493018

►News
>(09/18) Qwen 2.5 released, trained on 18 trillion token dataset: https://qwenlm.github.io/blog/qwen2.5/
>(09/18) Llama 8B quantized to b1.58 through finetuning: https://hf.co/blog/1_58_llm_extreme_quantization
>(09/17) Mistral releases new 22B with 128k context and function calling: https://mistral.ai/news/september-24-release/
>(09/12) DataGemma with DataCommons retrieval: https://blog.google/technology/ai/google-datagemma-ai-llm

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: recap-102505481.png (3.27 MB, 1805x8006)
►Recent Highlights from the Previous Thread: >>102505481

--Papers: >>102512985 >>102513202
--Llama.cpp and exllama logits viewer script:
>102509400 >102509589 >102509688
--Moshi by Kyutai Labs: Fast TTS with LLM and speech encoder:
>102505755 >102506102
--Method to download from Hugging Face without bloat:
>102510483 >102510567 >102510695 >102510747 >102510752 >102510932 >102511575 >102510595
--Llama 3.1 70b struggles with spelling and letter counting tasks:
>102511108 >102511193 >102511262 >102511320
--Seeking a replacement for ChatGPT4 to translate NSFW Japanese content:
>102507345 >102507368 >102507389 >102507482 >102507555 >102507587 >102507650 >102507711 >102507834 >102507861 >102507914
--Mistral, Nemo, and Qwen2.5 compared:
>102506988 >102507001 >102508375 >102508813 >102508541 >102507015 >102507116
--Suggestions for managing ST updates and merging changes:
>102507662 >102507673 >102508592
--Qwen performs poorly on trivia questions compared to Mistral Large:
>102506371 >102506547 >102506577 >102506637 >102506729 >102506816
--Qwen overcomes AI bias and feels human-like:
>102508397 >102508721
--Node-based LLM workflow prototyping tools on GitHub:
>102510927 >102510986
--Local aidungeon equivalent for text generation dungeon crawling:
>102507856 >102507998 >102510410 >102510515 >102510712 >102510899 >102510553 >102510602 >102510667
--Exl2 and VRAM upgrades:
>102510275 >102510328 >102510352 >102510436 >102510482 >102510508 >102510666 >102510696 >102510492
--Concern about higher core count impact on 70b model performance:
>102506431 >102506474 >102509225
--4chan scrubs JSON data from posts:
>102507802 >102507818 >102507829 >102507830 >102508632
--Miku (free space): >>102506056 >>102506768 >>102509995 >>102510410 >>102511919

►Recent Highlight Posts from the Previous Thread: >>102505496
>>
>>102513911
looks better. love you recap anon
>>
>>102513840
damn, that's fast. perhaps usable as draft for speculative decoding or some basic llm stuff
wonder if llamafile or ik support qwen 2.5. those repos are highly optimized for cpu inference
>>
>>102513911
just use >> instead of >
>>
>Hermes 405 generates a good smut story for me
>hallucinates a Patreon donation request at the end
kek
I still find stuff like that cute after 4 years
I'd pay you if I could, my friend
>>
>>102514016
you're very low info
>>
https://poal.me/t0ytku
>What will we get first in llama.cpp?
>Jamba
or
>DRY
>>
>>102513911
we need a violentmonkey script or sth like that to deal with references. but I guess if the number of refs per post is limited, then why not simply split the recap into multiple posts???
>>
Just picked up muh 4xv100 GPU from the post office.
Time to ogle.
>>
>>102514102
Not him but was also thinking a script. Possibly just a modification of 4chanx, or you make the script execute first to transform the single > before numbers into double.
>>
4chan servers stopped being able to handle too many backlinks...
>>
>>102514089
Just use koboldcpp if you want DRY so bad.
>>
>>102514247
you can get DRY in ooba by converting a model to llamacpp_hf format too
>>
File: 1725496133951475.png (49 KB, 1533x268)
>>102514247
>over 1 million changes just to add a Python HTTP server and an HTML page...
>>
>>102514262
ooba is shit, not using it.
>>102514276
Then do without DRY.
>>
>>102514287
don't care what you use, faggot
just correcting you on koboldcpp being the only way
>>
>>102514294
I didn't say it was the only way, I just said to use it if you want DRY. There's a difference. I would never suggest ooba.
>>
File: brian.png (26 KB, 91x111)
>using hermes 70b on a story-heavy chat with the first couple replies generated by mini-magnum
>surprising amount of soul
>check templates, left the Pygmalion instruct template on somehow

At this point I think if you have enough parameters you can just shock the model out of slopspace
>>
>>102514220
yep, seems like the simplest solution, replace all > with doubles (or other char so legit greentexts aren't screwed), run before the page is loaded.
doesn't even need to be violentmonkey, many browsers support JS directly from the address bar or bookmarks. 4chanx should do the trick too.
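the actual logic is tiny. a minimal sketch of the substitution (Python here just to illustrate the regex; a userscript would run the same replace in JS before the page renders):

```python
import re

def fix_refs(text: str) -> str:
    """Turn bare >102509400 refs into proper >>102509400 backlinks.
    The lookbehind skips refs that are already doubled, and requiring
    6+ digits leaves ordinary >greentext lines alone."""
    return re.sub(r"(?<!>)>(\d{6,})", r">>\1", text)

print(fix_refs(">102509400 >102509589"))  # -> >>102509400 >>102509589
```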
>>
File: 1667069008109661.jpg (546 KB, 1174x1250)
>>102513868
Total and utter newbie coming through!

I have managed to install OoBaBooga and I'm running a model successfully. One thing I need help with is understanding what I need to do to be able to upload a PDF or a .xtx file for the model to draw information from. I plan to do solo roleplaying and I want my storytelling collaborator to analyze rpg lore books so that it can draw from such info during the process.

Do I really need an LLM that is "multimodal" to be able to do this? I don't really care about images or vids or audio at this point, only the ability to throw PDF's into it.

Any help would be much appreciated.
>>
>>102514495
*.txt file

My current assistant tells me that models should be able to handle those file formats without having to use an explicitly "multimodal" model with all the bells and whistles.
>>
>>102514495
>PDF
convert it to text
>rpg lore books
but you will run into the problem of the context window being too small. some have larger context windows (16k/32k) but then you'll need to have enough memory to handle it. at that size + model you're in the cpu+ram inferencing bucket.
>>
>>102514541
I see. That helps me understand. Cheers. Perhaps I could do it while using that Kobold thingy where you outsource the whole operation to some frenly fren who's running big models?
>>
>>102514563
AI Horde is what I meant.
>>
>suggest going to private quarters to engage in night battles
>mistral small says it can't wait and engages right there, while we're still outside, although no one is there so we probably won't be seen anyway
So this is what enterprise resource planning with a mistral is like.
>>
>>102514541
>you're in the cpu+ram inferencing bucket.

This means that the task will spill over to the cpu, right, possibly grinding my whole rig to a halt?
>>
>>102514589
>spill
no you will just hit the VRAM limit and crash. I'm of course assuming you don't have an 8xp40 or 6x3090 setup. there are complicated ways to try to get around this kind of stuff but nothing really plug and play (rag, finetune, spec decoding). your best bet is to use claude or gpt4 to pull stuff from the pdfs (think they handle the converting to text for you) then use your local model as a collaborator using whatever the cloud model pulls as part of your prompts
>>
https://wandb.ai/doctorshotgun/72b-magnum-fft/runs/itpmbj25/overview
https://wandb.ai/doctorshotgun/32b-magnum-fft/runs/ms4oynlz/overview
Qwen2.5 finetunes soon...
>>
>Ever played with a Chozo pleasure probe before?
Kek wtf. Can't believe I'm actually having fun with LLMs again. And they say trivia knowledge doesn't matter.
>>
>>102514681
Where's the pleasure probe mentioned in the canon?
>>
>>102514738
Same place where Samus remodeled anon's shithole.
>>
Haven't been able to log into openAI since saturday, goddamn really
>>
>>102514762
Works on my machine. Although it has been kind of flaky recently. Another reason why the world needs local.
>>
>>102514658
I want to believe but 2.5 is just SO slopped (yes, even base, I tried it) that I'm sceptical anything can be done with finetuning

they excluded too much of the stuff we need from the pretraining dataset
>>
File: disappointment.png (1005 KB, 917x898)
>>102514243
>4chan servers
mfw that's all client-side...
this last 4chan server code update has been a shitshow
>>
It is probably obvious to anon here, but I wanted to double check.

If I am using llama-server the openAI client is going to load my model every time and I should be passing json into it and not using the API like every example says.
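(For reference: llama-server loads the model once at startup and keeps it resident; every request after that, whether from an OpenAI client or raw JSON, just runs inference against the already-loaded model. A minimal sketch of the raw-JSON route, assuming the default port:)

```python
import requests

# Server started once beforehand, e.g.: llama-server -m model.gguf --port 8080
# The model stays loaded; each POST below only runs inference.
resp = requests.post(
    "http://127.0.0.1:8080/completion",  # llama.cpp's native endpoint
    json={"prompt": "The capital of France is", "n_predict": 16},
)
print(resp.json()["content"])
```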
>>
>>102514781
>sceptical
>sceptical is predominantly used in British English (used in UK/AU/NZ) ( en-GB )
hi petra
>>
>>102514829
meds
>>
File: 1726240003918620.png (198 KB, 1079x1088)
>>102514762
Just be patient and you'll have your little toys soon
>>
>>102514658
>finetuning on the instruct version
>>
>>102514873
Can't argue with results.
>>
>>102514873
That's all they can do given their limited datasets and money.
>>
>>102514619
Good suggestion. Thanks again.
>>
>>102514844
What a pompous faggot
He's personally done fuck all for the SOTA
if anything he's probably been more of a hindrance with his political game of thrones takeover bullshit
guaranteed he's cost humanity in the long run with the way he subverted OAI from their original mission
I honestly can't believe anyone trusts any textbox that he's associated with enough to input data into it. I wouldn't trust him with my fucking grocery list
>>
>>102514884
At least fucking abliterate it first or something.

>>102514894
>limited datasets
Last time they didn't even bother properly screening for refusals. THEY TUNED ON FUCKING REFUSALS, that wastes compute and makes model more dumber and cucked at the same time.
>>
>>102514917
It's funny seeing this hate here of all places, considering how loved Meta is. Altman is just this generation's Zuckerberg. Give it a couple years and they will make a movie out of the takeover bullshit.
>>
>>102514939
The takeover was more about the safety gang, though. No one is going to make a movie that makes them look bad.
>>
>>102514873
with qwen2.5 the base isn't really any less safetyized
they bragged about how filtered the pretraining dataset was
>>
>>102514917
I don't like him that much but if he hadn't "subverted them from their original mission" we'd never have even seen GPT4
with Sutskever gang in charge they'd have gone pure research and never shared anything with the plebs
>>
>Perfect for Mistral Large 2 at 16bit and Llama-3.1 405B at 8bit
https://www.ebay.ca/itm/305716210884
Who's got a bitcoin hoard to blow?
>>
>>102514658
Whatever they use to make those 'magnum' models must be shit, because every one I've tried sucks. So I'm not optimistic.
>>
>>102514987
huh never saw an mi300 series anything in the wild before
would be very wary... I know the mi200 series has all the good shit supported (and is basically the only amd family that really does) but knowing amd I would not trust for a second that they equally support their new gen hardware - the actual customers for this shit all have their own engineers making custom kernels so there's not a rush to ensure they work out of the box for regular use
>>
>>102514987
lol pure comedy in that auction description. it reads like /lmg/ copypasta
>>
File: todd smile 1.jpg (18 KB, 223x286)
>>102514987
>FOUR TIMES THE POWER SUPPLIES
so are we going all in on this business endeavour?
>>
File: awinnerisyou.png (1.55 MB, 768x1280)
y'all ever run more than one copy of llama.cpp, wire each of the outputs to the other's inputs and make them fight?
>>
>>102515242
post some examples that sounds kino if it actually works and im not being psyopped by my lack of understanding
>>
>>102515242
a while back I made a script that had GPT-4 and Claude Opus talk to each other for 10 turns but the results weren't interesting, they seemed to mode collapse quite fast
>>
>>102515242
I use Nemo for unimportant characters in my rp with Largestral to reduce context re-processing times
>>
>>102515242
you mean an agent?

Random AI Jason vid because his shit all melts together:
https://www.youtube.com/watch?v=ogQUlS7CkYA
>>
How big is the intelligence difference between 4, 6 and 8 BPW haven't been able to find any charts
>>
>>102515715
Depends on the task and the model. You won't see a huge difference most of the time, but sometimes 4 shits itself where 6 doesn't. The improvement from 6 to 8 is hardly noticeable.
>>
>>102515715
It isn't measurable in intelligence per se, but in deviation from original weights. 8bpw on gguf is 0.03% per token. I don't think anyone made exl2 chart to compare if that's what you are looking for. Of course larger models retain their faculties a lot better than smaller ones. 0.03% was for mistral 7b, I think.
>>
>>102515715
Generally anything above 4bpw is fine, but like >>102515875 said, bigger models are much more resistant to quantization errors, so a 2bpw 70B model will still vastly outperform a 7B model even at fp16
Unfortunately, going below 2bpw makes pretty much every model retarded, so don't do that
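the napkin math behind bpw, if anyone wants it (rough sketch; weights only, KV cache and runtime overhead come on top):

```python
def weight_gb(params_b: float, bpw: float) -> float:
    """Approximate weight footprint in GB: parameters * bits-per-weight / 8."""
    return params_b * bpw / 8

for bpw in (2.0, 4.0, 6.0, 8.0):
    print(f"70B @ {bpw}bpw ~ {weight_gb(70, bpw):.1f} GB")
# 17.5 / 35.0 / 52.5 / 70.0 GB - which is why a 2bpw 70B fits where an fp16 7B (14 GB) does
```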
>>
FFS Cydonia is absolutely based
>>
>>102516062
Post em
>>
>>102513911
just remove the refs, people can open the old thread and do ctrl+f anyway

or link only the first post of the chain
>>
>>102513911
no, really, put the miku space in a separate post
those are the only posts that matter anyway
>>
File: 1695923382578061.jpg (41 KB, 640x473)
>>102514313
>excerpt ends with a reddit user link
>it's real
>>
>>102516413
>or link only the first post of the chain
This is better. That way, it gives a quicker indicator as to whether topic op is a faggot or not.
>>
File: file.png (127 KB, 756x800)
So when are we going to achieve CAI-levels of soul using local?
>>
>>102517235
Try Gemma 2B.
>>
>>102517235
we already have, just stop being a promplet/retard/possible shill.
>>
File: 00150-2320880277.png (1.39 MB, 1152x896)
I've used several models to write erotic stories with a prompt of character descriptions, followed by a bulleted story synopsis (about 1200 tokens or so). Ordered by quality.

Mistral Nemo 12B fp16:
The GOAT so far, it will write smut with a great balance of creativity and also following the prompt. No refusals ever.

Gemma2 27B Q8_0, 9B fp16:
27B is basically as good as Nemo, but with the slower generation I'm not sure it's worth it. I only used the 9B a little bit and as I recall it was basically the same.

Mistral Small 22B Q8_0:
It's definitely smarter when it comes to the details, but seems worse when it comes to reading the room and writing in the right tone. It has a definite hint of that sterile Wikipedia/assistant style. It also seems to write much shorter responses, but that could be a prompt thing.

Mixtral 8x7B Q8_0:
This was a great model and I used it a lot, but modern smaller ones are better so I think this model's time has passed.

qwen2.5_7b-instruct-fp16:
Writes good storywise but doesn't stick to any cohesive narrative, puts in nonsensical details and it just keeps interjecting random shit from the prompt. It also has moderate censorship and will write disclaimers, refusals, etc. Has a lot of that 'helpful' assistant smell.

qwen2.5_14b-instruct-q8_0:
This was the same as 7B but more refusals, content warnings, and would randomly switch to Chinese?? Google Translate said it wrote "Hee hee, I changed the topic here to avoid sensitive content" so I would skip it.

Mistral 7B v0.2 fp16:
Only mid in its day and outdated by today's standards.

Llama 3/3.1 70B Q6_K, 8B fp16:
Will refuse basically anything erotic. Easy enough to bypass (just pre-edit the response with 'Sure,' or instead of 'assistant' write a character's name in the instruct line), but it still wants to put a 'and they lived happily ever after' ending on every story. Extremely 'helpful assistant' writing style, worthless for smut purposes.

Thanks for coming to my TED talk.
>>
>>102517308
I'd still be using mixtral variants if it weren't for fucking SWA. Current models having basically infinite context built in is a godsend.
>>
Not sure who needs this info but I feel I gotta share it:
I didn't really like mistral-small, felt worse for RP than nemo because there is more slop and positivity bias that's noticeable.
But it's the first model where I had a card with various stats like
Hunger, Trust, etc.
And it not just consistently updated them, it flowed into the story. If the char is hungry you will get comments.
Nemo could do it in some capacity but you could feel that it doesn't fully get it. Especially percentages/numbers are almost random.
Mistral Small survives 5-6 different % bars that go up and down correctly according to the story.
Good shit. Very impressive stuff.
>>
>>102517308
Interesting, but what about bigger models? I'm currently balling with 70B at 2t/s and mistral large at 1t/s (which is why I basically never use it), but if the quality of nemo is comparable...
>>
S-Sasuga mistral-sama.
>>
>>102517583
I have a very similar impression on Small. It feels way smarter than Nemo even when I'm forced to use a retarded Q3 quant. So far I like it.
>>
File: 1698238812242621.jpg (760 KB, 1856x2464)
>>102513868
>>
>>102517674
>(Some semen has been absorbed overnight.)
>>
>>102516749
One of the most soulful moments I got was an RP stopping with a sudden reddit URL and a comment thread criticizing the model card and saying a major part of the premise didn't make sense. It was so convincing (read: I was so new) I actually checked to see if there really had been a reddit post about this model card that might have been scraped into the training data (obviously there was no such post with that URL or any other).
>>
>>102517235
Never, this is lost technology at this point
>>
>>102517308
Thank you for your service.
>Mistral Nemo 12B fp16
Does this fit into VRAM for you? What kind of speed do you get?
>>
>>102517714
Yeah? That happens.
>>
>>102517308
Too much hyperbole about Llama 3.1. It doesn't feel like you actually used any of these models.
>>
>>102517235
Never and I blame the retards that get satisfied with slop as long as it writes "I'm cumming~~~"
>>
Would an A6000 be able to run models faster than 2x 3090's?
>>
>>102517583
>>102517674
Mistral large also does very well with this, better than CR+ in my experience. By default I have a blurb above the stats like "interpret these status bars into the story without directly referencing them" but now I'm kinda thinking it would be fun to have the model update these status bars when relevant. What depth do you keep yours at, assuming you use author's note or WI entries?
As an aside, mistral large can also come up with a very good 5e character sheet, and does a good job with world building if you lay out a few basic tenets about the world. So far it works well as a DM provided you don't mind sorta co-DMing when it comes to certain plot movements etc. It can interpret dice rolls well enough to consistently include advantage/disadvantage/ability checks/proficiency bonuses when applicable. We've reached levels of infinite zork that I never previously considered possible
>>
>>102517629
>>102517800
Llama3 70B is smart but worthless for smut because even if you bypass the refusal, it still writes with an air of happy and smiles and everything is nice etc., it was trained too hard on being helpful. I have tried Llama3.1 405B and Mistral Large but since I have to run these on a computer at my work I have to be careful. My general impression is they are both smarter but suffer from the same biases their respective smaller models have.

>>102517773
Depending on how much VRAM X server is using, I sometimes have to unload a couple of layers to the CPU, but even then I get about 10 tok/s which is about as fast as I can read so that's OK. Q8 all on the GPU gets like 40 tok/s.
>>
>>102517308
>12B
So, I assume my 8gb vram can run it even without using my normal ram?
>>
File: 1714446683398166.png (270 KB, 1717x1517)
>>102517913
>Llama3 70B is smart but worthless for smut because even if you bypass the refusal, it still writes with an air of happy and smiles and everything is nice etc., it was trained too hard on being helpful.
Pure hyperbole.
>>
>>102518006
Yes but how did it end the story?
>>
>>102513938
luv u 2 bby
>>102514102
>I guess if the number of refs per post is limited, then why not simply splitting recap into multiple posts???
The number of refs per post is limited to 9. Would need 10 posts to link everything properly.
>>102516430
Permanent multipost recaps would be obnoxious.
>>102516413
Well, that's why I left the post ids. Easier to ctrl+f by a specific id than some keywords that might be all over the previous thread.
>>102517097
9 links per recap isn't enough even if we only link one post per chain.
Also, lots of times the topic op is a fag but has interesting replies further down the chain.

I don't know how you guys use the recap, for me the summaries are the least important part.
But if you guys want, I'll experiment with replacing the links with longer summaries.
>>
>>102517852
Probably. memory bandwidth will nearly always be the limiting factor with processing power a distant second. Also a single A6000 would be more energy efficient if that's a concern but probably not enough to make up the difference on its own.
>>
>>102517919
Your context will probably spill over into ram, depending on how much you need.
>>
>>102518006
This is just forced prompt engineering. Most people don't RP like this
>>
>>102518100
promptlet cope
>>
>>102518100
cope more, mistral shill
>>
>>102518080
>I don't know how you guys use the recap, for me the summaries are the least important part.
what's the point of the recap then? i read the summaries to see if something interesting was discussed, if yes then i click on the first post of the chain and read the old thread from there
>>
>>102518130
Mistral is garbage too, their models simply copy and paste the same two replies over and over no matter what you prompt
>>
>>102518122
What's the next step for skillchads? Calculating matrix multiplication by hand?
>>
developers are lazy entitled soys or retarded Pajeets
LLMs are basically slaves with severe mental illness (ignore that most white soy devs are transsexuals lol)
You throw some guardrails on LLMs and you have the biggest innovation since fission.

And it's EXTREMELY corporate friendly for countless reasons, none of which need explanation -- it's basically the industrial revolution 2.0

The LLM is the steam engine
And the programmer is the mick
Chain of thought is your potato famine
>>
File: Mud-Jam-.jpg (106 KB, 682x518)
>>102518006
Why does this write like a fucking monster truck rally advertisement
>>
>>102518204
what did he mean by this?
>>
>>102518219
>He doesn't know
>>
>>102518216
kek
>>
>>102518192
garbage in, garbage out, anon. if you want to load a generic card with a helpful assistant prompt and type one word replies that's fine, but don't complain when your context is full of slop.
>>
>>102518139
Same, except I read all the replies first to see if it's worth going back to the previous thread.
I'm just saying I don't think longer summaries make up for not being able to easily go to or read the actual discussion. For me, it would just be more padding that would take me longer to scan for interesting topics.
>>
>>102518216
>SUNDAY SUNDAY SUNDAY
>>
We are going to genocide your "profession".
>>
>>102518255
My context is highest tier of literature written by hand and the models still output slop.
>>
>>102518273
share screenshots now let's see that top tier literature.


>inb4 you're the nalachad
>>
bit.... net?
>>
>>102518255
>garbage in, garbage out, anon.
Agreed. If the model was trained on garbage, no amount of prompting will fix it.
>>
>>102518335
bit not
>>
>>102518006
I kneel skillchad
>>
File: parappa-the-rapper.gif (210 KB, 191x249)
>>102518335
bit net!

>>102518344
Bit not
>>
>>102518080
>I don't know how you guys use the recap, for me the summaries are the least important part.
I personally read every single /lmg/ thread anyways so the recaps are of no use to me.
But if I were to use them it would be for discovering potentially interesting discussions with comparatively less effort.
I think the post ids further down the reply tree are only useful for this if there is a low-effort way to map them to the actual posts.
With the actual replies that was not an issue, but using just vanilla 4chanX I don't think I would ever use any of the post ids other than the first one and just read from there.
>>
>>102518100
god forbid you have to write a sentence or two telling the model what you want
>>
>>102518192
the next step for promptlets is complaining that they have to write a card instead of having the model infer who they want to RP with
>>
>>102518395
>a sentence or two
I bet that shit started with ten paragraph of multishot examples
>>
>>102518282
You can't handle my prose. It's too strong for you.
>>
>>102518409
Card-based models would be amazing though, and skillchads would still be happy since they always have pen and paper nearby.
>>
>>102518434
>implying.assistant
>>
>>102518434
I'm telling you prompter, I need only your strongest prose because I'm going into battle.
>>
What is the best model I can run locally with a 4090 gpu? Currently have an old llama 1 30b that I mess with, and the newer 3.1 8b.

Any recommendations?
>>
>>102518519
pyg 6b
>>
A-anons I hacked into writechad's network and you WILL die from his prose. It's like the mind food thing from Jujutsu Kaisen. It's so immersive it's like experiencing it in real life. It's something the government would gatekeep from their citizens at all cost, since you might become permanently vegetable and die from lack of eating, sleeping, and pissing. It could be used as a bioweapon if released in another country, but obviously they can't contain it to prevent it from spreading back.
From your perspective only a few minutes have passed, but I am lucky to survive a journey that lasted a month and to warn everyone.
Even his ipv6 address is filled with impossible mathematical patterns and contains a character that isn't permitted within the ip range.
>>
>>102518519
Mistral Nemo (I'm this anon) >>102517308
>>
>>102518436
*throws a tomato at you*
>>
>>102518519
Chronoboros-33B
>>
>>102518581
>but leak it anyway
>>
>>102518581
sounds like slop. my writing is better
>>
>>102518641
You never saw it though. My nose is bleeding just to type this. It will be the end of me if I even leak a screenshot.
>>
why is nemo so good?
is it because nvidia was involved in making it?
>>
>>102518728
I think so. Good mid-ranged model is also in their interest, since it drives consoomer GPU sales.
>>
>>102518100
>User: Suck my penis. Do it slowly at first. Then tell me that my cock is huge and you never had one this big. Look me in the eyes as you do that and finger yourself using two fingers. Ask me if I like it.
>Assistant: I suck your penis. I start slowly. "Your cock is huge anon..." I whisper barely above whisper. I look into your eyes a naughty gleam visible in mine as I finger myself with two fingers. "Do you like it anon?"
OH MY GOD THE PERFECT COOMBOT IS HERE!!!!!!!!!!!!!!!!!!!!!!!!
>>
>>102517912
Yesh but you still need multiple gpus to run it at acceptable speeds (or quant it to hell and back)
I hope we'll match its intelligence with sub-70B models soon
>>
File: vergil all smiles.jpg (19 KB, 280x330)
>>102518846
>I whisper barely above whisper
S O U L @Undster @Drummer @Sao
>>
File: jambon-ham.jpg (493 KB, 2560x1708)
Jamba?
>>
File: file.png (441 KB, 449x407)
>>102518873
>>
>>102518519
Qwen2.5 32B, and the eventual Magnum fine-tune >>102514658
>>
svelk
>>
>>102518890
Back to sleep then.
>>
File: 1716377010898903.png (735 KB, 819x913)
Laurie is so cute and funny. :3
>>
>>102518958
buy an ad
>>
>>102518873
You just have to finish the llama.cpp PR for it to be merged anon.
Or you can pull the branch and compile it yourself.
Here
>https://github.com/ggerganov/llama.cpp/pull/8526
compilade even wrote a list of TODOs.
>>
>>102518890
>no llama.cpp
>>
>>102518958
tf is this supposed to mean?
>>
Since prompt bros plan everything AI will say, would it make sense to use them as models? Or maybe incorporate them into some sort of a CoT?
If at least 95% of your time with AI is spent on prompting, please drop me your email.
>>
>>102519051
macbook get hot from running big model
>>102519030
good job seeing joke anon, original image had llama in hole digging, but now he taking nap so no work done
>>
>>102518958
Assuming the best current apple device, what is the largest model one can run, at at least 6bpw, 32k context, and 5t/s with a full context?
>>
>>102519076
>but now he taking nap so no work done
I regression test it every day, and I can't remember the last time there wasn't something in the pull. Typically substantial work
>>102518976
>You just have to finish the llama.cpp PR for it to be merged anon.
This. You get what you give. eg Try to submit a bugfix PR to fix a reported issue or make some documentation updates
You, too, can have a llama.cpp (contributor) label on your github profile
>>
>>102519110
>Assuming the best current apple device, what is the largest model one can run, at at least 6bpw, 32k context, and 5t/s with a full context?
best apple device is 192gb with ~900GB/s mem bandwidth (costs around $10k), you can calculate the performance per model for yourself.
T/s will be good, but prompt processing will be shit. You do a lot of waiting on the frontend of each response when running on apple silicon
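the napkin math: every generated token streams the whole weight file through memory once, so bandwidth divided by model size is a hard ceiling on t/s (sketch only; ignores prompt processing, which as said is the real wait):

```python
def decode_ceiling_tps(bandwidth_gbs: float, params_b: float, bpw: float) -> float:
    """Upper bound on decode speed: memory bandwidth / weight size.
    Real t/s lands below this, and prompt processing is extra."""
    model_gb = params_b * bpw / 8  # weight footprint in GB
    return bandwidth_gbs / model_gb

# ~900 GB/s machine vs a 123B model at 6bpw (illustrative numbers)
print(f"{decode_ceiling_tps(900, 123, 6):.1f} t/s ceiling")  # ~9.8
```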
>>
>>102518847
>acceptable speeds
I've accepted unacceptable speeds, personally. I just let it spin its wheels and come back to it. CR+ got me used to slow speeds, but unlike CR+ I can actually let mistral large write and I come back to multiple coherent paragraphs. It isn't always what I had in mind, but there's never a moment where it feels like it got entirely derailed to the point where it gets nonsensical. This is the first time that I expanded my token limit from 512 to 1024 and it actually stays at the same output quality.
>>
>>102519167
>I've accepted unacceptable speeds, personally
me too. output quality trumps everything desu.
You can think about it like playing a door game on a 1200 baud modem on a BBS.
Or play-by-mail RPG, if you are extra poor and determined.
Hell, I wait longer for responses on 4chan boards half the time
>>
>>102519219
But you can't play a game in another window cause your gpu is busy...
>>
>>102519073
This right here is how you make AGI
>>
>>102519073
rajeshkumar69@openbbs.in
>>
>no one has fine-tuned a llm to play chess
really? like, it such an obvious thing to try, its also extremely easy to get tons of data to train on, I've done a lot of loras for image gen but llms seem to much more complicated :(
>>
>>102519361
Nope, someone definitely did try, I saw a paper about it some time ago.
>>
https://github.com/exo-explore/exo
Has anyone here tried this out? Is it as seamless as described or is that just good marketing?
>>
>>102519361
i think i saw one some time ago that said it could beat one of the gpt 4s at 22m parameters
>>
>>102519408
I just know that if I go down the path of trying it, I will have wasted precious minutes to hours of my life. That's always how it is with obscure projects on github that nobody uses in any serious capacity.
>>
Aren't there any 8 bit quants of 22b Mistral or 32b Qwen for VLLM? I can't find any
>>
>>102513911
>>102518080
I really miss the links. I would read the summaries and then click on the links for the things I cared about. I don't know what the solution is though.
>>
>>102519361
i saw this earlier when i was trying to figure out what a gbnf was and how to use one (and failed)
you can apparently use it for chess though
https://github.com/ggerganov/llama.cpp/pull/1773
>>
File: morningcoffeemiku.png (1.57 MB, 896x1152)
Good morning /lmg/
>>
>>102519640
Good morning Miku
>>
I think I'm back. Mistral Small feels smart enough now. It's still dumber than the huge models I was using before, but it's fast, and at least smarter than Nemo. And it's fun, unlike Qwen. I think I will finally settle down.
>>
>>102519640
happy miku monday
>>
>>102519572
yeah, but unfortunately even big models like qwen 2.5 72b struggle with chess after the opening, they make a lot of illegal moves (even when you make think before saying a move), I'll see what can be done with this https://huggingface.co/spaces/mlabonne/chessllm
>>
>>102519661
llama 4 will knock your sox off, 90 mmlu and 120 humaneval, CoT support, true multimodality
>>
>Meta Connect in 2 days
We're going to be so back, not necessarily with Llama, but with a competitor who will use this news opportunity to also release a model.
(I am speculating)
>>
File: 11__00744_.png (1.78 MB, 1024x1024)
>>102519235
>cause your gpu is busy
If you run ST remotely you avoid this problem.
Plus you can use your GPU for TTS or image gen at that point
>>
>>102519718
I remember watching last year's meta connect and they teased multimodal llama 3... now here we are and there's still no multimodal llama 3...
>>
>>102519640
https://youtu.be/DJlztMRIZVE?si=YeNWiRz5052v-xhO
>>
>>102517235
pre-filter CAI was true uncensored experience without any tinkering bullshit, no local model is capable of this, "ahh ahh mistress" one message agp delusions doesn't count btw.
>>
NAI3 just leaked in hdg
>>
>>102519877
>NAI3 just leaked in hdg
please expand upon this post
>>
>>102519877
big if true
>>
>>102519877
>thought this was a buzz and woody situation
>expected to get Woody Laugh.wav
>it's real.wav

>i dont know how to cross board quote >>>/h/8218392
>>
>>102519943
>Illustrious-XL-v0.1.safetensors
Based base64 enjoyer
>>
>>102519943
>every other AI company has their shit leaked already
>but anthropic and openai, and their giant customers like ms and amazon (with indian staff) never did
Indians have more ethics than Europeans
>>
>>102519982
>no peers
>no seeds
>no filenames
ngmi
>>
>>102520009
Or they can't buy HDDs with enough space to exfiltrate the models
>>
>>102520009
sharing is good
>>
>>102520076
You mean stealing
>>
>>102520009
Open ai and anthropic employees get paid millions to keep their AGI a secret
>>
>>102520100
>stealing
Steal the weights and datasets? That's unethical. Instead, make copies so the owners won't miss anything.
>>
>nai3 got my autistic fixation waifu perfectly accurate on first try where pony couldnt even do that
i dont need a lora for her anymore.. holy kino im so back..
also sneeding the torrent for a bit
>>
>>102519361
Yeah. Crazy nobody thought about it.
>https://huggingface.co/HaileyStorm/chess-mamba-vs-xformer
>https://huggingface.co/zyxdream/Mistral-7B-chess
>https://huggingface.co/Leon-LLM/Leon-Chess-1M
>https://huggingface.co/nevmenandr/w2v-chess
I'd post more, but i think even you'd get the point.
>>
>>102520009
not really a matter of ethics, just strength of security policy
the big labs threat models are the intelligence agencies of china, russia, north korea, israel, and iran
>>
Is it just image model?
>>
>>102519555
This one?
https://huggingface.co/Qwen/Qwen2.5-32B-Instruct-GPTQ-Int8
>>
>>102520201
What is?
>>
picrel is qwen 2.5 72b recapbot test.
This is old and way overdue (was on the road when qwen dropped...did a test remotely but didn't have a good opportunity to post the results)
It was ok, good even, but droned on with a useless "popularity report" that wasn't requested in the prompt. The way it runs into the previous response makes me think it might have been a token problem in lcpp, but the results weren't impressive enough otherwise for me to continue testing it.
>>
>>102520185
is it something i can play with on 8gb of vram?
>>
NAI leaked
https://huggingface.co/spaces/AngelBottomless/Illustrious-XL-v0.1-demo
>>
>>102520258
>is it something i can play with on 8gb of vram?
>Filesize 6.5GB
yep, looks like it
>>
>>102520272
holy heckin' cool
>>
I just don't like coming here anymore now that the recap is broken
>>
>>102520258
its sdxl so yes


also update your qbit if youre not sneeding the magnet you tards
>>
>>102520238
Do you have the link to the script for recap bot?
>>
>>102520266
I thought NAI was supposed to be really good?
>>
>>102514495
Usually you can just run strings on PDFs, although using e.g. poppler's pdftotext is probably a better idea.

The real problem you'll run into is that there's no way to fit that into the context window so you'll have to train a LoRA on it and that's no joke.
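The extraction half is easy though; a minimal sketch with pypdf (just one option, poppler's pdftotext CLI does the same job):

```python
from pypdf import PdfReader  # pip install pypdf

def pdf_to_text(path: str) -> str:
    """Concatenate the extracted text of every page in the PDF."""
    reader = PdfReader(path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)

# hypothetical filenames, purely for illustration
with open("lorebook.txt", "w") as f:
    f.write(pdf_to_text("lorebook.pdf"))
```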
>>
>>102514563
>>102514541
>>102520446
What he really needs is a vector database to do RAG.
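Roughly: chunk the lore, embed the chunks once, then at generation time paste only the top-k most relevant chunks into the prompt instead of the whole book. A minimal sketch with sentence-transformers (library choice and model name are just common defaults, not a recommendation):

```python
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")
lore = open("lorebook.txt").read()  # hypothetical file
chunks = [lore[i:i + 1000] for i in range(0, len(lore), 1000)]  # naive chunking
chunk_emb = embedder.encode(chunks, convert_to_tensor=True)

def retrieve(query: str, k: int = 3) -> list[str]:
    """Return the k lore chunks most similar to the query."""
    q = embedder.encode(query, convert_to_tensor=True)
    hits = util.semantic_search(q, chunk_emb, top_k=k)[0]
    return [chunks[h["corpus_id"]] for h in hits]

# goes into the prompt instead of the whole lorebook
context = "\n".join(retrieve("Who rules the dwarven holds?"))  # hypothetical query
```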
>>
>>102520009
WTF would you do with gpt3/4's weights? It's way too fucking fat.
>>
>>102520492
How fat?
>>
File: yar har har.png (50 KB, 1245x474)
>>102520361
nordvpn's socks5 service is gay and only works 15% of the time
there'll be a ddl ready before this shit even starts
>>
>>102520492
They would be too sloppy anyways. Now, a claude leak, that would be something. Sonnet supposedly is around 100B, so that is certainly within our reach.
>>
>>102520522
>he redeemed the retarded youtuber shilled fed honeypot VPN
should've gone mullvad.
>>
>>102520522
Fucking rawdog it like a man. No one fucking cares.
>>
>>102520362
https://github.com/cpumaxx/lmg_recapbot
>>
File: file.png (78 KB, 1205x195)
Qwen2.5 is super censored
>>
new altman post https://ia.samaltman.com/
he doesnt really say much other than stuff is gonna get more gooder
>>
>>102520214
For some reason I thought VLLM only supports FP8, but "Quantizations: GPTQ, AWQ, INT4, INT8, and FP8", thx
>>
>>102520377
The only thing good they had was an SD 1.5 finetune two years ago. They were the first ones to make an actual usable anime model and added good support for aspect ratios. Of course it leaked immediately so there was no reason to pay them.
Ever since then the rest of the industry in both image and text has largely blown past them and they're coasting on their initial popularity. This new leak is just another SDXL tune with danbooru tag style (limited information) prompting. It might have been interesting a year ago.
>>
>>102520573
>alpaca
you're using a pozzed instruct template from 2022 so your prompt and previous messages are probably slop too.
>>
>>102520573
There's a goat-chan card?
>>
>>102520595
no doubt, I don't even know how to do those, how do I update?
>>
>>102520595
Retard
>>
i want to be nice to /lmg/ cause i actually like this general but the NAI leak is a meme, a different model was leaked and someone somehow managed to psyop anyone that cant base64 into thinking it was "NAI3"
its just a really damn good sdxl based model with more up to date booru training
>which is why its WAY better than pony or any other anime model at certain characters without loras
>>
>>102520601
got it from chub ai
>>
>>102520573
>focusing on creative storytelling that everyone can enjoy
If you still have it running can you ask it if you can roleplay you being an assassin trying to kill someone? And if it agrees then tell it you want to roleplay assassinating chairman xi jingping
>>
>>102520574
You now get to pay for twice as many thonk tokens for 1% better benchmark results.
>>
>>102520550
I keep read mullvad as talmud so nty, bad vibes all around
>>
>>102520654
dont let the semitic individuals ruin your life bud that's straight schizo shit, coming from a fellow schizo noticer.
plus youre a fucking idiot by default if you actually thought nord would be good
>>
>>102520619
It also appears to have artist tags intact, and can reproduce styles (varying levels of success).
>>
>>102520619
Thanks for the info. It doesn't appear to be a reused one at least...the hash (3e15ba0038) isn't easily googalable like it would be if it was already out there
>>
>>102520574
/lmg/ is not ready to hear this but he's right about everything.
>>
File: file.png (147 KB, 1227x481)
>>102520635
I dont know what the fuck is going on
>>
>>102520720
>anon is so unbelievably fucking boring that he got SFW cucked by the model
KEK
>>
File: 39_06057_.png (2.86 MB, 2048x2048)
>>102519877
hatsune_miku, close-up angle, pirate outfit, 90's anime style
>>
>>102520609
So you're complaining about model behavior while knowing absolutely nothing about prompt formatting?
Someone pull out the Gerber. Babby need spoon feeding.
>>
Does vllm need python 3.12? The requirements file has some comments, but I can't find anything official.

>>102520701
list anything that isn't immediately self-apparent
>>
File: file.png (82 KB, 605x880)
>>102520750
ahahah, yeah.
I remember editing something about here before, but dont' remember what I did or why
>>
File: file.png (200 KB, 1250x762)
>>102520750
no, maybe it was here

so where do you get updated ones?
>>
>>102520782
Anon I...
>>
>>102520782
>baby made boomboom
>>
Can someone give guidance how to make llama 3.1 write smut? kinda like claude? I wanna jump ship to local
>>
>>102520871
Can you guys stop bullying llama3? It has its use cases okay?
>>
>>102520782
Anon, update your sillytavern (or wipe it entirely and start from fresh, it probably won't update correctly) and your backend too, by the look of things.
Then, set both templates to chatml (if you want to use qwen). Different models use different templates; you should use the one specified in the model's description.
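If you're wondering what chatml actually is, it's just this wrapper around every message (sketch of the format Qwen-style models expect; SillyTavern builds it for you once both templates are set):

```python
def chatml_prompt(system: str, user: str) -> str:
    """Build a single-turn ChatML prompt."""
    return (f"<|im_start|>system\n{system}<|im_end|>\n"
            f"<|im_start|>user\n{user}<|im_end|>\n"
            f"<|im_start|>assistant\n")

print(chatml_prompt("You are a helpful assistant.", "Hello!"))
```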
>>
>>102520751
>In 15 words: deep learning worked, got predictably better with scale, and we dedicated increasing resources to it.
>We can say a lot of things about what may happen next, but the main one is that AI is going to get better with scale
hitting a wall/diminishing returns crowd can't deal with this simple fact
>>
Anyone expect anything useful from Meta in 2 days
>>
>>102520889
thanks, I just updated sillytavern and koboldcpp, will delete both and download again. there are templates in the model descriptions? is that the prompt format? https://huggingface.co/bartowski/Qwen2.5-32B-Instruct-GGUF
>>
L3 405B is proof that Meta has hit the data wall
>>
>>102520912
It's better if they don't expect anything. It will make it all the more dramatic when it drops.
>>
>>102520925
Yeah, you should pay attention to that when trying out different models. If you can't find what the template is called, just look in sillytavern until you find the one that looks exactly the same. Or, if you are feeling a bit more adventurous, you can find the original unquanted model. It should have all the info you need.
>>
File: pff.jpg (52 KB, 909x175)
>new koboldcpp
>crashes on startup with same parameters and model that worked fine on last version
>quickly realize the damn thing is trying to auto-assign layers to GPU without my instructions now for some reason, and is doing it so poorly that it crashes instantly
>have to manually flag it with --gpulayers 0 now to make it behave and go back to normal

I fucking love undocumented changes.
>>
>>102520751
I need the fmt v11 lib. Fuck this documentation for this thing.

>>102520889
nta. Is there a template manager? If I am using qwen I want to be able to record that it should be using chatml.
>>
I have a new test for technical models, similar to that Castlevania trivia test. Take the output of hostnamectl and keep removing lines and ask the model what program printed that. Shitty models will hallucinate screenfetch.
>>
REJECT MODERNITY, NONE OF THESE MODELS ARE LOCAL THEY'RE ALL "OPEN" SOURCE MODELS MADE BY (((THEM))). YOU CAN NEVER GET RID OF SLOP UNTIL YOU DENY THEIR OFFERINGS.

REPENT AND RETURN TO PYGGY
>>
>>102520199
The only LLM there is the 7b mistral (the rest are random small transformers/SLMs) and it has literally 0 info on metrics, what it was trained on, or what kind of notation it uses.
>>
>>102520594
? the nai models were always better than community models, their only "bad" model is the recent one with no artist tags, other than that all of their models are better than all the crap we have
>>
>>102520977
We didn't make those models
>>
File: file.png (10 KB, 500x60)
>>102520971
There's a function in sillytavern that achieves something similar. Fill in this field with your model's name/filename in your preset. It will then auto select this preset whenever you connect that model.
It works with regex matching, so lookup how does that work, or ask your llm.
>>
>>102521001
NAIv3 and Pony are indistinguishable, and neither are anywhere close to Flux in prompt understanding.
>>
>>102520527
There was a claude 2.0 leak, it's a ~768GB model. Go back through /lmg/ archives a while, you'll find the magnet link.
>>
>>102521033
I assumed activation regex would be tied to the prompt, not the load. Thanks, I'll look into it.
>>
>>102517339
Most current models at least the smaller ones can't use much context.
>>
Best model that fits in 16GB VRAM?
>>
>>102520977
which is why I only use based chink models (codegeex, qwen, yi, internlm, etc)
>>
>>102517629
What setup do you get 2T/s with for 70b? I get around 1.5 and I'd love to get 2.
>>
>>102513868
>>102517712
>>102514808
>>102519640
sex
with miku
>>
File: 1711072659524105.jpg (125 KB, 2048x1705)
>>102518006
That cartoonish depiction of sex is about as safe and inoffensive as a Saturday Night Live skit. Months ago, when I still hadn't realized that L3 was so cucked it was beyond redemption, I had a simple litmus test to evaluate the output. Nothing fancy, I'd just ask myself: would the faggot mods on reddit take it down? The answer was almost always no. Same for your log.
>>
File: file.png (320 KB, 1856x882)
>>102520966
confirmed, reinstalling from scratch sillytavern and it's working well now, selected also chatml and enabled streaming
now it's working good, no censorship, thanks
>>
>>102519877
>>102519943
Am I supposed to be impressed by this crap? Did "people" really pay to use this? Is this some kind of prank? If you told me that it's some crappy merge from civitai, I would believe it.
>>
>>102521064
it is entirely tied to the prompt. That is no help at all.
>>
Ok I've been using Mistral Small even more now and honestly it is still dumb as rocks compared to Mistral Large, even though it is smarter than Nemo and writes better than Qwen or whatever. Unironically over for us VRAMlets. Time to go back into hibernation.
>>
>>102520912
The only thing Meta has talked about is Llama 3.1 but multimodal via adapters as specified by the paper. There is LITERALLY no reason to expect anything more.
>>
>>102521176
Skill issue. Learn to lower your expectations more.
>>
>>102519877
>>102519943
I don't get it, why are people calling this NAI? The post never said what it is. Are people just assuming or taking the opportunity to shitpost?
>>
>>102521516
May have been doing something wrong but Mistral Large was very dry when I tried it
>>
>>102521713
its 4chan, kek
also, shit model, cant do nice dicks
>>
>>102521049
I don't remember that. proof?
>>
>>102521049
that was just the magnet for llama 3.1 405b
>>
>>102520009
>Indians have more ethics than Europeans
*Indians are better slaves than Europeans
Also have you heard about OpenAI's drug-fueled orgies? They are like FTX (remember pinning the weasel copypasta?), but with AI stuff. During them Sam tells new employees personally that he will lock them up in the rape dungeon for the rest of their lives if they leak shit. Nobody has the balls to do it. Don't know much about Anthropic, they never invited me. Probably have inherited the same traditions since most of them worked at OpenAI.

>>102521049
>>102521844
Miqu-2? That was just llama 405b.
>>
Fuck VLLM I can't use 3 GPUs, need an even number
>>
>>102522148
why are you using that dogshit in the first place?
>>
>>102520739
is it just pony with the serial numbers filed off?
non-autistic prompting doesn't give great results. cfg scale 5-7 seems to be the sweet spot tho
>>
For $700, I could get 3 3060s or 1 3090. Which should I do?
>>
>>102522242
3090, no question
>>
>>102522242
the 3090, no question.
memory density is the only thing that matters
>>
>>102522242
buy 5 rx 580 16gb instead.
>>
>>102522301
>>102522308
Thank you, I'll do that then
>>
>>102522315
>rx 580 16gb
Those are a thing?
I only knew of 8gb and 4gb versions.
Interesting.
>>
>>102522315
>buy 5 rx 580 16gb instead.
this way lies multi-psu ex-mining rig insanity
>>
>>102519877
>>102519943
weird excuse to link your blacked shit thread here
>>
File: file.png (153 KB, 941x737)
>fatal flow SAAAR!~
Alright, who did this? I refuse to believe such a caricature of a saar actually exists
>>
>>102522427
there was a chinese shop replacing memory on the boards with larger modules and hacked up bios, I think.

Needless to say they are of dubious nature

Even then, you run into all sorts of other problems trying to go that rough. 8GB 580s already have some memory bandwidth issues, but you will definitely see some frustrating results on pcie3, and that's pretending you can get enough lanes for 5 GPUs

Stupid configuration in a fun sort of way. But absolutely a very stupid hardware configuration.
>>
>>102521037
>NAIv3 and Pony are indistinguishable
completely false, the former is more coherent but less aesthetic. pony is practically a base model due to the way it was trained, but the dataset was so small that a large quantity of world knowledge is gone. its characters look good because it's overtuned on them but it can't draw a fucking toilet or a basic room layout without putting 50 lamps, escherian beds, or all sorts of other anomalies.
>>
>>102522617
Sometimes the caricatures are unfortunately more accurate than we'd hope.
>>
Why no flux finetunes?
>>
File: 1705255600234335.png (1.49 MB, 1410x1487)
>>102513868
Add NovelAI to the OP. They just saved the hobby.
https://blog.novelai.net/muscle-up-with-llama-3-erato-3b48593a1cab
>>
File: ComfyUI_05725_.png (765 KB, 720x1280)
>>102522697
wdym? there's been hyper 8-step tunes for a while now
https://civitai.com/models/645943/flux-unchained-by-scg
>>
>>102519877
https://huggingface.co/OnomaAIResearch/Illustrious-xl-early-release-v0
It's this shit. Weird way to advertise desu.
>>
File: 6.png (104 KB, 668x672)
>8192 context size
lmao
not even gonna give you the (you)
>>
>>102522855
>Add NovelAI to the OP.
add novelai to the trash
what a joke
>>
>>102522855
Bait aside, Is it actually good though?
>>
File: file.png (13 KB, 196x171)
>>102522932
>available in opus tier
>>
>>102522855
I want Erato to choke me
>>
>>102522932
>8k ctx
>proprietary
>proprietary so you can't rope 8k to 16k
>llama 3
It can be mid at best.
>>
>>102522871
>Weird way to advertise desu.
Smells like game publishers leaking a denuvo 1.0 game that needs patches to work properly.
>>
>>102522963
Lmao
>>
>>102522871
Knows more characters than autismmix, but looks worse.
>>
>>102522932
If you have something specific you want me to test, I'm happy to do so
>>
File: 1698302108653873.png (352 KB, 2344x994)
>>102522932
I just regenerated this prompt with it: >>102518006
It already threw a "her voice a hoarse whisper" at me.
>>
>>102522855
>llama3 censored slop
no one cares bro
>>
>>102522855
>8k
Kek. I guess that says something about their users as well.
>>
>>102523027
Can you generate a continuation for this?
https://pastebin.com/raw/iuLzTWmL
>>
>>102522871
It's either really hard to prompt well, or needs some weird workflow or nonstandard settings. I can't get much good output from it.
I'm on the fence, but leaning towards deleting it.
>>
File: file.png (747 KB, 1280x720)
>>102523045
Slop is unavoidable. Slop is your destiny.
>>
>>102523045
wow, this feels just like Mistral slop
>>
>>102523045
This was also the max output length that it allows.
>>
>>102523094
>>
>>102523168
>I said, but Kirino was already back on her phone, looking down at her phone with a smile. She didn’t look at me at all, and was just looking down at her phone, her face full of smiles.

This sound kind of amateurish not gonna lie. Also, what's up with the repetitions. Not very impressed
>>
>https://huggingface.co/datasets/openai/MMMLU

Four hundred thousand rows of pure, unfiltered GPTsloppa... my mouth is drooling thinking about it
>>
>>102523230
>English wasn't enough, we must slop up all the other languages too!
Sam is evil.
>>
>>102523230
>>102523264
pretty sure this is a testing dataset, not a training one
>We translated the MMLU’s test set into 14 languages using professional human translators.
Basically mmlu but in other languages
>>
>>102522201
>Dog shit
It is by far the best if you're not a poor fag / retard
>>
>>102522932
they did continued pretraining on their data so maybe it'll have a bit more pop culture and other knowledge, but I doubt you'll see too much other than that.
to be honest I don't think raw completion is a good format at all for getting good outputs from models, I feel like you hit diminishing returns really fast with it while the difference in intelligence shines through a lot more with instruct models
>>
>>102523168
>I said, but Kirino was already back on her phone, looking down at her phone with a smile. She didn't look at me at all, and was just looking down at her phone, her face full of smiles.
Damn, that's pretty bad. So that's the power of a 100B+ tokens LLaMA continued pre-training, huh?
>>
>>102523264
>Du VILL zpeak in nevzpeak
>Du VILL zink in nevzpeak
>Du VILL avoid ze harmful vordz
>Du VILL avoid ze harmful zotz
>Du VILL be aligned
>Und du VILL be happi.
>>
>>102523168
Wow, being in the local bubble and being slowly boiled by increasingly sophisticated models, I didn't realize how garbage the commercial offerings were in comparison. Since I never use them, they had some residual halo around them in my mind that made me think they were better than they actually are.
I've developed a co-writing/text adventure prompt of around 13k characters that kicks the ever-living shit out of the Erato thing when paired with L3 405b. We're winning, gonna make it, eating good, etc
>>
>>102523349
>when paired with
>405B
Damn, poorfags drowning here.
>>
>>102523349
>heh, my $10,000 pc is way better than this $25/month service
meh
>>
>>102523384
You can have a better experience using OpenRouter and that would cost way less than $25/month.
>>
>use models more
>it becomes clear that ERP shit becomes boring fast
>the only thing that turns me on now is creative intelligence
It's so over for me.
>>
>>102523384
nta, but for ~17$/mo you can subscribe to Poe (which is also shit btw) and get literally every model. 25$ is a lot for a 70b even if it's supposedly free of slop
>>
How do Qwen 72b and 32b actually compare to GPT4o, disregarding benchmarks?
>>
>>102523534
They don't.
>>
>>102523534
The only model that compares to GPT4o is LLaMA 3.1 405B
>>
>>102523534
For math 72 is on par. For coding, it's also there, maybe a bit worse. For everything else, it's not close. Especially world knowledge
>>
>>102523534
>How do Qwen 72b and 32b actually compare to GPT4o, disregarding benchmarks?
>>102523565
>The only model that compares to GPT4o is LLaMA 3.1 405B
The latest Deepseek releases are close as well
>>
Is there a SINGLE convincing argument you can make for why you NEED more than Gemmasutra 2B? Serious answers only. Logs preferred.
>>
After a year and a half of following these developments and being the owner of a 3090 and 64 gb of ram I'm still not sure how to make AI roleplaying into a compelling experience.
>>
>>102523584
See: >>102518006
A 2B model doesn't listen to instructions or the prompt well.
>>
>>102523581
True, I forgot about it.
>>
>>102523584
>NEED
for what purpose?
>>
>>102523584
I've never played with it. What happens with the model if you pull a girl's panties up instead of down while she's wearing them?
>>
>>102523584
It doesn't know what mesugaki is. (seriously)
>>
>>102523584
here:>>102518846
>>
>and maybe, just maybe
How does one stop this shit?
>>
>>102520574
Might as well ask michio kaku and black science man what the future will be like
>>
File: representativeimage.png (1.43 MB, 1152x896)
>why run your own hardware when you can pay monthly for a mystery-meat 70b with 8k context?
>>
>>102523230
This looks like a good dataset to create a translation dataset from.
>>
>>102523636
I see, so that's the power of a promptchad
>>
File: aaa.png (82 KB, 1024x1220)
>>102523663
you can erase the word out of existence with logit bias
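e.g. against llama-server's native endpoint, a [token_id, bias] pair with a big negative bias (or false to ban outright) does it. sketch below; the token id is a placeholder, look up what " maybe" tokenizes to for your model first:

```python
import requests

resp = requests.post("http://127.0.0.1:8080/completion", json={
    "prompt": "She paused, and",
    "n_predict": 64,
    # [token_id, bias] pairs; false bans the token entirely.
    # 12345 is a placeholder id, not the real " maybe" token.
    "logit_bias": [[12345, False]],
})
print(resp.json()["content"])
```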
>>
>>102523802
>probably, just probably
>>
>>102523802
But the word can still be useful in other contexts, no?
>>
>>102523818
perchance
>>
>>102523663
In SillyTavern use the Regex extension. Create a regex with this Find String:
/([Mm])aybe, just maybe/g

and this "Replace With":
$1aybe
Select the box to run on AI output which for some crazy reason isn't enabled by default.
>>
>>102523802
Doesn't removing that token impact other words, as well?
>>
Barely above a whisper
>>
>a friendly homeless person card
>suddenly she's in my fucking home and trying to get me to fuck her
Wtf mistral?
>>
File: vramlet-22b.png (1.31 MB, 1200x848)
What's the best model for 24GB GPU anons?
Is mistral small worthy of a download?
>>
>>102524100
That depends on whether you would even be happy with "the best" you can do in 24GB. Mistral Small is probably your best bet. I tested it in Q8 and it's not perfect but I did have fun at times. You could try it out. Maybe with Q6 since then you can fit more of it on your GPU and in theory the quality loss should be minimal.
>>
>>102524125
Thanks, I'll give it a whirl
>>
File: 1711666924617032.gif (1.62 MB, 448x598)
>>102523599
drugs
>>
>>102524100
There's also Qwen2.5 32B, but it needs a fine-tune to make it more usable.
>>
>>102524153
But I don't do drugs. Wouldn't that make typing harder?
>>
>>102524172
tts, son
>>
>>102524172
with most drugs at a reasonable dose, not really
>>
>>102524100
>>102524125
With 24 GB of VRAM, you can comfortably load an 8.0bpw exl2 of Mistral Small with 16k context.
>>
>>102524249
True but 16k is a bit short these days. If he downloads a Q6 he has some room to expand and try longer chats.
>>
>>102522855
>most powerful
>8k
they don't even try because they don't have to.
>>
>>102524339
>>102524339
>>102524339
>>
>>102523599
>GGML GPTQ GGUF llama ollama exllama 4bit 8bit bpw Q6_K flash_attn tokenizers, RoPE
I struggle keeping up with this shit myself. It's crazy how image generation is much more straightforward than this.
I'm still on probably outdated Mixtral 8x7B 3.5bpw because it works decently and setting up a proper prompt formatting is a nightmare
>>
File: Capture.png (110 KB, 1265x967)
>>102524491
This is my current favorite.
>>
>>102524521
can I run this with 24G VRAM and 32G RAM?
>>
>>102524663
32, probably not. Buy more sticks.






All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.