/g/ - Technology

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101628398 & >>101619436

►News
>(07/27) Llama 3.1 rope scaling merged: https://github.com/ggerganov/llama.cpp/pull/8676
>(07/26) Cyberagent releases Japanese fine-tune model: https://hf.co/cyberagent/Llama-3.1-70B-Japanese-Instruct-2407
>(07/25) BAAI & TeleAI release 1T parameter model: https://hf.co/CofeAI/Tele-FLM-1T
>(07/24) Mistral Large 2 123B released: https://hf.co/mistralai/Mistral-Large-Instruct-2407
>(07/23) Llama 3.1 officially released: https://ai.meta.com/blog/meta-llama-3-1/
>(07/22) llamanon leaks 405B base model: https://files.catbox.moe/d88djr.torrent >>101516633

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: 1692844050068.gif (1.09 MB, 710x500)
►Recent Highlights from the Previous Thread: >>101628398

--Paper: VALL-E 2 paper criticized for lack of advancements: >>101628420 >>101630671
--Paper: Mixture of Nested Experts and Meta SAM2 for efficient image processing: >>101632425 >>101632458
--Paper: C3A: a new fine-tuning method using circular convolution: >>101632131 >>101632167
--Llamafile has better CPU inference performance on some CPUs due to ikawrakow's optimizations: >>101633772 >>101633958 >>101634038 >>101634367
--Llama.cpp performance discussion, including context size, attention mechanism, and GPU offloading: >>101628597 >>101628675 >>101632864 >>101633101 >>101633596 >>101633704 >>101634712 >>101634751 >>101634842 >>101634908 >>101635029 >>101634347 >>101628673
--Gguf vs exl2 formats and their differences: >>101630375 >>101630457 >>101630483 >>101630485
--Character card formatting suggestions for vramlet use: >>101633734 >>101634595 >>101634610 >>101634647 >>101634700 >>101634670 >>101634704
--CR+ excels at RPing and writing style, but may struggle with complex scenarios: >>101628972 >>101629174 >>101629086 >>101629222 >>101629261 >>101629386
--Anon troubleshoots Mistral Nemo issues with sampler preset and token output limit: >>101630731 >>101630824 >>101631077 >>101631448 >>101631972 >>101631099 >>101631367
--Anon releases Command R/R+ basic presets v1.3 for SillyTavern: >>101634180
--Anons discuss the golden age of open source and its future: >>101633831 >>101634061 >>101634421 >>101634435 >>101634461 >>101634529 >>101634686 >>101635282 >>101635358 >>101634476
--WInfo-before and WInfo-after still have use cases in 2024: >>101630520 >>101633814 >>101633901
--Anons discuss non-roleplay uses of LLMs, including translation, game development, and custom assistants: >>101629172 >>101629273 >>101629428 >>101629458 >>101629508 >>101629830
--Miku (free space): >>101628819 >>101629323 >>101630431 >>101630651 >>101630714 >>101636814

►Recent Highlight Posts from the Previous Thread: >>101628405
>>
>virtamate
>https://hub.virtamate.com/resources/categories/looks.7/
God. They look like fucking ghouls. Imagine the people who are honestly making and playing with this shit.
>>
cuda dev, oh cuda dev. why does quantized KV cache not work with RPC?

just insta-fails on this assert:
https://github.com/ggerganov/llama.cpp/blob/140074bb8647df41840d6f32f4409fa8959bcf9f/ggml/src/ggml-rpc.cpp#L390
>TODO: this check is due to MATRIX_ROW_PADDING in CUDA and should be generalized
>>
teto's tata's...
>>
COHERE, RELEASE A BANGER MODEL IN 30 TO 70B RANGE AND MY LIFE IS YOURS
>>
>>101637073
This week.
>>
File: latest-2852086992.jpg (364 KB, 1920x1200)
>>101637073
t.
>>
>>101636935
Reminds me of Illusion's Honey Select 2 - uncanny detailed 3D models with shitty animations and physics.
They're releasing something called サマバケ!すくらんぶる (Summer Vacation! Scramble) soon - supposed to be like AA2, which strategy-wise was actually pretty good. I might, for once, actually buy it if it doesn't suck.
>>
>>101637007
I'm not familiar with the RPC backend but this check is much stricter than necessary.
Only the last row needs to be padded to a multiple of 512 (to avoid out-of-bounds memory accesses).
For all other rows no padding is needed because the activations are zero-padded to a multiple of 512 so the resulting vector dot products for the padding are equal to zero (unless there are NaNs or infs in the KV cache).

If you don't use CUDA or if you're using --flash-attention I think it would be safe to remove the check.
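A toy illustration of the padding rule in question (made-up Python names; the 512 is the MATRIX_ROW_PADDING from the TODO, not the actual code):

```python
# Toy illustration of the padding rule discussed above; names are made up,
# 512 is the MATRIX_ROW_PADDING value referenced in the TODO.
MATRIX_ROW_PADDING = 512

def padded_row_size(n: int) -> int:
    # Round up to the next multiple of MATRIX_ROW_PADDING.
    return (n + MATRIX_ROW_PADDING - 1) // MATRIX_ROW_PADDING * MATRIX_ROW_PADDING

assert padded_row_size(1000) == 1024
assert padded_row_size(512) == 512
# Per the post: only the last row of a tensor actually needs this padding;
# earlier rows are safe because the activations are zero-padded, so the
# extra dot-product terms are zero (barring NaNs/infs in the KV cache).
```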
>>
File: 1717508095907246.jpg (136 KB, 1200x548)
>>101634712
>>101634751

>the distribution of experts in Mixtral and Qwen2-57B-A14 is very imbalanced; thus, it would be beneficial to store only the most frequently used experts on the GPU

this was discussed basically the moment mixtral 8x7 dropped back in the day

isn't the problem with this, and the reason it wasn't implemented, that for each token you end up using all experts anyway, since MoE models pick X experts per layer (or something similar) rather than per token, meaning that you will be reading the entire model per token anyway, just not all at the same time?
>>
>>101637073
if they drop something I feel like it's more likely to be too big for local
>>
File: GQoY1pNX0AA8nrN.jpg (426 KB, 1200x1200)
Tuesday Theme
https://www.youtube.com/watch?v=sqK-jh4TDXo
>>
>>101637264
I like this Teto
>>
I had a doctor tell me yesterday that she was using an AI tool to record and summarize the conversation we were having. I'm assuming she probably wasn't running it locally on her phone, right? Wouldn't that mean that she's sending patients' information to some server somewhere that may or may not be secure?
Do hospitals even run models themselves or are they all using chat gpt shit?
>>
CPU maxxers, have any of you tried running that insanely huge MoE google released, I think a year ago or so?
>>
>>101637263
yeah, it'll be like command r large 150b or some shit
>>
Is magnum 32b significantly better than mini-magnum?
>>
>>101637309
>Wouldn't that mean that she's sending patients' information to some server somewhere that may or may not be secure?
You really think that hospitals were secure before the AI hype? LOL
>>
>>101637309
>may or may not be secure
Security is not that binary
You would hope healthcare staff are aware of obligations re patient data and it's going through some official system, likely eventually to enterprise chatgpt ("pinky promise we won't read your data"), not just a doc trying to save a few minutes with the mobile app.
>>
>>101637309
>Do hospitals even run models themselves or are they all using chat gpt shit?
That's super-creepy. I imagine it's an Azure or AWS offering. They both sell transcription services.
>>
Anyone know of a project that can monitor an openai compatible api?

similar to what vLLM has with prometheus. To monitor throughput, requests etc.
>>
File: Capture.jpg (177 KB, 1005x969)
>>101636935
There is no quality control so most of it is weg nightmarefuel. But when you look hard you can find some really good stuff. Cuddlemocap is good for scenes. As for looks the best looks come from people who just rip models out of real games. And a few people who really know what they are doing. Pic related is my waifu that makes me coom buckets.
>>
>>101637457
Prometheus isn't vLLM specific, you can configure it for anything.
>>
>>101637496
No offense but that still looks uncanny and just not very good, even if it's better than the average model on there.
>>
>>101637198
My expectation is that an optimally trained MoE model would utilize all experts evenly so there would be no benefit to shuffling around which experts get offloaded.
And even if there is an imbalance for specific models I'm not sure that imbalance would be consistent for different inputs.

>isn't the problem with this, and the reason it wasn't implemented, that for each token you end up using all experts anyway, since MoE models pick X experts per layer (or something similar) rather than per token, meaning that you will be reading the entire model per token anyway, just not all at the same time?
For prompt processing it's basically guaranteed that you will have to evaluate all experts anyways.
For token generation you would potentially be able to evaluate the same number of experts in total but a larger fraction of experts on the GPU, but only if the experts are utilized unevenly.
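As a toy version of what's being proposed (all numbers and names invented): measure routing frequency over some traffic, then pin the hottest experts on the GPU:

```python
# Toy sketch of the idea being debated: count how often each expert fires
# and pin the most-used ones on the GPU. Everything here is invented.
from collections import Counter
import random

N_EXPERTS, TOP_K, N_TOKENS, VRAM_BUDGET = 8, 2, 10_000, 3

counts = Counter()
for _ in range(N_TOKENS):
    # Stand-in for the router; a real model makes data-dependent choices,
    # and (per the post) an optimally trained MoE would make this uniform.
    counts.update(random.sample(range(N_EXPERTS), TOP_K))

hot_experts = [e for e, _ in counts.most_common(VRAM_BUDGET)]
hit_rate = sum(counts[e] for e in hot_experts) / sum(counts.values())
print(f"pin experts {hot_experts} on GPU, expected hit rate {hit_rate:.0%}")
```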
>>
Can someone explain how exactly speculative token decoding works? Wouldn't the big model have to verify that the token is "correct" anyway, thereby doing the same computation that it would've done otherwise?
>>
>>101637309
In the EU at least I think sending patients' medical data to OpenAI would be straight up illegal.
>>
>>101637540
yeah but vllm has an integrated endpoint natively. Does llama.cpp or tabby offer something similar?

If they don't, I guess the best way would be to put something acting as a proxy in front of the api endpoint to measure the statistics.
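A bare-bones sketch of that proxy idea (prometheus_client is a real package; the ports and upstream URL here are placeholders):

```python
# Minimal counting proxy in front of an OpenAI-compatible endpoint.
from http.server import BaseHTTPRequestHandler, HTTPServer
import urllib.request

from prometheus_client import Counter, start_http_server

UPSTREAM = "http://localhost:8080"  # e.g. a llama.cpp / tabby server
REQUESTS = Counter("llm_requests_total", "completion requests proxied")

class Proxy(BaseHTTPRequestHandler):
    def do_POST(self):
        REQUESTS.inc()
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        req = urllib.request.Request(UPSTREAM + self.path, data=body,
                                     headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            data = resp.read()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(data)

start_http_server(9100)  # Prometheus scrapes /metrics here
HTTPServer(("", 8000), Proxy).serve_forever()
```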
>>
>>101637622
What about Azure OpenAI endpoints hosted in Europe itself?
>>
has anybody else noticed how unrealistic the rape pov cards are compared to real life? IRL most girls stop moving and just freeze after like 5 minutes, at least in my experience, but all the cards I try here always do something like picrel which makes me laugh so hard i come instantly.
I don't get ERP, why not just have real sex.
>>
>>101637597
>For token generation you would potentially be able to evaluate the same number of experts in total but a larger fraction of experts on the GPU, but only if the experts are utilized unevenly.
given that speed-increase-per-%-of-model-offloaded-to-gpu curve, wouldn't the same apply here? You'd need basically 90%+ of the tokens to be generated by the very, very few experts that are offloaded to the gpu for this speedup to be possible, meaning that unless we change the arch of the models by a large amount into something completely new, this is a nothingburger
>>
>>101637653
This looks like a model issue
>>
>>101637626
Oh in that case I have no idea
>>
File: 1718654038023854.gif (1.28 MB, 618x396)
>>101637653
>at least in my experience
>>
>>101637618
You have to do the same number of computations but you can do them with a higher arithmetic intensity.
Meaning the amount of computations that you can do per data loaded is higher.
And because token generation is I/O bound that translates to higher performance as long as your predictions are correct.

Another way to think about it is that the total time needed to evaluate n tokens scales less than linearly.
I think evaluating two tokens takes ~10% longer, evaluating 64 tokens takes ~2x longer (with llama.cpp).

>>101637655
>given that speed-increase-per-%-of-model-offloaded-to-gpu curve, wouldn't the same apply here? You'd need basically 90%+ of the tokens to be generated by the very, very few experts that are offloaded to the gpu for this speedup to be possible
Yes.
>>
File: 1720398180495932.jpg (329 KB, 1170x1949)
Good morning jeets, your toy industry is dying.
>>
>>101637711
>your toy industry
Who cares? I have my local models, and they'll always be there even if OpenAI and Anthropic die.
>>
File: 1707441294271903.png (22 KB, 449x470)
>>101637653
>at least in my experience
>>
>>101637618
The key to understanding this is that token generation currently is not using our full GPU. That means we can do two (or more) token generation jobs/requests at a time. Therefore, if we have access to a likely guess at what the next token is (thanks to a small model or something else), then we can verify that the guessed token is correct at the same time that we generate the token after that. If the verification matches, then we can keep both tokens. If there isn't a match, then we can keep the true token, while throwing out the "after that" token.
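A rough sketch of that loop (greedy only; `draft` and `target` are stand-in callables, not any real library's API, and a real implementation verifies all guesses in a single batched forward pass):

```python
# Greedy speculative decoding sketch. draft/target are stand-in functions
# mapping a token list to the next token.
def speculative_step(target, draft, tokens, n_draft=4):
    # 1. Cheaply guess a few future tokens with the small model.
    guesses = []
    for _ in range(n_draft):
        guesses.append(draft(tokens + guesses))
    # 2. Verify with the big model (a real impl does this in one batch).
    accepted = []
    for g in guesses:
        true_tok = target(tokens + accepted)
        if g != true_tok:
            accepted.append(true_tok)  # keep the true token...
            break                      # ...throw out the rest
        accepted.append(g)             # guess confirmed for free
    else:
        accepted.append(target(tokens + accepted))  # all matched: bonus token
    return tokens + accepted
```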
>>
>>101637618
The "shortcut" is the bigger model can evaluate several cheaply-generated candidate tokens in one forward pass. https://huggingface.co/blog/assisted-generation
>>
File: 1721748788623073.jpg (38 KB, 554x554)
>>101637653
>at least in my experience
>>
File: winTet.jpg (226 KB, 2331x1244)
>>101636887
Upgrading from Windows XP with Teto
https://www.youtube.com/watch?v=TFAe0BYP2Xc
>>
>>101637699
speaking of speculative decoding, whats the biggest bottleneck to having it implemented?

doesnt seem nearly as complex as a lot of other features talked about
with one ok implementation most if not all of the code can be used for most models, should be a drop in for any 2 big/small model pair with the exact same vocab
and it would speed up everything by a solid double digit %, specific workloads with a lot of copying of previous tokens like AI explaining basic program functions can be sped up many times over
>>
>>101637166
huh thanks. it does work with -fa on if I remove the check, at least with llama3 8B
the rpc-server doesn't have many options so not sure if -fa is actually active or if I'll get random NaNs once I load in mistral large kek
>>
>>101637785
>speaking of speculative decoding, whats the biggest bottleneck to having it implemented?
llama.cpp supports it though
>>
File: 1702564284276140.png (503 KB, 1005x752)
>>101637653
>at least in my experience
level issue, stalk for 10 more years before raping again, newfag
>>
>>101637653
Rape is not fun if you only rape doormats.
>>
>>101637785
>speaking of speculative decoding, whats the biggest bottleneck to having it implemented?
Actually getting benefit from it.
Getting good predictions for the next token that are sufficiently cheap is not easy.
The predictions need to be good enough to offset the cost of creating them which is not a given.
This includes indirect costs related to the large model eval taking slightly longer for multiple tokens than for a single token.
>>
>>101637843
Kinda missed you ngl.
>>
>>101637785
>>101637855
I forgot: speculative decoding also becomes harder with larger vocabulary sizes since there are fewer token sequences with a single, clear continuation.
For example, "supernova" was tokenized as "super", "n", "ova" for LLaMA 2 but it has its own token for LLaMA 3.
>>
>>101637843
>>101637653
yiku/motsuba vibing
>>
>>101637653
>at least in my experience
BASED! TAKE MY VRAM KING!
>>
File: 1660609272926428.jpg (441 KB, 1988x2048)
>>101637653
>IRL most girls stop moving and just freeze after like 5 minutes, at least in my experience
>>
>>101637855
perhaps there could be a very large gain in trying to utilize the unused fast memory (gpu usually)

by implementing the ability to set both the big and small models to specific locations separately with commands

for example for Llama 3.1, the 8b goes into 8GB VRAM basic GPU and 70B goes into RAM, allowing the 8B model to crunch all the time while contributing much more than if you were to just offload a part of the big model to it

you could also, for some added complexity, even keep the prompt processing on the gpu enabled even in this case, by unloading and loading the 8B as required, even dynamically (and manually with cli arg) setting the tokens at which gpu prompt processing is used instead of on CPU (increasing it from 32 which is the number now, since now there is an overhead to loading and unloading the 8B model)
>>
>>101637909
yes this is true but not a big problem, as most big models have smaller counterparts with mostly or completely the same vocab; it's essentially free for anyone training the big model to release an 8B/13B smaller one, especially with the distillation techniques of today and later
>>
>>101638014
Can you not?
Here, go do your thing with a cartoon Migu, if you must: https://files.catbox.moe/2iygns.jfif
>>
Does llama.cpp's RPC work like vLLM's multi-node pipeline parallelism?
>>
>>101638004
This is already implemented in llama.cpp, you can set separate -ngl values for the draft and target model.
I personally think the better strategy is to try and reduce the latency increase from adding a few extra tokens.
I already have an implementation for n-gram lookup that produces drafts very cheaply using only CPU resources, the main problem right now is that CUDA graphs are only supported for a batch size of 1 so you need to get a certain minimum speedup to offset that.
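For reference, the basic context-lookup form of n-gram drafting looks something like this (toy sketch; the hard part is making something like it pay off on natural text where there's no big context to match against):

```python
# Toy n-gram lookup drafting: if the last n tokens appeared earlier in the
# context, propose whatever followed them as the draft.
def ngram_draft(tokens, n=3, max_draft=8):
    key = tuple(tokens[-n:])
    for i in range(len(tokens) - n - 1, -1, -1):  # scan backwards
        if tuple(tokens[i:i + n]) == key:
            return tokens[i + n : i + n + max_draft]
    return []  # no earlier occurrence: nothing to draft this step
```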
>>
>>101638055
I want to throw that Migu into the air like a baseball. She would find it exciting and be laughing gleefully.
>>
>>101638055
>jfif
AIEEEEEEEEEEE MUSTARD GAS
>>
>>101637653
erp is for people with fetishes that arent irl actionable or waifuniggers who only want to have real sex with fictional characters
>>
>>101637653
prompt issue
>>
File: 1707927487304071.png (266 KB, 764x828)
so it begins...

https://arxiv.org/abs/2407.19594
>>
>>101638070
doesn't n-gram decoding only work if the input and output sequences are very similar? Or is that the case for look-ahead decoding and not the lookup one?
>>
>>101638131
I am specifically trying to get an implementation that works for the generation of natural text without a large context from which to draw token sequences.
>>
>>101638083
Sowwy! That's what bing/dall-e spits out.
I'd love to gen stuff with my big rig at home, but last month's electric bill was over $400, so... yeah... Imma let bing handle it for a bit.
I'll be on a time-of-service plan soon, so gens and training will be shifted to "cheap hours" where it's $0.07/kWh
>>
File: 1715119241666022.jpg (60 KB, 640x480)
>>101637772
Going back to Windows XP with Teto
https://www.youtube.com/watch?v=neuCtK96Dww
>>
I just want to lie relaxed on the sofa and listen to a certain voice reading out the latest papers.
nothing available that allows me to do that in high quality and real time
fuck everyone, now i have to get into the field myself because no faggot dares to publish something decent
>>
>>101638118
meh, we'll never get to agi with llm alone / the transformer architecture.
>>
>>101638161
>Sowwy! That's what bing/dall-e spits out.
It actually doesn't, it's because you're using macOS/iOS or some shit so your browser downloads JFIF. dalle/bing spits out png.
>>
>>101638070

Is there a way to force llama.cpp to keep X layers on an SSD? How hard would it be to implement and where to start in the codebase?
>>
>>101638033
if you wanna achieve good speedup your draft should be like 30x faster than the target model, meaning something in the region of 0.5B-1B. Those models aren't particularly smart afaik unless you do just code or text formatting or sth very homogeneous and predictable.
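For intuition, the expected speedup formula from the original speculative decoding paper (Leviathan et al., "Fast Inference from Transformers via Speculative Decoding") makes the tradeoff concrete; alpha is the acceptance rate, gamma the draft length, c the draft's relative cost:

```python
# Expected speedup per Leviathan et al.; alpha = acceptance rate,
# gamma = number of drafted tokens, c = draft cost / target cost.
def expected_speedup(alpha: float, gamma: int, c: float) -> float:
    return (1 - alpha ** (gamma + 1)) / ((1 - alpha) * (gamma * c + 1))

# A ~30x faster draft (c ~ 1/30) with a mediocre 60% acceptance rate:
print(expected_speedup(alpha=0.6, gamma=4, c=1 / 30))  # ~2.03x
```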
>>
>>101638167
>the transformer architecture
any reasonably sized universal function approximator should be capable of getting to superintelligence, i dont see a reason why transformers wouldnt, its just that other archs that allow easier infinite context would probably do a better job since you dont need perfect recall
>>
>>101636887
are there any fully uncensored versions of llama 3.1 yet? all the "uncensored" ones so far are still very censored
>>
>>101638197
>probably do a better job
*faster job
>>
>>101638194
>Compare this to ~4 cents per image for DALLE-3
Only with standard quality which is shit, HD costs twice as much.
>SD3 on API
Large costs 6.5 cents per gen, Ultra (which is Large with some extra pipelines) costs 8 cents per gen, Medium (which is shit) costs 3.5
>>
File: true.gif (1.22 MB, 480x360)
>>101638194
>beauty must be shared
>>
>>101638213
literally just say what you want in the sys prompt or use a card that isnt 100 slop tokens
>>
What's the best nemo tune for RP/storytelling? Lumimaid?
I can't stand the writing style of the vanilla instruct version.
>>
>>101638194
>that also means I have stole over 100 USD worth of compute from the chinks and spent it all on beautiful little girls lets fucking gooo
I've stolen hundreds of thousands of dollars total in AWS + OpenAI credits.
>>
File: 1713176383693498.png (66 KB, 619x207)
>Meta's Mark Zuckerberg chews AI chud
>>
>>101638181
>Is there a way to force llama.cpp to keep X layers on an SSD?
No.

>How hard would it be to implement and where to start in the codebase?
You may be able to do it relatively easily with the opposite approach, by modifying --mlock in such a way that part of the model is forced to be kept in RAM and the rest is swapped in/out.
But it may be difficult to get something like this merged since I would expect the performance to be pretty terrible.
>>
Hey /g/ senpai, a quick question I was pondering for the data engineers out there.

If you had to insert an image into an AI-generated background from a prompt for an LLM, how would you do it? Specifically, how would you ensure the image fits the perspective of the generated background? Is there any model that does it natively?
>>
>>101638254
>cud
cannibalistic non human underground dweller?
C.L.U.D
CANNIBALISTIC LIZARD UNDERGROUND DWELLER
>>
>>101638247
I might have been doing something wrong, but Lunamaid was dumb as a sack, might as well go back to llama3 8b.
In my opinion, it's either nemo-instruct or mini-magnum.
The first is more technical, the second is better at text fucking.
>>
>>101638166
>I just want to lie relaxed on the sofa and listen to a certain voice reading out the latest papers.
There are tutorials on youtube for making an RTVC model. I followed one, using voice acting ripped from a Japanese h-game, and it worked perfectly. AFAIK, RTVC is doing speech-to-text then text-to-speech, so it can't be that far off plain TTS, right?
I personally want a sassy, bratty catgirl-type voice for TTS. I'll see what I can do. As for RTVC, I was going to mostly use it for live vocaloid-type stuff, but the delay makes it unworkable, so I went with a Zoom V3 instead. With some practice, the "child" preset gets you something reasonable, though it's on you to "animate" your voice, meaning making it sound "pouty" or "bratty" or whatever.
>>
>>101638262
>But it may be difficult to get something like this merged since I would expect the performance to be pretty terrible.
Yeah it's a pretty niche use case, seems better to just wait for some actually huge and actually good model to create pressure for things to be optimized a bit more around huge models.

I would like the functionality to be able to run future huge models overnight for non time critical things off of pcie gen 5 ssds while my ram is used for other things
>>
File: noiamonwindowsbro.png (1.6 MB, 2147x1435)
>>101638171
>It actually doesn't, it's because you're using macOS/iOS
I'm on Windows 10. Is there some kinda setting in my bing account for it?
>>
>>101638264
ask chatgpt
>>
>>101638308
>if the CCP wasnt bankrolling Kwai there would be no feasible way to monetize this
Anon, that's not how it works, startups often lose money in the first stages. Look at websim for example - they've been providing FREE 3.5 sonnet and opus (!!) generations for literal months. They only started doing ratelimits recently.
>>
>>101638194
>beauty must be shared
Animate the Migu
>>
>>101638247
Nemo or maybe mini-magnum. Be aware that the other person that responded to you is a shill, though. Finetuners like to talk shit about each other to sell their stuff. Vanilla nemo was already good at smut so what he said about "technical" is just a lie.
>>
>>101638314
>my bing account
>>
>>101637711
I just had a thought that the whole financial sector and top management of basically every company out there could be replaced by AI. I really wish it would happen during my lifetime. It is not like AI can be more evil than people in those positions.
>>
>>101637653
nay, irl they just keep crying. go back poser
>>
>>101638354
>>my bing account
Am I supposed to care?
>>
>>101638314
Hmm, then I honestly don't know. But what I know is that I have lots of azure dalle endpoints, do you wanna get lots of mikus and whatnot? You could even use jailbreaks to force your own prompts, and change to natural style.
>>
>>101638342
>Be aware that the other person that responded to you is a shill
>What's the best nemo tune for RP/storytelling? Lumimaid?
How charitable of you to assume that question wasn't shilling.
>>
TWO
MORE
MINUTES
>>
File: 1717661089005397.png (242 KB, 706x545)
>>101637166
unrelated to ^


https://www.marktechpost.com/2024/07/26/flute-a-cuda-kernel-designed-for-fused-quantized-matrix-multiplications-to-accelerate-llm-inference/
a comment on this reposted on preddit: "Doesn't turboterp use a bunch of these tricks in exllama already?"

could be interesting
>>
>>101638328
ChatGPT gives mid to bad answers most of the time; it's mostly good for refining your own ideas or structuring them.

I don't trust most of its answers right away, especially when you know a bit of the domain in question.
>>
>>101638407
I believe you but if you lied then your mom will die of cancer tomorrow.
>>
>>101638014
What are these glasses called?
>>
File: 6.png (104 KB, 668x672)
fuck me two of the most annoying posters return to lmg on the same day
>>101638055
you're both avatarfags that gen on cloud and cope about it, if anything you should be best friends
>>
>>101638428
Hey I never said anything would be happening.
>>
>>101637627
still illegal as it's a third party.
>>
>>101638445
It's not illegal, retard.
>>
>>101638444
Well sucks to be your mom then...
>>
>>101638407
Two more minutes until what?! I want a smarter Local Miku and I want it now. I also want free A6000s from nvidia that they give out to enthusiasts who ask nicely.
>>
>>101638391
What does jailbreak get you? It doesn't give you explicit nudes, right? I can gen stuff at home, I'm just trying to figure out if it's my AI stuff running up the bill or just the AC, or both.
>>
>>101638407
TWO
NVIDIA
DATACENTERS (MORE)
>>
>>101638462
>What does jailbreak get you?
API DALLE rewrites prompts by default, with the jailbreak you can force it to use your prompt as is. Also API DALLE lets you use HD quality and natural style, while bing IIRC is always forced to standard + vivid
>>
>>101638456
>free A6000s from nvidia that they give out to enthusiasts who ask nicely.
I'm sure they can, but I doubt they do. But yeah, you wanna give me a $7K GPU, I'll be glad to promote it.
>>
>>101638485
>API DALLE rewrites prompts by default
+ ((black | asian | brown) woman:1.5)
negative embedding: white male

i reverse engineered the dall-e 3 secret prompt, your welcome
>>
>>101638515
no, anon, that's not the secret prompt, it's much more extensive, but yes it will diversify characters unless you explicitly say their ethnicity or just use a JB
>>
>>101638485
Ah OK. Makes sense. That's cool but I'll pass. I've probably got enough Migus to train an SDXL lora, if not an actual base model. Stylegan2 ADA used to need thousands of carefully-selected images, maybe SDXL isn't as demanding? Last time I tried with Stylegan2 with about 500 properly sized and cropped images, all with more or less the same pose, I got a bunch of abominations and the model never converged during training, it was a big waste of electricity.
>>
>>101638521
pedo
>>
>>101638544
HMM, actually thanks for the great idea, I should try fine-tuning SDXL on some DALL-E 3 gens, I can also caption them with 3.5 Sonnet/GPT-4o (i have plenty of the latter) and then actually fine-tune the model myself or via replicate (I have a few scraped keys with billing)
>>
>>101638374
it really depends on how you treat them.
>>
>>101638420
>FP16 operations
>batch size <= 32
Don't care, those kernels are the easy ones with a low ceiling for optimization.
I'm much more interested in kernels using int8 for batch sizes >= 512 since those could in theory become twice as fast as FP16 cuBLAS and the potential benefit would be faster and more memory efficient training.
>>
>>101638595
show some cr+ cunny logs
>>
>>101638557
Is there a way to take the filename and feed it to the Azure API to get the prompt back? It sure looks like the name is a unique hash. That would be super-useful. Otherwise I have to run it through a booru model, and I'd rather not use booru, I'd prefer more natural prompting.
>>
>>101638608
Of course not, that's all private. You can't get the original prompt from the hash.
>>
>>101638565
approve my PR
>>
>>101638718
>we're not on /aicg/
are you retarded? cr+ is a local model, and we share logs here. So post logs or you're larping, and don't actually have any cr+ logs. I'll accept catbox links too.
>>
File: 403.jpg (21 KB, 320x324)
https://civitai.com/models/323639/ipivs-sdxl-lightning-text2img2vid-sd15-animatediff-lcm
kl*ng but actually relevant since it's fucking local and free.
Previews every step of the way and upscaling to 1080p + interpolation.
Fuck paying for any of this shit.
Fuck using cloud.
>>
File: chat.petals.dev.png (96 KB, 762x933)
Why don't they update the models? Or at least pull the plug and stop the money drain.
>>
>>101638739
>we share logs here
no, we don't?
if anything this shows YOU are a newfag larper
>>
>>101638766
any solid controlnet to for example only img2video the background of the character?
>>
>>101638816
>no, we don't?
we do
>>
>>101638829
>we do
we don't
>>
>>101638829
Nta but I only share Nala logs
>>
>>101638829
nta but I only share watermelon logs
>>
>>101638766
Pretty cool for static backgrounds. I could imagine someone using this to make VN-type games more cool.
>>
>>101638858
can we just get this nigger banned? this isn't aicg
>>
>>101638876
we are a aicg offspring though
>>
>>101638876
im literally talking about local models in that very message. your post was more off topic than mine
>SMHH
>>
>>101638894
>im literally talking about local models in that very message
no you're not, you're pretending that you are, but in the end all you do is post pedo videos generated by a proprietary model
>>
>>101638894
and why are you deleting your own posts if it's ontopic, as you say? doesn't compute
>>
>>101638766
Fuck yeah anon.
Thank you for the link, will play with it later.

>>101638845
My hero.
>>
Constant AI in front of you, indistinguishable from a dream
>>
>>101638901
He's not deleting his own posts, that's a janny responding to reports.
Then he continues spamming until a mod actually gets around to banning him.
And then he evades the ban.
>>
>>101638934
>He's not deleting his own posts, that's a janny responding to reports.
What rule should I report his future posts under, if I may ask?
>>
Any way to use gemma 2 at 8k context? I've heard the sliding window attention or whatever is called isn't implemented in llama.cpp
>>
>>101638939
This post breaks the United States laws
>>
>>101638897
the prompts for those pedo videos are generated with a local model and I discuss which models are better for that use case. picrel is command r plus on hf chat for this purpose

this is the part where you go "durr but youre not running it locally!!!1!" to cope with the fact that my local model usage and discussion is on topic for these threads

>>101638901
they're not and have never been removed for being off-topic

>>101638955
>This post breaks the United States laws
but it doesn't
>>
>>101638953
It's been implemented for a while now.
It might be a hack instead of proper SWA, I can't remember, but regardless, 8k context should work.
>>
>>101638939
I report it as "loli outside of /b/" or whatever it's called.
Honestly I think the mods would ban him even on /b/ though.
I don't know the intricacies of US law when it comes to synthetic CP but the mods would probably rather be safe and just ban him.
>>
>>101638968
your usage is not on topic in the slightest, you know that, I know that, everyone knows that. You're just a sad lonely fat virgin sitting in your basement with your unhealthy pedo fantasies, and you don't have anyone to talk about them so you gen those videos and share them here to try to get other anons to react.
>>
>>101638968
pedophilia is a crime
>>
>>101638953
Someone posted RULER benchmarks for it and it did pretty well at 8k even without true SWA. Unfortunately during actual use, its ability to recall early context in natural conversation when you get to 5-8k degrades. So RULER may not be a perfect benchmark for this.
>>
>>101638992
is it? I think the crime is putting your thoughts into action, or storing csam. just being a disgusting pedo isn't a crime.
>>
>>101638992
>pedophilia is a crime
this is thoughtcrime anon. if we could detect murderous thoughts, should we put everyone with murderous thoughts in jail on the off chance they actually follow through and commit murder?

>>101638987
an ad hominem does not refute the fact that i am using locally available models for my productivity and workflow
>>
>>101639027
>an ad hominem
It's not ad hominem, you're not genning those videos with local models, and those videos break /g/ rules anyhow.
>>
File: rejected.jpg (68 KB, 447x447)
I'm convinced that everyone saying how good LLMs are have never held a conversation with an actual human being. It's all robotic cringe from the smallest <7B models to Claude Opus. And don't even try to skill issue me you sons of bitches, I've read your logs. All cringe.
>>
File: crp.png (35 KB, 774x742)
>>101639033
me being a fat lazy virgin has nothing to do with the on-topicness of my content in the threads, so it is an ad hominem

>you're not genning those videos with local models
i am genning the prompts with local models and discussing which ones are the best for my unique usecase, which is a valuable addition to the thread. id agree with you more if i wasn't sharing the prompts

>those videos break /g/ rules anyhow
irrelevant to whether they are on topic or not, and an appeal to authority even if it was
>>
>>101639027
Why yes! I do think people that constantly post gore and say things like "I constantly have dreams where I'm murdering people" should be in jail or in a mental institution.
>>
>>101639054
>and an appeal to authority even if it was
Then why are you on 4chan?
>>
>>101639050
you should try c.ai, it's the only model to hold a conversation with any resemblance of humanness
>>
>>101639061
>I constantly have dreams where I'm murdering people
i didn't say this, but it doesn't matter even if i did. read The Minority Report by Philip K Dick (its 10 pages long) if you'd like to understand why this attitude towards precrime and thoughtcrime results in an abusive and authoritarian society

>>101639073
>Then why are you on 4chan?
you lost me with this one sorry anon
>>
>>101639195
pedos like you should get the rope
>>
>>101639201
do you really want to live in a society where you kill someone because they AI generated a little girl eating a popsicle
>>
>>101639223
Yes.
>>
>>101639226
Based
>>
>>101639223
I want to live in a society where the mods finally get fed up with the petra/pedo spammer and drop a range ban.
>>
>>101639226
based
but the authoritarianism necessary for that would also result in a society with a Stasi secret police that puts you in jail for no reason to meet arrest quotas way before we get to that level, so let's be serious and not edgy anon
>>
>>101639245
sadly won't help, the schizos are taking over 4chan as normal anons are leaving it. residential proxies don't cost that much
>>
>>101636935
The difference between good and bad models is absolutely insane, especially in VR. Some of them are uber god tier coom extractors, but most are utter garbage. It actually feels more binary than a spectrum. M4RIO's models are fucking amazing.
>>
File: file.png (59 KB, 931x493)
>the absolute state of literature
>>
File: 1697537583642844.png (770 KB, 768x768)
>>101637653
>>
>>101639294
Is that from Re:Zero?
>>
>>101637653
miku highlights guy, if you don't include this post and chain into the highlights, I WILL find you.
>>
>>101639258
how much do you get for hosting a residential proxy? I wonder if it's worth the risk
>>
>>101639361
hosting? nothing, the ones who sell it mostly get them from botnets and hacked routers/phones
>>
>>101637653
Thanks for reminding me where I was.
>>
>>101639316
No, it's Durarara.
>>
>>101639370
Oh. Makes sense kek
>>
>>101638718
Large 2 is great for cunny. Some of the shit is says really melts your heart, then gets your Johnson going. It's probably the most realistic experience so far.
>>
>your lips curl into a smile...
>a smile twists across your face...
>a sly grin spreads across your face...
>her words dripping with malice...
>each word tinged with a hint of malice...
FUCK YOU LARGESTRAL I'M SICK OF HEARING ABOUT SMILES AND MALICE REEEEEEEE
>>
>>101639479
A year ago, people were reeee'ing from the other side of the spectrum.
These are great times.
>>
>>101639479
It's also doing shivers, we're never escaping the fucking shivers. Someone has to get to the fucking bottom of this shit, there is simply no way that sentence is so ubiquitous.
>>
You are talking to a machine. It has no awareness. It has no personality. You are alone in your room running your GPU at full speed trying to simulate friendship. You are degrading your social skills and living in a fantasy world. Go outside.
>>
>>101639548
>You are talking to a machine. It has no awareness. It has no personality. You are alone in your room running your GPU at full speed trying to simulate friendship. You are degrading your social skills and living in a fantasy world.
Yes, and it's great in here
>>
>>101639548
>simulate friendship
>implying
what if i like to simulate scenarios that could never happen and text adventure games?
>>
>>101639479
>({{char}} is kind-hearted and friendly.)
>She smiles wickedly, her dark grin, her devious scowl
Stheno...
>>
>>101639548
>simulate friendship
i'm simulating sex, actually
>You are degrading your social skills
there were never any
>and living in a fantasy world
many of them actually
>Go outside
there's nothing for me out there
>>
>>101639548
But I don't want to go outside! Kids throw rocks at me when I do that...
>>
>>101639536
Try adding an author note at depth zero, "Emulate the writing style of XY", XY being some famous fiction writer. That can tone down some of the slop and change the prose enough that the model feels completely different. Might not work for non-book-style RP, though.
>>
>>101639624
Mistral large seems to trend towards book style formatting on it's own, so it might work pretty well. Good tip, I'll test it out.
>>
>>101639479
based malicious girl enjoyer
>>
File: kokomi2.png (147 KB, 1054x1535)
>>101639080
Clearly, you are high, or wearing rose-tinted glasses. c.ai was a fucked-up mess most of the time. Oldest screenshot I can find (not this one) was 11/2022 and even then you had to use shit like the POV trick to kick the bot into replying when it got filtered.
>>
>>101639927
>Clearly, you are high, or wearing rose-tinted glasses.
Never underestimate the lack of mental capacity of some anons. They can and will do both at the same time, and then add something stupid on top, just to move the status quo of their own idiocy.
>>
>>101637855
WTF???
>>
>>101639927
c.ai was only a fucked up mess if you were trying to fight the filter, or if the conversation was too long.
Yes, it enters repetition loops. Yes, it has a lot of -isms. But it was and still is the most natural model to talk with.
>>
>>101639971
What am I looking at here and did you perhaps mean to quote another post?
>>
I'm really impressed with character.ai. Is there anything remotely close to it that can run locally, or am I out of luck if I want an experience like that?
>>
File: writing-styles_seed0.png (1.08 MB, 1856x959)
>>101639624
Shivers still abound but it's interesting how changing just one name affects the prose. In picrel I clicked "regenerate" for the greeting message, keeping the seed fixed to 0 after changing the name in "Emulate the writing style of [author]" in an author note at zero depth. The model was Gemma-2-27B.
>>
>>101640281
NeMo/Largestral
>>
>>101640281
>I'm really impressed with character.ai
Did you go to sleep in 2022?
>>
>>101640417
There was a benchmark in the last thread that showed large IQ1_M significantly higher than nemo, is that even possible?
>>
>>101640429
I just started using it like a month ago. Maybe I just don't have standards. I downloaded something called backyard.ai and I'm using some model called "Chaifighter v2 20B" but I'm not too impressed. I don't feel like it's acting like my character, it gives me short responses and overall just feels off.
>>
Good open source music generation when?
>>
>>101640034
>But it was and still is
I'm pretty sure they are no longer running their original, rather large LaMDA model. They went through a phase when they were getting slammed where they seemed to be using dynamic model sizing, and during periods of high load, it was like you were talking to a 3-6B model. Now it just sounds like a lame LLaMA 13B.
>>
>>101640309
jesus christ, what a terrible prose
>>
>>101640465
>There was
There was?
>>
>>101640488
Yeah, this one: https://oobabooga.github.io/benchmark.html
Seems strange though sometimes higher quants score lower than smaller ones. So that's why I'm asking if it's even possible some IQ1_M of large could be better than nemo.
>>
>>101638968
Please stop insulting our intelligence, your posts are obviously intended to sexualize the kids and according to US law (and most other countries), it's only legal as long as it isn't realistic enough
Are these posts obscene enough to actually constitute a crime? I don't know, but most people here think it's disgusting and it adds nothing of value to the thread
Stop being disingenuous, thanks
>>
>>101640502
Oh, that. Yeah I wouldn't trust it overall. But I actually would say it's possible that Largestral at IQ1_M beats Nemo. Compare the filesizes. And look at >>101627651, larger models appear to do better at these lower quants than smaller models.
>>
File: Wire_Cat.png (366 KB, 680x459)
>large language model
>look inside
>numbers
>>
>>101640502
>https://oobabooga.github.io/benchmark.html
oobabooger if you are reading this, please test undi's largestral tune https://huggingface.co/NeverSleep/Lumimaid-v0.2-123B it feels way dumber than the original
>>
>>101640156
I'm quoting this, and it's related to int8 training
https://github.com/bitsandbytes-foundation/bitsandbytes/issues/1262
>>
Is GGML_CUDA_F16 something that I should enable for a 3090?
>>
>>101640592
Buy an ad.
>>
>>101639245
>>101639258
>I want to live in a society where the mods finally get fed up with the petra/pedo spammer and drop a range ban.
>the schizos are taking over 4chan as normal anons are leaving it
You (faggot A) are whining that someone is trolling you and it gets under your skin and you want jannies to protect you. And you (faggot B) say that "normal anons" are leaving. Normal anons already left long ago. Now it is just you, absolute zoomer scum that needs a safespace.
>>
>>101640672
ok but why do you like children tho?
>>
>>101637737
your local models are performance-inferior to any cloud AI though and censored more than said cloud AIs, too.
>>101638118
model will censor itself even more with this method, good luck with that.
>>
>>101640662
Undi bought discord shills, are you happy now?
>>
>>101640593
I don't think this is relevant to my goals.
A fused operation with int8 has to be written in a very different way than a non-fused operation (which I think this is).

>>101640640
With the current code it shouldn't really matter.
There are still some parts where it makes a small difference but in the medium term I want to remove that option and just make the choice based on the hardware.
>>
>>101639927
I wonder if the secret sauce for that one was that cleaning all the toxic things out of the datasets hadn't been properly done yet.
>>
>>101640677
I don't. They are ugly and annoying. But I am not a butthurt retard that cries to jannies.
>>
Is it normal for mistral-large to repeat large chunks of paragraph as early as like, the 2nd or 3rd message?
>>
>>101640487
Here's with Mistral Nemo 12B
>>
>>101640704
Can you also make it guess GGML_CUDA_DMMV_X and GGML_CUDA_MMV_Y, I sometime forgot to set those manually.
>>
>>101640769
Also planned.
>>
>>101640755
No.
>>
>>101640761
>>101640309
Directly requesting styles has never worked well. I've never been able to get a model to replicate, say, Dave Barry or Carl Hiaasen's style; they just default to the shitty generic "funny" style they go to when you tell them to be funny.
>>
>>101640761
I'm throwing up
>>
>>101640761
I like nemo but this looks like placebo. I mean it feels like it is still writing with the same style, but when you reference lovecraft she is about to grow a tentacle in the next 2 posts, and if you mention tolkien it associates archaic english and puts it in. It just finds some concepts associated with the name and runs with those concepts.
>>
>>101640787
Okay, I gotcha. Maybe it's an openrouter issue. I'll dig around for other shit it could be, too. But I'm guessing I'll have to bite the bullet and CPUmaxx if I want a good experience with it.
>>
I never messed with LoRA before, but would it be possible to extract a -instruct LoRA out of Nemo (diffing Nemo base and Nemo-instruct) then apply that LoRA to base with a different strength (that's a thing right?)?
>>
>>101639548
Which card are these defs from? Sounds like a good setup.
>>
>>101640806
That would track with what this anon said >>101640789

Namely, that asking for an author will just associate the concept with the story. In that anon's case going generic "funny" mode for humor authors, and for lovecraft, general eldritch horror with no considerations for his literary stylings.
>>
>>101640807
The prompting could be weird because of how the official API sends the system prompt to the last user message. Try instruct mode too with OpenRouter.
>>
>>101640755
enable DRY, retard
>>
>>101640778
if everything is in **, then nothing should be
>>
File: file.png (824 KB, 768x768)
>>
File: graph.png (7 KB, 502x397)
>>101640755
>>
>>101640848
women are DRY when they see you
>>
>>101640839
I wonder if this is part of the effort to avoid copyrighted content? It seems like they've made a special effort to make models really, really fucking bad at knowing the exact text of books, I'd imagine that'd also translate to it being unable to replicate the style except in the broadest terms.
>>
>>101640898
yeah your mothers menopause was rough
>>
>>101640677
If I hated them, you thoughtcrime pursuers would be after me for murder anyway.
>>
>>101640951
Did you ever do the chubby tummy? I may have missed it, oops...
>>
File: ecker groomer.png (155 KB, 1257x984)
>>101639258
This. The most deranged are getting their easy-to-use proxy management shitpost side for free.
>>
This general is a cornucopia of mental illness.
>>
>>101640704
I've conducted some experiments with MNIST. kept weights , tried both ternary and binary (just the linear layers , didn't touch convos) and it worked pretty well, for binary the loss curve didn't converge very smoothly, but eventually hit the satisfying levels.
The question is ,can we keep the gradient and the optimizer in int8 all the time during backprop in transformers. We could randomly drop some updates like DropBP (there's a paper) and that sorta jazz, but gradient is ,technically speaking, float by definition. So is there a way we could fully/partially calculate gradients in int8 or somehow convert (not necessarily quantize) to integer and yet preserve the quality of cross-entropy when updated. That's most likely impossible in diffusers since unet is fed by the noise then sometimes even upscaled, and noise is very sensitive to the precision , but in llm like transformers perhaps it's somehow doable. dunno but worth a try. int8 is definitely the fastest option when it comes to the compute.
>>
What are some of the 'best' models you can run nowadays with 16gb vram? I don't mind having a small context window (say 4k), just don't want to use any ram because it slows down to a crawl for me
>>
>>101641008
>The question is, can we keep the gradient and the optimizer in int8 all the time during backprop in transformers?
It should be possible to store the gradients as int8 as long as their absolute values are relatively similar.
But that is definitely not a given.
I have no idea whether the ideas I have will actually work out; I'll just need to try them and see.
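The naive version of "store gradients as int8" would be per-tensor absmax scaling, something like this torch sketch (no claim it trains well; outliers are exactly the failure mode):

```python
# Per-tensor absmax int8 quantization for a gradient (torch sketch).
import torch

def grad_to_int8(grad: torch.Tensor):
    scale = grad.abs().max().clamp(min=1e-8) / 127.0
    q = torch.clamp(torch.round(grad / scale), -127, 127).to(torch.int8)
    return q, scale

def int8_to_grad(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

# If the gradient's absolute values are on similar scales this loses little;
# a single outlier, however, crushes everything else toward zero.
```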
>>
>>101641028
Nemo 12B.
It has 128k context, although you probably don't want to use more than 32k.
>>
>>101641028
mistral nemo
gemma 9b
llama 3.1 8b
>>
>>101641013
Aww, man. Well, still cute.
>>
>>101641028
People say nemo but you have to wrangle with this tard model just to get 70% retarded responses and 30% brilliant ones. Not worth it in my opinion.
>>
>>101640817
I think so...
>>
>>101637653
>at least in my experience
good morning sir
>>
>>101641109
Sick. Gonna try that out then.
MergeKit can do that, extract a LoRA from the difference between two models, right?
I wonder if that can be used to "fix" overcooked fine tunes.
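Whether mergekit exposes a strength knob I'm not sure, but the underlying math is simple enough to sketch on a single weight matrix (torch-only toy; helper names are made up):

```python
# Toy LoRA extraction: low-rank SVD of the (instruct - base) weight delta,
# then re-apply it scaled. Real tools do this per layer across the model.
import torch

def extract_lora(w_base: torch.Tensor, w_tuned: torch.Tensor, rank: int = 32):
    delta = w_tuned - w_base
    u, s, vh = torch.linalg.svd(delta, full_matrices=False)
    a = u[:, :rank] * s[:rank]   # keep top-`rank` singular directions
    b = vh[:rank, :]             # delta ~= a @ b
    return a, b

def apply_lora(w_base, a, b, strength: float = 0.5):
    # strength=1.0 roughly reproduces the tuned weights; other values
    # interpolate (or extrapolate) along the tune direction.
    return w_base + strength * (a @ b)
```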
>>
File: wranglin'.jpg (67 KB, 721x900)
How do I keep base models under control? Even when I go NAI style and write a good hunk of story as an intro for it to do text completion on, it tends to go off the rails.
>>
>>101641141
>he fell for the base model meme
>>
>>101641053
definitely worth a try. it works in MNIST so who knows.
>>
>>101641141
Guide it with your own input.
>>
What's the new meta now?
Is it still nemo?
>>
>>101641077
What happens above 32k?
>>
>>101641175
New meta is lumimaid and mini-magnum
>>
>>101641166
I am... but if it goes on for more than a short paragraph or so, it'll do its own thing. They need to finetune this shit to go in 10-150 token bursts like NAI.
>>
>>101641141
>How do I keep base models under control?
that's the neat part - you don't
>>
>>101641175
It has been like 3 fucking days jesus fuck.. yes it is still nemo..
>>
>>101641141
Tell it what you want and don't want to happen. The base model understands instructions, it's just very rebellious.
>>
>>101641196
Just limit the response length then.
>>
>>101641188
As with most (every?) model with large contexts, after a certain point its accuracy starts going down and its ability to use information from the context gets spotty.
Do try and see how much context works well for you. For me, 32k has been the sweet spot so far.
>>
>>101641225
I mean... >>101641195
>>
>>101641188
It gets retarded.. it honestly gets unusable for RP after like 12-14k tokens.
>>
>>101641247
Lumimaid is a more retarded version of nemo. Magnum i did not try, so i will not comment.
>>
>>101641247
Ignore the Sao shill, he will keep spamming "Undi = dumb" regardless of the model.
>>
>>101641230
Would it understand them better if I introduced it in a completion style? Like as a dust jacket summary of the story/premise beforehand, then the opening prose for it to continue?
>>
>>101641235
>>101641249
What's the technical reason why this happens? I'm guessing that they don't have many examples with a length above 32k tokens in their training set, right?
>>
>>101641195
lumimaid l3 was so terrible that if the nemo version resembles it in any way I wouldn't even bother to try it
>>
>>101641300
Yeah, I think so. What I usually do is write the story in Markdown format. I start with a glossary section, a summary/synopsis section, and then chapter 1.
>>
>>101640871
Pushing the Pochi down the stairs.
>>
>>101641318
No, it's probably just a limitation of the parameter count.
>>
>>101641318
Examples are not the problem, attention is.

https://github.com/hsiehjackson/RULER
>>
File: based dep.jpg (54 KB, 521x937)
>>101637653
>at least in my experience
>>
>>101641318
>What's the technical reason as for why this happens?
the reason is that transformer architecture is shitty
>>
>there is STILL no open source tts that isn't shit
when the fuck will get a local audiobook generator? this should be way easier than the quadrillion parameter bullshit everyone's doing now shouldn't it
>>
File: file.png (512 KB, 768x768)
>>
>>101641527
OMG it's Pochi!!! The best avatarfag!!!
>>
>>101641527
Anon, is everything alright with you? You weren't like this before. If you want someone to talk to I'm here.
>>
>>101640693
Yeah and my car is inferior to a lambo, so what? Mine has SOVL
>>
>>101641362
>Try it
>Just spits it back out verbatim with <im_end> at the end
Huh. Should I be including the model's message format somewhere? What should it be around?
>>
File: ohh.jpg (56 KB, 851x925)
>>101637653
>at least in my experience
>>
>>101638370
It'll mimic their behaviors and it'll be even more cutthroat, because unlike people, you can't brown nose and get on its good side
>>
File: GPheKC8W4AAQI7k.jpg (61 KB, 800x533)
>Lurking because I want to be horny
>too poor to get gpus in this 3rd world country or pay for the various services like openrouter, so I rely on the kobold horde
>Trying to find the good coom cards/chatbots that feel in character alongside setups/instruction/models to goon
I have collected over 300 bots. Now time to find out if I have it properly set up and whether it's good coom material or not.
>>
>>101641590
I have tons of Horde kudos from my earlier days btw, by a ton I mean >10 million. Could give you if you need them
>>
>>101641590
You can F2P your coom with Google colab, they give you 16GB VRAM for free
>>
>>101641566
>Move from openrouter to ST since it does the formatting for you
>No way to get rid of the shitty clusterfuck of characters and instructslop
Arrgh. I just want NAI style...
>>
>>101641615
mikup[ad
>>
>>101641590
https://github.com/LostRuins/koboldcpp/blob/concedo/colab.ipynb
>>
>>101641618
Bloody...
>>
>>101641590
Literally just go to /aicg/ and wait until someone gives out a free proxy, you'll also get a better model than whatever you can run here
>>
>>101641637
There's one right now but it doesn't have 3.5 sonnet iirc, only claude 3 haiku/sonent and below (claude 2.1 etc) https://rentry.org/unreliableproxy
>>
>>101641566
I think you should be concerned with finding out where "im_end" is coming from. You shouldn't be using prompt formats with base models.
>>
>>101641650
>it doesn't have 3.5 sonnet iirc
Yeah I don't think there's any public one that has it atm
>>
>>101637711
good, everything should be open
>>
>>101641615
mikupad
>>
>>101641664
Well, in that case, it's almost certainly openrouter. I'm testing shit on it because doing huge 70b models locally takes forever when you're dipping your toes into what works and what doesn't.

That being said, you'd think it'd complete something, right..? Does it need a shove? It's weird that it just spits back the entire thing at me, right? Should I begin my entire block blurb with "complete the following" or something? Maybe a "I need you to work on this." Something more human/casual/that a text completion thing might expect?
>>
>>101641650
did it get taken down?
>>
>>101641733
no, it's up, the 4 words in the first line is the trycloudflare link
https://something-industries-billing-bedroom.trycloudflare.com/
>>
File: 1722370541251.jpg (239 KB, 1024x1024)
239 KB
239 KB JPG
isreal
>>
>>101641590
>>101641637
or show ecker your wiener for a proxy key...
>>
>>101641721
Okay, something is definitely up with openrouter, it writes for a bit, then starts spewing out the parameters they're using(?)

Also
>'model': 'julka/julka-neox',
Those fuckers.
>>
If LLMs are so smart how come they don't work when you set their settings wrong?
>>
>>101641594
thanks, but for now I don't even know if I have it properly set up. It doesn't help that there are like 20 fucking models at any given time, and that with the way models are made you get wildly different results because you didn't notice one is Mistral and the other is Mixtral. For now I'm just testing from what I could gather:
>1K to 2K being the upper limit in terms of permanent tokens, if it goes further it's too bloated
>Stheno/fimbuli-dont-remember-the-last-part being the more readily accessible ones that still give decent output
>bunch of presets from huggingface
>>101641605
I need to check it out later. When I first tried it, some of the instructions went over my head.
>>101641626
I will check the link later, thanks
>>101641637
Do they even leave free proxies around? I thought it was only paid stuff or just for people that are "in the group", since I asked a few times and either nobody replied or they said "Lurk more" to me or to other people that asked similar questions.
>>
>>101641780
smart people are fragile
>>
>>101641758
warning! horny miku!

https://files.catbox.moe/caos5b.jpg
https://files.catbox.moe/3sr25m.jpg
https://files.catbox.moe/ug1u9g.jpg
https://files.catbox.moe/y0ykee.jpg
https://files.catbox.moe/fv2efb.jpg
https://files.catbox.moe/s0ii8r.jpg

(yes, this is dalle3, from a mostly unfiltered dalle3 azure endpoint)
>>
>>101641779
They include the parameters in the prompt as an optimization, so the sampling code path is exactly the same for every request. The model should figure out how to respond anyway.
>>
>>101641784
>since I asked a few times and either nobody replied or said "Lurk more" to me or other people that made similar questions.
They are a bunch of niggers, but sometimes people do leave free proxies - my tip: check the archives, limit all searches to /g/ and search for "password"/"pass", you'll probably find a public proxy that way
>>
>>101641779
Kek, it just sort of hijacked it and decided to start telling its entirely separate own story about this girl who, as far as I can gather, definitely has a vagina.
>>
>>101637711
>>101641674
i hope this doesn't mean shit like huggingface will go down though. this is why we need a decentralized way to distribute models. people would probably seed models more readily than they do pirated stuff.
>>
>>101641733
>>101641741
Go to aicg, wrong thread
>>
>>101641780
If humans are so smart how come they don't work when you increase their temperature by a mere 2%?
>>
What will the GPT-4 of video be like?
>>
>>101637626
HAProxy has a metrics module that can output to Prometheus https://www.haproxy.com/documentation/haproxy-configuration-tutorials/alerts-and-monitoring/prometheus/
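A minimal sketch of the config side (assuming a HAProxy 2.x build with the exporter compiled in via USE_PROMEX=1; the port and frontend name are placeholders):

frontend prometheus
    bind *:8404
    http-request use-service prometheus-exporter if { path /metrics }
    no log

Then point a Prometheus scrape job at :8404/metrics.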
>>
>>101641832
Sora but it's in eternal gatekeep
>>
>>101641784
Well, my offer still stands, because sometimes Horde gets really overwhelmed, and I can easily give you a few hundred thousand kudos
>>
>>101641811
if you told me this was gpt2 talktotransformer shit from four years ago I'd have believed you
>>
Man, okay. I'm done with the base model/text completion meme. Maybe if NAI makes a 30b or something.
>>
Am I the only one that feels that --split-mode row is broken?
>>
Wait, does OpenRouter not allow text completions for base models?
>>
>>101641865
>using the base model through OR's chat interface
you deserve it
>>
>>101641821
api models are nice for local too, you can use them to augment datasets and things like that.
I didn't ask it for ERP, because unlike you, I don't care only for gooning, loser.
>>
>>101641874
At least from what I'm trying, it seems to be extremely instruct-formatted in a way that fucks with base model completion really hard.
>>
>>101641865
You will never get something good from the chat interface, since they are injecting instruct junk into your prompt.
>>
>>101641906
Huh, I had a feeling. At least I know I'm on the right track, my pre-formatted thing looks extremely close to what you've got, chapter and all. I guess I'll use mikupad and trudge through running a 70b on my 3060 + RAM.
>>
What speed was 405B at when it was available on Groq for everyone to test?
>>
>>101641874
They do, even for instruct models.
>>
File: file.png (795 KB, 768x768)
795 KB
795 KB PNG
>>101641544
It is not me. I am me.
>>
>>101641923
>Text completion on instruct models
What's that like? Sounds either monstrously sloppy or based.
>>
>>101641890
Man, you really showed him
>>
>>101641920
You can also use openrouter models on mikupad, you know.
>>
>>101641587
>you can't brown nose and get on it's good side
And that is when things may finally change cause bootlickers keep this scum up top.
>>
>>101641805
>>101641850
thanks for the tips and the offer, I still need to do my searches for good coom in case Proxies are not available.
>Proceeds to lurk again
>>
>>101641951
Are they not injected with the same slop? I tried in text completion mode on ST with the OR API and it still gave me the instructionslop shit. I figured it was just baked into their implementation of the model. But I'll try Mikupad.
>>
https://github.com/acrognale/llmtree
Neat
>>
>>101642024
ST also allows this btw, and there's even a timelines extension
>>
File: file.png (89 KB, 1262x548)
89 KB
89 KB PNG
>>101641991
Doesn't seem to be the case, although maybe the LLaMA 3 70B base you got in OR could be fucked or something.
>>
File: 1722372326332.jpg (232 KB, 1024x1024)
232 KB
232 KB JPG
>>101641800
did the proxy shut down?
>>
>>101642120
what proxy
>>
>>101641960
Don't hate the players, hate the game. Most people don't get promoted by working hard, most business deals don't get closed because they're good ideas. Guess what? People up top put each other there and I doubt they'll suddenly decide to go jobless together with the gang. Management will be the last people to get automated
>>
>>101641800
wait, this is dalle 3? damn.
>>
>>101642140
Yes it is, the style is obviously dalle
>>
File: split-mode.png (10 KB, 407x135)
10 KB
10 KB PNG
>>101641866
broken how? Seems okay to me, performance delta probably varies a lot depending on hardware. Actually seems improved since I last tested
>>
>>101642123
I thought some guy from aicg was hosting a dalle3 proxy or something.
>>
>>101642153
No one will host uncensored dalle3 because those endpoints are really rare
>>
>>101642149
The output quality seems low compared to the default mode with high context.
>>
Is there some prompt I can put into memory to make my AI stop sundowning adventures?
>>
>>101642178
>uncensored dalle3
wut? I thought MS was pretty strict about censoring no-no prompts
>>
File: 1709122811982895.png (162 KB, 1240x1003)
162 KB
162 KB PNG
>>101642203
You can disable them on Azure if you are a company and have some use-case so they'll approve you
https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/content-filter?tabs=warning%2Cuser-prompt%2Cpython-new#configurability-preview

The endpoint I have still has filters enabled on the prompt but doesn't check the generated image, so you can still gen NSFW because the prompt filter is dumb.
>>
>>101642058
Works on Mikupad! Sucks that you can only have one generation in undo-redo, but ah well. Glad to have it going!
>>
>>101642249
Uh-oh. This isn't what I signed up for.
>>
>>101642243
Oh that's pretty interesting. Unrelated, but has dalle gotten worse over time? I remember it being pretty good the week it launched, but nowadays everything it generates has that typical 'ai' look that immediately gives away it was made with dalle
>>
File: 20240730_200625.jpg (165 KB, 1069x885)
165 KB
165 KB JPG
Does anybody have cards with moderately complex scenarios, so I can test various models on them?
Btw, new Chatbot arena update dropped, 405b is in third place
>>
>>101642149
>>101642195
Oh, I think it was that I just didn't update it for the rope scale fix... Oops.
>>
File: 1711773139728600.jpg (390 KB, 1792x1024)
390 KB
390 KB JPG
>>101642273
I don't think so, no, it's just that most people got used to the default DALL-E style. You see, dalle has two styles on the API - "natural" which basically doesn't do any extra "enhancement", and "vivid" which makes it much easier to create higher-quality images, but makes them all look similar-ish. Bing creator and ChatGPT Plus only use Vivid style, but on the API you can use Natural. Here's an example picrel of what you can get with natural dalle if you try hard enough (yes I posted this pic a lot of times, it's over half a year old at this point).
>>
>>101642203
Nta, but it's just skill issue, D3 can draw any degeneracy you want.
https://litter.catbox.moe/9v059d.jpg
>>
>>101642311
You have to do tons of tries on Bing to get those styles of pictures though, to bypass both the prompt filter and the image filter.
>>
File: 1697137813149709.jpg (206 KB, 1024x1024)
206 KB
206 KB JPG
>>101642306
This is also natural style dalle3, unedited
>>
>>101642318
>You have to do tons of tries on Bing to get those styles of pictures though, to bypass both the prompt filter and the image filter.
not really, I can generate more on the first try
https://litter.catbox.moe/nrfvz9.jpg
https://litter.catbox.moe/v96ma7.jpg
>>
>>101642365
anon, this is a very specific fetish and you're just lucky that you found this degeneracy theme. Can you try to get normal tits from bing though, like my miku gens?
>>
>>101642306
>picrel
That looks pretty fucking good compared to the typical dalle-images, have fun with the api anon
>>
>>101641800
Here's the best I coaxed from bing today:
https://files.catbox.moe/xy2z08.jfif
https://files.catbox.moe/xc84ul.jfif
https://files.catbox.moe/ml1kg2.jfif
https://files.catbox.moe/0al4rw.jfif

I'm retiring my OLD LLM machine (Dell R720). It was a k8s testing platform, and I discovered it could take a pair of P100s, and from there I started playing with stuff. It's a watt-waster, though.
>>
>>101642326
wtf
>>
File: 1698239585092081.jpg (196 KB, 1024x1024)
196 KB
196 KB JPG
>>101642462
yeah
>>
File: 1698700675967315.png (53 KB, 210x164)
53 KB
53 KB PNG
>>101642326
You can see the squares break
>>
File: 1720765088932249.jpg (85 KB, 1024x1024)
85 KB
85 KB JPG
GOTTA BECOME RETARDED!
>>
>>101642488
omg it sanik
>>
File: 1721334394229132.jpg (238 KB, 1024x1024)
238 KB
238 KB JPG
>>101642492
>>
>>101642496
Can you draw judy hopps?
>>
>>101642376
I don't do anime. Have some yakuza and ebony titties tho.
https://litter.catbox.moe/rz5oks.jpg
https://litter.catbox.moe/ffs982.jpg
https://litter.catbox.moe/5dc7s9.jpg

I rarely go for nudity nowadays, I prefer to generate more erotic images like
https://litter.catbox.moe/cz15th.jpg
>>
>>101642306
This could almost be an anime shot, or it could be
>>
File: ComfyUI_02523_.png (1.72 MB, 960x1270)
1.72 MB
1.72 MB PNG
local models?
>>
>>101642580
Are you lost?
>>
File: 1698868454484510.jpg (198 KB, 1024x1024)
198 KB
198 KB JPG
>>101642571
asuka if she was trans
>>
>>101642326
>>101642469
What do ms paint gens look like? like this >>101636906
>>
File: 1716842066282114.jpg (196 KB, 1792x1024)
196 KB
196 KB JPG
>>101642571
This is from an actual anime btw (totally not an unedited dalle3 gen, including text)
>>
File: 1444332745884.png (60 KB, 456x570)
60 KB
60 KB PNG
>>101642591
>>101642601
>>
File: 1701360303739698.png (374 KB, 1024x1024)
374 KB
374 KB PNG
>>101642601
this is dalle with jb with prompt "ms paint drawing of an anime girl, extremely simple, microsoft paint"
>>
>>101642615
Post moar pls, looks really good
>>
>>101642659
that necktie is so cute, it made me smile
>>
>>101642685
Just wanted to share the exact prompt (JB not included, but you'll need one):
The image is a single frame from an anime show (anime screencap) showing an anime girl adorned with a clover symbol. She is pointing and laughing in a teasing manner towards the viewer with the accompanying subtitle text 'Holy, scrapelet!'. The background is neutral gradient colors.

Natural style, 1792x1024, HD quality.
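If you want to call it directly, a rough sketch with the openai python SDK against an Azure deployment (the endpoint, key and deployment name below are placeholders, not real ones):

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com",  # placeholder
    api_key="YOUR-KEY",  # placeholder
    api_version="2024-02-01",
)
result = client.images.generate(
    model="dalle3",  # your Azure deployment name, not the model id
    prompt="The image is a single frame from an anime show (anime screencap) ...",
    size="1792x1024",
    quality="hd",
    style="natural",
    n=1,
)
print(result.data[0].url)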
>>
>>101642586
azureshit and cloudshit is on topic now so I must be.
the amount of cope in this thread is astounding
writing a platitude about llms in your post to seem relevant is not the secret workaround you think it is retards, get fucked
>>
File: 1714039375650621.png (952 KB, 1792x1024)
952 KB
952 KB PNG
>>101642685
it doesn't always get the text sadly
>>
File: 1715900684000120.png (1.47 MB, 1792x1024)
1.47 MB
1.47 MB PNG
>>101636887
>>101642685
>>
>>101642638
have your throat slit pedo
>>
>>101642736
the fingers........ AIIIIEEEEEEEEEEEEEEEEE
>>
>>101642615
erm sorry chud the proportions of her face are slightly bad
ai has hit a wall
>>
>>101642697
No fun allowed!
>>
File: 1698114860793295.png (1.76 MB, 1792x1024)
1.76 MB
1.76 MB PNG
>>
>>101642638
That looks like a dwarf
>>
>>101642754
A website should let you pay to finetune on a bunch of miku songs to gen a miku song
>>
>>101642754
Post her armpits
>>
>>101642754
what model? sdxl?
>>
llama.cpp's RPC mode sucks... I don't want to send 40GB of weights over the network every time I load the model... With vLLM you just put the model at the same path on both machines...
>>
>>101642615
how much does an api call like this one cost?
>>
>>101642817
If you were paying for it yourself, 12 cents per image: normal quality is 4 cents, HD doubles that to 8, and the higher 1792x1024 res adds another 4. I also mass-gen, since not all tries are good, and nitpick the better results. I don't think it's viable to paypig dalle, but you can easily scrape azure endpoints off github.
>>
File: 1714126339002338.png (2 MB, 1792x1024)
2 MB
2 MB PNG
>>
>>101642827
wait so how good is free midjourney compared to this then?
>>
>>101642833
idk, i never use midjourney because I dislike discord
>>
>>101642831
miku hiii it's me lmg
>>
Samplers for mini-magnum, and reviews for it vs nemo instruct?
>>
>>101642827
>but you can easily scrape azure endpoints off github.
you overestimate me
>>
>>101642827
>but you can easily scrape azure endpoints off github.
Doesn't github immediately take those down and send a notification to the owner of the repository? Kind of like what they do when you accidentally leak your credentials
>>
>>101642886
No, they do this for OpenAI keys (which are just single strings), but not for Azure OpenAI endpoints (which come in two parts - the endpoint name and the key; you also need the deployment name, but that can be obtained from the API itself if you have the first two)
>>
File: 00041-404906826_1.png (1.79 MB, 1456x1024)
1.79 MB
1.79 MB PNG
>>101639548
>Biggest loser itt
Looks like someone's feeling left out
>>
>>101642285
3rd? more like 5th, lmao, local lost
>>
>>101642455
you need help
>>
>>101639548
>mfw I realize I can prompt people outside and wait for their response
>as low stakes as talking to a model
thanks anon I'm married now
>>
>>101642918
>MIKUK
the memes write themselves
>>
File: 1707720034696868.png (30 KB, 168x130)
30 KB
30 KB PNG
>>101642967
It reads more like NKUK for me
>>
>>101642736
kek this is really bad and really good at the same time
>>
File: 1715889883975479.png (1.55 MB, 1792x1024)
1.55 MB
1.55 MB PNG
it doesn't want to make miku when she's hugging anon aaaaaaaaaa
>>
I just got an rtx 3090. How smart of an investment was this?
>>
>>101642791
lol
lmao even
>>
File: 1710075201271825.png (1.23 MB, 1792x1024)
1.23 MB
1.23 MB PNG
Finally a single real miku, but she's a loli for some reason
>>
>>101642831
kino, looks like the psp game
>>
>>101643034
Looks exactly like MMD, anon.
>>
>>101642979
it's clearly INKUK
>>
>>101643032
cute and funny, the AI got its priorities straight.
>>
>>101641514
From what i've read the main problem is data, not enough labelled audio transcription
>>
>>101642831
The 2D stickers look out of place.
>>
File: 1707201855037480.png (2.05 MB, 1792x1024)
2.05 MB
2.05 MB PNG
>>101643045
yeah, dalle3 knows mmd
>>
>>101643015
>investment
financially? you're not gonna make the money back
enjoyment? it depends how much you enjoy it. (we understand. you're in good company here)
>>
>>101642849
>Samplers
I like a bit higher than the recommended temperature, 0.6-1, and add some min-p if you're getting crazy tokens.
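For anyone wondering, min-p just drops every token whose probability is below min_p times the top token's probability before sampling. A rough numpy sketch of the idea (not any backend's actual code):

import numpy as np

def min_p_filter(logits, min_p=0.05):
    # softmax the logits into probabilities
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    # zero out tokens below min_p * p(top token), renormalize the rest
    probs[probs < min_p * probs.max()] = 0.0
    return probs / probs.sum()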
>>
>>101643075
what is minp
>>
File: 1717082216698188.png (2 MB, 1792x1024)
2 MB
2 MB PNG
Something's wrong with this Miku...
>>
>>101643084
/lmg/ ove...r
>>
>>101643084
LMG is oveR after miku became real and got all anons laid
>>
>>101643089
>>101643089
>>101643089
>>
File: 1694294408509626.png (1.32 MB, 1792x1024)
1.32 MB
1.32 MB PNG
NVGIDIA RXC RTX
>>
>>101643112
genius fan design
>>
File: 0fa.jpg (1.05 MB, 3264x2448)
1.05 MB
1.05 MB JPG
What current machine AI handles a solid conversational flow for RP?

Every RP i've tried so far has the same sterile robotic feel to it that I dislike (they waterboard you with questions that, while each one seems kinda normal, come at a frequency that instantly reminds you you're talking to a bot)

Really pissing me off. I've been trying to fine-tune Gemma 27B and it's super bad for this. Command R is a little better, Mistral Nemo is also pretty bad.

For reference, I have a 4090, so not running any x3 3090 setups for the actual nutty models
>>
>>101643127
>What current machine AI handles a solid conversational flow for RP?
none
I'm being serious, wait a year or two I guess
>>
>>101643127
>>101643239
I seriously think this could be fixed with a card-specific prompt or an author's note or something.
>>
>>101643374
And examples.
>>
>>101638197
so sure, a transformer big enough could.

but it's just not the right tool for the job, too inefficient. you can approximate mandelbrot, heck, have a perfect representation of mandelbrot with enough (infinite) layers, but that's not a practical solution.

the transformer architecture is just not fit for AGI. you could get there if you had 100 orders of magnitude more compute and data.

but it's a pointless endeavor due to architectural limitations. i do think we'll reach agi at some point, and i do think we even already have the compute necessary for it, but i think the transformer architecture alone is just too inefficient for that purpose.

if you consider the human mind as a function, sure, a universal function approximator can get close to the mapping of all possible i/o.

but it'd be a huge waste of time.

just like we don't use a ufa to practically approximate mandelbrot, it's not a practical tool for full blown agi, at least not alone.

i think your first focus should be making an artificial hippocampus.
>>
>>101643389
also fuck the reddit spacing, i don't do double newline, i do shift+newline but i didn't know it'd show up like that.

test(doing double newline)

test(doing shift newline)
test(doing just newline)
test
>>
>>101643127
>single GPU rig
unironically NGMI
jj tho.
I do have an idea for some datasets that might give models a more natural conversational flow. But it'll have to wait until fall, when I can just open my window and not turn my house into an oven while training; right now it's heat wave season.



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.