/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101628398 & >>101619436

►News
>(07/27) Llama 3.1 rope scaling merged: https://github.com/ggerganov/llama.cpp/pull/8676
>(07/26) Cyberagent releases Japanese fine-tune model: https://hf.co/cyberagent/Llama-3.1-70B-Japanese-Instruct-2407
>(07/25) BAAI & TeleAI release 1T parameter model: https://hf.co/CofeAI/Tele-FLM-1T
>(07/24) Mistral Large 2 123B released: https://hf.co/mistralai/Mistral-Large-Instruct-2407
>(07/23) Llama 3.1 officially released: https://ai.meta.com/blog/meta-llama-3-1/
>(07/22) llamanon leaks 405B base model: https://files.catbox.moe/d88djr.torrent >>101516633

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
►Recent Highlights from the Previous Thread: >>101628398

--Paper: VALL-E 2 paper criticized for lack of advancements: >>101628420 >>101630671
--Paper: Mixture of Nested Experts and Meta SAM2 for efficient image processing: >>101632425 >>101632458
--Paper: C3A: a new fine-tuning method using circular convolution: >>101632131 >>101632167
--Llamafile has better CPU inference performance on some CPUs due to ikawrakow's optimizations: >>101633772 >>101633958 >>101634038 >>101634367
--Llama.cpp performance discussion, including context size, attention mechanism, and GPU offloading: >>101628597 >>101628675 >>101632864 >>101633101 >>101633596 >>101633704 >>101634712 >>101634751 >>101634842 >>101634908 >>101635029 >>101634347 >>101628673
--Gguf vs exl2 formats and their differences: >>101630375 >>101630457 >>101630483 >>101630485
--Character card formatting suggestions for vramlet use: >>101633734 >>101634595 >>101634610 >>101634647 >>101634700 >>101634670 >>101634704
--CR+ excels at RPing and writing style, but may struggle with complex scenarios: >>101628972 >>101629174 >>101629086 >>101629222 >>101629261 >>101629386
--Anon troubleshoots Mistral Nemo issues with sampler preset and token output limit: >>101630731 >>101630824 >>101631077 >>101631448 >>101631972 >>101631099 >>101631367
--Anon releases Command R/R+ basic presets v1.3 for SillyTavern: >>101634180
--Anons discuss the golden age of open source and its future: >>101633831 >>101634061 >>101634421 >>101634435 >>101634461 >>101634529 >>101634686 >>101635282 >>101635358 >>101634476
--WInfo-before and WInfo-after still have use cases in 2024: >>101630520 >>101633814 >>101633901
--Anons discuss non-roleplay uses of LLMs, including translation, game development, and custom assistants: >>101629172 >>101629273 >>101629428 >>101629458 >>101629508 >>101629830
--Miku (free space): >>101628819 >>101629323 >>101630431 >>101630651 >>101630714 >>101636814

►Recent Highlight Posts from the Previous Thread: >>101628405
>virtamate
>https://hub.virtamate.com/resources/categories/looks.7/
God. They look like fucking ghouls. Imagine the people who are honestly making and playing with this shit.
cuda dev, oh cuda dev. why does quantized KV cache not work with RPC?
just insta-fails on this assert:
https://github.com/ggerganov/llama.cpp/blob/140074bb8647df41840d6f32f4409fa8959bcf9f/ggml/src/ggml-rpc.cpp#L390
>TODO: this check is due to MATRIX_ROW_PADDING in CUDA and should be generalized
teto's tata's...
COHERE, RELEASE A BANGER MODEL IN 30 TO 70B RANGE AND MY LIFE IS YOURS
>>101637073This week.
>>101637073t.
>>101636935
Reminds me of Illusion's Honey Select 2 - uncanny detailed 3D models with shitty animations and physics.
They're releasing something called サマバケ!すくらんぶる ("Summer Vacation! Scramble") soon - supposed to be like AA2, which strategy-wise was actually pretty good. I might, for once, actually buy it if it doesn't suck.
>>101637007
I'm not familiar with the RPC backend but this check is much stricter than necessary.
Only the last row needs to be padded to a multiple of 512 (to avoid out-of-bounds memory accesses).
For all other rows no padding is needed because the activations are zero-padded to a multiple of 512, so the resulting vector dot products for the padding are equal to zero (unless there are NaNs or infs in the KV cache).
If you don't use CUDA or if you're using --flash-attention I think it would be safe to remove the check.
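The padding constraint described above can be sketched numerically. `MATRIX_ROW_PADDING = 512` matches the constant the TODO refers to, but the strict/relaxed checks here are illustrative arithmetic, not the actual ggml-rpc.cpp code:

```python
MATRIX_ROW_PADDING = 512  # CUDA pads quantized rows to this multiple

def pad_to_multiple(n: int, multiple: int = MATRIX_ROW_PADDING) -> int:
    """Round n up to the next multiple."""
    return ((n + multiple - 1) // multiple) * multiple

def strict_ok(row_elems: int) -> bool:
    """What the RPC assert effectively enforces: every row is padded."""
    return row_elems % MATRIX_ROW_PADDING == 0

def buffer_elems(rows: int, row_elems: int) -> int:
    """Relaxed view: only the *last* row's allocation needs padding,
    so a (rows x row_elems) tensor needs this many elements."""
    return (rows - 1) * row_elems + pad_to_multiple(row_elems)
```

So a tensor whose rows aren't a multiple of 512 fails the strict check even though only one padded tail row would be needed to avoid the out-of-bounds reads.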
>>101634712 >>101634751
>the distribution of experts in Mixtral and Qwen2-57B-A14 is very imbalanced; thus, it would be beneficial to store only the most frequently used experts on the GPU
this was discussed basically the moment mixtral 8x7 dropped back in the day
isn't the problem with this (and the reason why it wasn't implemented) that MoE models pick their X active experts per layer, not per token, so across all the layers you end up reading most of the model per token anyway, just not all at the same time?
>>101637073if they drop something I feel like it's more likely to be too big for local
Tuesday Theme
https://www.youtube.com/watch?v=sqK-jh4TDXo
>>101637264I like this Teto
I had a doctor tell me yesterday that she was using an AI tool to record and summarize the conversation we were having. I'm assuming that she probably wasn't running it locally on her phone, right? Wouldn't that mean that she's sending patients' information to some server somewhere that may or may not be secure? Do hospitals even run models themselves or are they all using ChatGPT shit?
CPU maxxers, have any of you tried running that insanely huge MoE google released a year ago or so?
>>101637263yeah, it'll be like command r large 150b or some shit
Is magnum 32b significantly better than mini-magnum?
>>101637309>Wouldn't that mean that she's sending patients information to some server somewhere that may or may not be secure?You really think that hospitals were secure before the AI hype? LOL
>>101637309
>may or may not be secure
Security is not that binary.
You would hope healthcare staff are aware of their obligations re patient data and that it's going through some official system, likely eventually to enterprise ChatGPT ("pinky promise we won't read your data"), not just a doc trying to save a few minutes with the mobile app.
>>101637309>Do hospitals even run models themselves or are they all using chat gpt shit?That's super-creepy. I imagine it's an Azure or AWS offering. They both sell transcription services.
Anyone know of a project that can monitor an OpenAI-compatible API? Similar to what vLLM has with Prometheus, for monitoring throughput, requests, etc.
>>101636935There is no quality control so most of it is weg nightmarefuel. But when you look hard you can find some really good stuff. Cuddlemocap is good for scenes. As for looks the best looks come from people who just rip models out of real games. And a few people who really know what they are doing. Pic related is my waifu that makes me coom buckets.
>>101637457Prometheus isn't vLLM specific, you can configure it for anything.
>>101637496No offense but that still looks uncanny and just not very good, even if it's better than the average model on there.
>>101637198
My expectation is that an optimally trained MoE model would utilize all experts evenly, so there would be no benefit to shuffling around which experts get offloaded.
And even if there is an imbalance for specific models, I'm not sure that imbalance would be consistent for different inputs.
>isnt the problem with this and the reason why it wasnt implemented the fact that for each token you have to use all experts anyway since the MoE models use only X experts per layer (or something similar) not per token, meaning that you will be reading the entire model per token anyway just not all at the same time?
For prompt processing it's basically guaranteed that you will have to evaluate all experts anyway.
For token generation you would potentially be able to evaluate the same number of experts in total but a larger fraction of them on the GPU, but only if the experts are utilized unevenly.
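Whether such an imbalance exists can at least be measured cheaply from the router's top-k choices. A minimal sketch with toy numbers (not measured from any real model):

```python
from collections import Counter

def expert_usage(routings, n_experts):
    """routings: one tuple of top-k expert indices per token (per layer).
    Returns the fraction of activations that went to each expert."""
    counts = Counter(e for token_topk in routings for e in token_topk)
    total = sum(counts.values())
    return [counts.get(e, 0) / total for e in range(n_experts)]

# Toy example: 8 experts, top-2 routing, 4 tokens.
# Expert 0 is picked every time, experts 4-7 never.
routings = [(0, 1), (0, 2), (0, 1), (3, 0)]
usage = expert_usage(routings, n_experts=8)
```

If `usage` came out roughly uniform across varied inputs, pinning "hot" experts to the GPU would buy nothing, which is the point made above.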
Can someone explain how exactly speculative token decoding works? Wouldn't the big model have to verify that the token is "correct" anyway, thereby doing the same computation that it would've done otherwise?
>>101637309In the EU at least I think sending patients' medical data to OpenAI would be straight up illegal.
>>101637540
Yeah, but vLLM has an integrated endpoint natively. Does llama.cpp or tabby offer something similar?
If they don't, I guess the best way would be to put something acting as a proxy in front of the API endpoint to measure the statistics.
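The proxy idea reduces to recording per-request timings and token counts and aggregating them. A minimal sketch; the record fields here are made up for illustration, in practice you'd fill them from the `usage` object that OpenAI-compatible responses return:

```python
from dataclasses import dataclass

@dataclass
class RequestRecord:
    start: float           # e.g. time.monotonic() when the request was sent
    end: float             # when the final token arrived
    completion_tokens: int # from the response's "usage" field

def throughput_stats(records):
    """Aggregate tokens/sec and requests/sec over the observed window."""
    if not records:
        return {"tokens_per_s": 0.0, "requests_per_s": 0.0}
    window = max(r.end for r in records) - min(r.start for r in records)
    window = max(window, 1e-9)  # guard against a zero-length window
    total_tokens = sum(r.completion_tokens for r in records)
    return {
        "tokens_per_s": total_tokens / window,
        "requests_per_s": len(records) / window,
    }

# Two overlapping requests over a 3-second window, 180 tokens total.
records = [RequestRecord(0.0, 2.0, 100), RequestRecord(1.0, 3.0, 80)]
stats = throughput_stats(records)
```

A real proxy would expose these numbers on a /metrics endpoint so Prometheus can scrape them, same as vLLM does.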
>>101637622What about Azure OpenAI endpoints hosted in Europe itself?
has anybody else noticed how unrealistic the rape pov cards are compared to real life? IRL most girls stop moving and just freeze after like 5 minutes, at least in my experience, but all the cards I try here always do something like picrel which makes me laugh so hard i come instantly.I don't get ERP, why not just have real sex.
>>101637597
>For token generation you would potentially be able to evaluate the same number of experts in total but a larger fraction of experts on the GPU, but only if the experts are utilized unevenly.
given that speed-increase-per-%-of-model-offloaded-to-GPU curve, wouldn't that also apply here, requiring basically 90%+ of the tokens to be generated by the very, very few experts that are offloaded to the GPU for this speedup to be possible? meaning unless we change the arch of the models by a large amount into something completely new, this is a nothingburger
>>101637653This looks like a model issue
>>101637626Oh in that case I have no idea
>>101637653>at least in my experience
>>101637618
You have to do the same number of computations but you can do them with a higher arithmetic intensity, meaning the amount of computation you can do per unit of data loaded is higher.
And because token generation is I/O bound, that translates to higher performance as long as your predictions are correct.
Another way to think about it is that the total time needed to evaluate n tokens scales less than linearly: I think evaluating two tokens takes ~10% longer, evaluating 64 tokens takes ~2x longer (with llama.cpp).
>>101637655
>given that speed increase curve per % of model offloaded to gpu graph, wouldn't that also apply here, requiring basically 90%+ of the tokens to be generated by the very, very few number of experts that are offloaded to the gpu for this speedup to be possible
Yes.
Good morning jeets, your toy industry is dying.
>>101637711>your toy industryWho cares? I have my local models, and they'll always be there even if OpenAI and Anthropic die.
>>101637618
The key to understanding this is that token generation currently does not use the full GPU. That means we can do two (or more) token-generation jobs/requests at a time. Therefore, if we have a likely guess at what the next token is (thanks to a small model or something else), we can verify that the guessed token is correct at the same time that we generate the token after it. If the verification matches, we keep both tokens. If it doesn't, we keep the true token and throw out the "after that" token.
>>101637618
The "shortcut" is that the bigger model can evaluate several cheaply generated candidate tokens in one forward pass. https://huggingface.co/blog/assisted-generation
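The draft-then-verify loop described above can be sketched with toy stand-in models. Greedy decoding only; `draft_model` and `target_model` are hypothetical functions, not a real API, and a real implementation scores the whole draft in a single batched forward pass rather than per position:

```python
def speculative_step(prefix, draft_model, target_model, n_draft):
    """One round of greedy speculative decoding.

    draft_model / target_model: fn(token_list) -> next token (greedy).
    Returns the tokens accepted this round (always at least one).
    """
    # 1. Cheaply draft n_draft tokens with the small model.
    draft = []
    ctx = list(prefix)
    for _ in range(n_draft):
        t = draft_model(ctx)
        draft.append(t)
        ctx.append(t)

    # 2. Verify: the target model checks each drafted position.
    accepted = []
    ctx = list(prefix)
    for t in draft:
        true_t = target_model(ctx)
        if true_t == t:          # draft agrees: keep it, move on
            accepted.append(t)
            ctx.append(t)
        else:                    # mismatch: keep the target's token, stop
            accepted.append(true_t)
            return accepted
    # 3. All drafts accepted: the target's pass also yields one bonus token.
    accepted.append(target_model(ctx))
    return accepted

# Toy models: the "true" next token is just the current sequence length.
target = lambda ctx: len(ctx)
bad_draft = lambda ctx: len(ctx) if len(ctx) < 2 else -1  # wrong from position 2
```

Note that even a totally wrong draft still yields one correct token per round, so a bad draft model never corrupts the output, it only wastes time.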
>>101636887Upgrading from Windows XP with Tetohttps://www.youtube.com/watch?v=TFAe0BYP2Xc
>>101637699
speaking of speculative decoding, what's the biggest bottleneck to having it implemented? it doesn't seem nearly as complex as a lot of other features talked about
with one OK implementation most if not all of the code can be reused for most models; it should be a drop-in for any big/small model pair with the exact same vocab
and it would speed up everything by a solid double-digit %; specific workloads with a lot of copying of previous tokens, like AI explaining basic program functions, can be sped up many times over
>>101637166
huh, thanks. it does work with -fa on if I remove the check, at least with llama3 8B
the rpc-server doesn't have many options so not sure if -fa is actually active or if I'll get random NaNs once I load in mistral large kek
>>101637785
>speaking of speculative decoding, whats the biggest bottleneck to having it implemented?
llama.cpp supports it though
>>101637653>at least in my experiencelevel issue, stalk for 10 more years before raping again, newfag
>>101637653Rape is not fun if you only rape doormats.
>>101637785
>speaking of speculative decoding, whats the biggest bottleneck to having it implemented?
Actually getting benefit from it.
Getting good predictions for the next token that are sufficiently cheap is not easy.
The predictions need to be good enough to offset the cost of creating them, which is not a given.
This includes indirect costs related to the large-model eval taking slightly longer for multiple tokens than for a single token.
>>101637843Kinda missed you ngl.
>>101637785 >>101637855
I forgot: speculative decoding also becomes harder with larger vocabulary sizes since there are fewer token sequences with a single, clear continuation.
For example, "supernova" was tokenized as "super", "n", "ova" for LLaMA 2 but it has its own token for LLaMA 3.
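A toy illustration of the vocabulary effect, using greedy longest-match over two hand-made vocabularies (real models use BPE merge rules, so this is only a sketch):

```python
def greedy_tokenize(text, vocab):
    """Greedy longest-match tokenization (toy; real models use BPE)."""
    tokens = []
    i = 0
    while i < len(text):
        # take the longest vocab entry that matches at position i
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])  # fall back to a single character
            i += 1
    return tokens

small_vocab = {"super", "n", "ova"}        # LLaMA-2-like split
large_vocab = small_vocab | {"supernova"}  # LLaMA-3-like: one token

t2 = greedy_tokenize("supernova", small_vocab)
t3 = greedy_tokenize("supernova", large_vocab)
```

With the small vocab, a draft model only needs three easy, highly constrained guesses ("super" is almost always followed by "n", "ova"); with the large vocab it must pick the one correct token out of a much bigger set in a single shot.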
>>101637843>>101637653yiku/motsuba vibing
>>101637653>at least in my experienceBASED! TAKE MY VRAM KING!
>>101637653>IRL most girls stop moving and just freeze after like 5 minutes, at least in my experience
>>101637855
perhaps there could be a very large gain in trying to utilize the unused fast memory (usually the GPU) by implementing the ability to place the big and small models in specific locations separately via command-line options
for example, for llama 3.1, the 8B goes into a basic 8GB-VRAM GPU and the 70B goes into RAM, letting the 8B model crunch all the time while contributing much more than if you were to just offload a part of the big model
you could also, for some added complexity, keep prompt processing on the GPU even in this case by unloading and loading the 8B as required, and dynamically (or manually with a CLI arg) setting the token count at which GPU prompt processing is used instead of CPU (increasing it from 32, which is the number now, since there is now an overhead to loading and unloading the 8B model)
>>101637909
yes this is true, but it's not a big problem since most big models have smaller counterparts with mostly or completely the same vocab; it's essentially free for anyone training the big model to release an 8B/13B smaller one, especially with today's distillation techniques
>>101638014Can you not?Here, go do your thing with a cartoon Migu, if you must: https://files.catbox.moe/2iygns.jfif
Does llama.cpp's RPC work like vLLM's multi-node pipeline parallelism?
>>101638004
This is already implemented in llama.cpp; you can set separate -ngl values for the draft and target model.
I personally think the better strategy is to try to reduce the latency increase from adding a few extra tokens.
I already have an implementation for n-gram lookup that produces drafts very cheaply using only CPU resources; the main problem right now is that CUDA graphs are only supported for a batch size of 1, so you need a certain minimum speedup to offset that.
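The n-gram lookup idea can be sketched like this (a from-scratch toy, not the actual llama.cpp implementation): find the most recent earlier occurrence of the trailing n-gram in the context and propose whatever followed it as the draft.

```python
def ngram_draft(context, n=2, n_draft=4):
    """Propose draft tokens by n-gram lookup in the existing context.

    context: list of token ids already generated.
    Returns up to n_draft proposed tokens, or [] if the trailing
    n-gram never occurred earlier in the context.
    """
    if len(context) < n + 1:
        return []
    tail = tuple(context[-n:])
    # scan backwards for the most recent earlier occurrence of `tail`
    for i in range(len(context) - n - 1, -1, -1):
        if tuple(context[i:i + n]) == tail:
            # propose the tokens that followed it last time
            return context[i + n:i + n + n_draft]
    return []

# Repetitive context: after (1, 2) came (3, 4) earlier, so draft those.
draft = ngram_draft([1, 2, 3, 4, 1, 2], n=2, n_draft=2)
```

This is why the technique shines on repetitive workloads (code edits, quoting earlier text) and struggles on fresh natural text, where the trailing n-gram usually has no useful earlier occurrence.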
>>101638055I want to throw that Migu into the air like a baseball. She would find it exciting and be laughing gleefully.
>>101638055>jfifAIEEEEEEEEEEE MUSTARD GAS
>>101637653erp is for people with fetishes that arent irl actionable or waifuniggers who only want to have real sex with fictional characters
>>101637653prompt issue
so it begins...
https://arxiv.org/abs/2407.19594
>>101638070
doesn't n-gram decoding only work if the input and output sequences are very similar? or is that the case for look-ahead decoding and not the lookup one?
>>101638131I am specifically trying to get an implementation that works for the generation of natural text without a large context from which to draw token sequences.
>>101638083Sowwy! That's what bing/dall-e spits out.I'd love to gen stuff with my big rig at home, but last month's electric bill was over $400, so... yeah... Imma let bing handle it for a bit.I'll be on a time-of-service plan soon, so gens and training will be shifted to "cheap hours" where it's $0.07/kWh
>>101637772
Going back to Windows XP with Teto
https://www.youtube.com/watch?v=neuCtK96Dww
I just want to lie relaxed on the sofa and listen to a certain voice reading out the latest papers.
nothing available allows me to do that in high quality and real time
fuck everyone, now I have to work in the field myself because no faggot dares to publish something decent
>>101638118meh, we'll never get to agi with llm alone / the transformer architecture.
>>101638161>Sowwy! That's what bing/dall-e spits out.It actually doesn't, it's because you're using macOS/iOS or some shit so your browser downloads JFIF. dalle/bing spits out png.
>>101638070Is there a way to force llama.cpp to keep X layers on an SSD? How hard would it be to implement and where to start in the codebase?
>>101638033
if you want to achieve a good speedup your draft should be like 30x faster than the target model, meaning something in the region of 0.5B-1B. those models aren't particularly smart afaik, unless you do just code or text formatting or something very homogeneous and predictable.
>>101638167
>the transformer architecture
any reasonably sized universal function approximator should be capable of getting to superintelligence; i don't see a reason why transformers wouldn't. it's just that other archs that allow easier infinite context would probably do a better job since you don't need perfect recall
>>101636887are there any fully uncensored versions of llama 3.1 yet? all the "uncensored" ones so far are still very censored
>>101638197>probably do a better job*faster job
>>101638194
>Compare this to ~4 cents per image for DALLE-3
Only with standard quality, which is shit; HD costs twice as much.
>SD3 on API
Large costs 6.5 cents per gen, Ultra (which is Large with some extra pipelines) costs 8 cents per gen, Medium (which is shit) costs 3.5
>>101638194>beauty must be shared
>>101638213literally just say what you want in the sys prompt or use a card that isnt 100 slop tokens
What's the best nemo tune for RP/storytelling? Lumimaid?
I can't stand the writing style of the vanilla instruct version.
>>101638194>that also means I have stole over 100 USD worth of compute from the chinks and spent it all on beautiful little girls lets fucking goooI've stolen hundreds of thousands of dollars total in AWS + OpenAI credits.
>Meta's Mark Zuckerberg chews AI chud
>>101638181
>Is there a way to force llama.cpp to keep X layers on an SSD?
No.
>How hard would it be to implement and where to start in the codebase?
You may be able to do it relatively easily with the opposite approach, by modifying --mlock in such a way that part of the model is forced to be kept in RAM and the rest is swapped in/out.
But it may be difficult to get something like this merged since I would expect the performance to be pretty terrible.
Hey /g/ senpai, quick question for the data engineers out there that I was pondering.
If you had to insert an image into an AI-generated background from a prompt for an LLM, how would you do it? Specifically, how would you ensure the image fits the perspective of the generated background? Is there any model that does it natively?
>>101638254>cudcannibalistic non human underground dweller? C.L.U.DCANNIBALISTIC LIZARD UNDERGROUND DWELLER
>>101638247
I might have been doing something wrong, but Lumimaid was dumb as a sack; might as well go back to llama3 8b.
In my opinion, it's either nemo-instruct or mini-magnum.
The first is more technical, the second is better at text fucking.
>>101638166
>I just want to lie relaxed on the sofa and listen to a certain voice reading out the latest papers.
There are tutorials on YouTube for making an RTVC model. I followed one, using voice acting ripped from a Japanese h-game, and it worked perfectly. AFAIK RTVC is doing speech-to-text then text-to-speech; it can't be that far off plain TTS, right?
I personally want a sassy, bratty catgirl-type voice for TTS. I'll see what I can do. As for RTVC, I was going to mostly use it for live vocaloid-type stuff, but the delay makes it unworkable, so I went with a Zoom V3 instead. With some practice, the "child" preset gets you something reasonable, though it's on you to "animate" your voice, meaning making it sound "pouty" or "bratty" or whatever.
>>101638262
>But it may be difficult to get something like this merged since I would expect the performance to be pretty terrible.
Yeah, it's a pretty niche use case; seems better to just wait for some actually huge and actually good model to create pressure for things to be optimized a bit more around huge models.
I would like the ability to run future huge models overnight for non-time-critical things off of PCIe gen 5 SSDs while my RAM is used for other things.
>>101638171>It actually doesn't, it's because you're using macOS/iOSI'm on Windows 10. Is there some kinda setting in my bing account for it?
>>101638264ask chatgpt
>>101638308>if the CCP wasnt bankrolling Kwai there would be no feasible way to monetize thisAnon, that's not how it works, startups often lose money in the first stages. Look at websim for example - they've been providing FREE 3.5 sonnet and opus (!!) generations for literal months. They only started doing ratelimits recently.
>>101638194>beauty must be sharedAnimate the Migu
>>101638247Nemo or maybe mini-magnum. Be aware that the other person that responded to you is a shill, though. Finetuners like to talk shit about each other to sell their stuff. Vanilla nemo was already good at smut so what he said about "technical" is just a lie.
>>101638314>my bing account
>>101637711I just had a thought that the whole financial sector and top management of basically every company out there could be replaced by AI. I really wish it would happen during my lifetime. It is not like AI can be more evil than people in those positions.
>>101637653nay, irl they just keep crying. go back poser
>>101638354>>my bing accountAm I supposed to care?
>>101638314Hmm, then I honestly don't know. But what I know is that I have lots of azure dalle endpoints, do you wanna get lots of mikus and whatnot? You could even use jailbreaks to force your own prompts, and change to natural style.
>>101638342>Be aware that the other person that responded to you is a shill>What's the best nemo tune for RP/storytelling? Lumimaid?How charitable of you to assume that question wasn't shilling.
TWOMOREMINUTES
>>101637166
unrelated to ^
https://www.marktechpost.com/2024/07/26/flute-a-cuda-kernel-designed-for-fused-quantized-matrix-multiplications-to-accelerate-llm-inference/
a comment on this reposted on preddit: "Doesn't turboterp use a bunch of these tricks in exllama already?"
could be interesting
>>101638328
ChatGPT gives mid-to-bad answers most of the time; it's mostly good for refining your own ideas or structuring them.
I don't trust most of its answers right away, especially when you know a bit of the domain in question.
>>101638407I believe you but if you lied then your mom will die of cancer tomorrow.
>>101638014What are these glasses called?
fuck me two of the most annoying posters return to lmg on the same day>>101638055you're both avatarfags that gen on cloud and cope about it, if anything you should be best friends
>>101638428Hey I never said anything would be happening.
>>101637627still illegal as it's a third party.
>>101638445It's not illegal, retard.
>>101638444Well sucks to be your mom then...
>>101638407Two more minutes until what?! I want a smarter Local Miku and I want it now. I also want free A6000s from nvidia that they give out to enthusiasts who ask nicely.
>>101638391What does jailbreak get you? It doesn't give you explicit nudes, right? I can gen stuff at home, I'm just trying to figure out if it's my AI stuff running up the bill or just the AC, or both.
>>101638407TWONVIDIADATACENTERS (MORE)
>>101638462>What does jailbreak get you?API DALLE rewrites prompts by default, with the jailbreak you can force it to use your prompt as is. Also API DALLE lets you use HD quality and natural style, while bing IIRC is always forced to standard + vivid
>>101638456>free A6000s from nvidia that they give out to enthusiasts who ask nicely.I'm sure they can, but I doubt they do. But yeah, you wanna give me a $7K GPU, I'll be glad to promote it.
>>101638485>API DALLE rewrites prompts by default+ ((black | asian | brown) woman:1.5)negative embedding: white malei reverse engineered the dall-e 3 secret prompt, your welcome
>>101638515no, anon, that's not the secret prompt, it's much more extensive but yes it will diversity characters unless you explicitly say their ethnicity or just use a JB
>>101638485Ah OK. Makes sense. That's cool but I'll pass. I've probably got enough Migus to train an SDXL lora, if not an actual base model. Stylegan2 ADA used to need thousands of carefully-selected images, maybe SDXL isn't as demanding? Last time I tried with Stylegan2 with about 500 properly sized and cropped images, all with more or less the same pose, I got a bunch of abominations and the model never converged during training, it was a big waste of electricity.
>>101638521pedo
>>101638544HMM, actually thanks for the great idea, I should try fine-tuning SDXL on some DALL-E 3 gens, I can also caption them with 3.5 Sonnet/GPT-4o (i have plenty of the latter) and then actually fine-tune the model myself or via replicate (I have a few scrapped keys with billing)
>>101638374it really depends on how you treat them.
>>101638420
>FP16 operations
>batch size <= 32
Don't care, those kernels are the easy ones with a low ceiling for optimization.
I'm much more interested in kernels using int8 for batch sizes >= 512 since those could in theory become twice as fast as FP16 cuBLAS, and the potential benefit would be faster and more memory-efficient training.
>>101638595show some cr+ cunny logs
>>101638557Is there a way to take the filename and feed it to the Azure API to get the prompt back? It sure looks like the name is a unique hash. That would be super-useful. Otherwise I have to run it through a booru model, and I'd rather not use booru, I'd prefer more natural prompting.
>>101638608Of course not, that's all private. You can't get the original prompt from the hash.
>>101638565approve my PR
>>101638718>we're not on /aicg/are you retarded? cr+ is a local model, and we share logs here. So post logs or you're larping, and don't actually have any cr+ logs. I'll accept catbox links too.
https://civitai.com/models/323639/ipivs-sdxl-lightning-text2img2vid-sd15-animatediff-lcm
kl*ng but actually relevant since it's fucking local and free.
Previews every step of the way and upscaling to 1080p + interpolation.
Fuck paying for any of this shit. Fuck using cloud.
Why don't they update the models? Or at least pull the plug and stop the money drain.
>>101638739>we share logs hereno, we don't?if anything this shows YOU are a newfag larper
>>101638766any solid controlnet to for example only img2video the background of the character?
>>101638816>no, we don't?we do
>>101638829>we dowe don't
>>101638829Nta but I only share Nala logs
>>101638829nta but I only share watermelon logs
>>101638766Pretty cool for static backgrounds. I could imagine someone using this to make VN-type games more cool.
>>101638858can we just get this nigger banned? this isn't aicg
>>101638876we are a aicg offspring though
>>101638876im literally talking about local models in that very message. your post was more off topic than mine>SMHH
>>101638894>im literally talking about local models in that very messageno you're not, you're pretending that you are, but in the end all you do is post pedo videos generated by a proprietary model
>>101638894and why are you deleting your own posts if it's ontopic, as you say? doesn't compute
>>101638766Fuck yeah anon.Thank you for the link, will play with it later.>>101638845My hero.
Constant AI in front of you, indistinguishable from a dream
>>101638901He's not deleting his own posts, that's a janny responding to reports.Then he continues spamming until a mod actually gets around to banning him.And then he evades the ban.
>>101638934>He's not deleting his own posts, that's a janny responding to reports.What rule should I report his future post sunder, if I may ask?
Any way to use gemma 2 at 8k context? I've heard the sliding window attention or whatever it's called isn't implemented in llama.cpp
>>101638939This post breaks the United States laws
>>101638897the prompts for those pedo videos are generated with a local model and I discuss which models are better for that use case. picrel is command r plus on hf chat for this purposethis is the part where you go "durr but youre not running it locally!!!1!" to cope with the fact that my local model usage and discussion is on topic for these threads>>101638901they're not and have never been removed for being off-topic>>101638955>This post breaks the United States lawsbut it doesn't
>>101638953It's been implemented for a while now.It might be a hack instead of proper SWA, I can't remember, but regardless, 8k context should work.
>>101638939I report it as "loli outside of /b/" or whatever it's called.Honestly I think the mods would ban him even on /b/ though.I don't know the intricacies of US law when it comes to synthetic CP but the mods would probably rather be safe and just ban him.
>>101638968your usage is not on topic in the slightest, you know that, I know that, everyone knows that. You're just a sad lonely fat virgin sitting in your basement with your unhealthy pedo fantasies, and you don't have anyone to talk about them so you gen those videos and share them here to try to get other anons to react.
>>101638968pedophilia is a crime
>>101638953Someone posted RULER benchmarks for it and it did pretty well at 8k even without true SWA. Unfortunately during actual use, its ability to recall early context in natural conversation when you get to 5-8k degrades. So RULER may not be a perfect benchmark for this.
>>101638992is it? I think the crime is putting your thoughts into action, or storing csam. just being a disgusting pedo isn't a crime.
>>101638992>pedophilia is a crimethis is thoughtcrime anon. if we could detect murderous thoughts, should we put everyone with murderous thoughts in jail on the off chance they actually follow through and commit murder?>>101638987an ad hominem does not refute the fact that i am using locally available models for my productivity and workflow
>>101639027>an ad hominemIt's not ad hominem, you're not genning those videos with local models, and those videos break /g/ rules anyhow.
I'm convinced that everyone saying how good LLMs are, have never hold a conversation with an actual human being. It's all robotic cringe from the smallest <7B models to Claude Opus. And don't even try to skill issue me you son of a bitches, I've read your logs. All cringe.
>>101639033
me being a fat lazy virgin has nothing to do with the on-topicness of my content in the threads, so it is an ad hominem
>you're not genning those videos with local models
i am genning the prompts with local models and discussing which ones are the best for my unique use case, which is a valuable addition to the thread. id agree with you more if i wasn't sharing the prompts
>those videos break /g/ rules anyhow
irrelevant to whether they are on topic or not, and an appeal to authority even if it was
>>101639027
Why yes! I do think people who constantly post gore and say things like "I constantly have dreams where I'm murdering people" should be in jail or in a mental institution.
>>101639054
>and an appeal to authority even if it was
Then why are you on 4chan?
>>101639050
you should try c.ai, it's the only model that holds a conversation with any semblance of humanness
>>101639061
>I constantly have dreams where I'm murdering people
i didn't say this, but it doesn't matter even if i did. read The Minority Report by Philip K. Dick (it's 10 pages long) if you'd like to understand why this attitude towards precrime and thoughtcrime results in an abusive and authoritarian society
>>101639073
>Then why are you on 4chan?
you lost me with this one, sorry anon
>>101639195
pedos like you should get the rope
>>101639201
do you really want to live in a society where you kill someone because they AI-generated a little girl eating a popsicle
>>101639223
Yes.
>>101639226
Based
>>101639223
I want to live in a society where the mods finally get fed up with the petra/pedo spammer and drop a range ban.
>>101639226
based
but the authoritarianism necessary for that would also end up in a society with a Stasi-style secret police that puts you in jail for no reason to meet arrest quotas, way before we get to that level, so let's be serious and not edgy, anon
>>101639245
sadly it won't help, the schizos are taking over 4chan as normal anons leave it. residential proxies don't cost that much
>>101636935
The difference between good and bad models is absolutely insane, especially in VR. Some of them are uber god-tier coom extractors, but most are utter garbage. It actually feels more binary than a spectrum. M4RIO's models are fucking amazing.
>the absolute state of literature
>>101637653
>>101639294
Is that from Re:Zero?
>>101637653
miku highlights guy, if you don't include this post and chain in the highlights, I WILL find you.
>>101639258
how much do you get for hosting a residential proxy? I wonder if it's worth the risk
>>101639361
hosting? nothing, the ones who sell them mostly get them from botnets and hacked routers/phones
>>101637653
Thanks for reminding me where I was.
>>101639316
No, it's Durarara.
>>101639370
Oh. Makes sense kek
>>101638718
Large 2 is great for cunny. Some of the shit it says really melts your heart, then gets your Johnson going. It's probably the most realistic experience so far.
>your lips curl into a smile...
>a smile twists across your face...
>a sly grin spreads across your face...
>her words dripping with malice...
>each word tinged with a hint of malice...
FUCK YOU LARGESTRAL I'M SICK OF HEARING ABOUT SMILES AND MALICE REEEEEEEE
>>101639479
A year ago, people were reeee'ing from the other side of the spectrum. These are great times.
>>101639479
It's also doing shivers, we're never escaping the fucking shivers. Someone has to get to the fucking bottom of this shit, there is simply no way that sentence is naturally so ubiquitous.
You are talking to a machine. It has no awareness. It has no personality. You are alone in your room running your GPU at full speed trying to simulate friendship. You are degrading your social skills and living in a fantasy world. Go outside.
>>101639548
>You are talking to a machine. It has no awareness. It has no personality. You are alone in your room running your GPU at full speed trying to simulate friendship. You are degrading your social skills and living in a fantasy world.
Yes, and it's great in here
>>101639548
>simulate friendship
>implying
what if i like to simulate scenarios that could never happen, and text adventure games?
>>101639479
>({{char}} is kind-hearted and friendly.)
>She smiles wickedly, her dark grin, her devious scowl
Stheno...
>>101639548
>simulate friendship
i'm simulating sex, actually
>You are degrading your social skills
there were never any
>and living in a fantasy world
many of them, actually
>Go outside
there's nothing for me out there
>>101639548
But I don't want to go outside! Kids throw rocks at me when I do that...
>>101639536
Try adding an author's note at depth zero: "Emulate the writing style of XY", XY being some famous fiction writer. That can tone down some of the slop and change the prose enough that the model feels completely different. Might not work for non-book-style RP, though.
>>101639624
Mistral Large seems to trend towards book-style formatting on its own, so it might work pretty well. Good tip, I'll test it out.
>>101639479
based malicious girl enjoyer
>>101639080
Clearly, you are high, or wearing rose-tinted glasses. c.ai was a fucked-up mess most of the time. The oldest screenshot I can find (not this one) was from 11/2022, and even then you had to use shit like the POV trick to kick the bot into replying when it got filtered.
>>101639927
>Clearly, you are high, or wearing rose-tinted glasses.
Never underestimate the lack of mental capacity of some anons. They can and will do both at some point, and then add something stupid on top, just to move the status quo of their own idiocy.
>>101637855
WTF???
>>101639927
c.ai was only a fucked-up mess if you were trying to fight the filter, or if the conversation was too long.
Yes, it enters repetition loops. Yes, it has a lot of -isms. But it was, and still is, the best model for natural conversation.
>>101639971
What am I looking at here, and did you perhaps mean to quote another post?
I'm really impressed with character.ai. Is there anything remotely close to it that can run locally, or am I out of luck if I want an experience like that?
>>101639624
Shivers still abound, but it's interesting how changing just one name affects the prose. In picrel I clicked "regenerate" for the greeting message, keeping the seed fixed to 0, after changing the name in "Emulate the writing style of [author]" in an author's note at zero depth. The model was Gemma-2-27B.
>>101640281
NeMo/Largestral
>>101640281
>I'm really impressed with character.ai
Did you go to sleep in 2022?
>>101640417
There was a benchmark in the last thread that showed Large IQ1_M scoring significantly higher than Nemo, is that even possible?
>>101640429
I just started using it like a month ago. Maybe I just don't have standards. I downloaded something called backyard.ai and I'm using some model called "Chaifighter v2 20B", but I'm not too impressed. I don't feel like it's acting like my character, it gives me short responses, and overall it just feels off.
Good open source music generation when?
>>101640034
>But it was and still is
I'm pretty sure they are no longer running their original, rather large LaMDA model. They went through a phase when they were getting slammed where they seemed to be using dynamic model sizing, and during periods of high load it was like you were talking to a 3-6B model. Now it just sounds like a lame LLaMA 13B.
>>101640309
jesus christ, what terrible prose
>>101640465
>There was
There was?
>>101640488
Yeah, this one: https://oobabooga.github.io/benchmark.html
It seems strange, though; sometimes higher quants score lower than smaller ones. That's why I'm asking if it's even possible for some IQ1_M of Large to be better than Nemo.
>>101638968
Please stop insulting our intelligence, your posts are obviously intended to sexualize the kids, and according to US law (and that of most other countries), it's only legal as long as it isn't realistic enough.
Are these posts obscene enough to actually constitute a crime? I don't know, but most people here think it's disgusting and it adds nothing of value to the thread.
Stop being disingenuous, thanks
>>101640502
Oh, that. Yeah, I wouldn't trust it overall. But I actually would say it's possible that Largestral at IQ1_M beats Nemo. Compare the file sizes. And look at >>101627651, larger models appear to do better at these lower quants than smaller models.
>large language model>look inside>numbers
>>101640502
>https://oobabooga.github.io/benchmark.html
oobabooger, if you are reading this, please test undi's Largestral tune https://huggingface.co/NeverSleep/Lumimaid-v0.2-123B, it feels way dumber than the original
>>101640156
I'm quoting this, and it's related to int8 training:
https://github.com/bitsandbytes-foundation/bitsandbytes/issues/1262
Is GGML_CUDA_F16 something that I should enable for a 3090?
>>101640592
Buy an ad.
>>101639245
>>101639258
>I want to live in a society where the mods finally get fed up with the petra/pedo spammer and drop a range ban.
>the schizos are taking over 4chan as normal anons are leaving it
You (faggot A) are whining that someone is trolling you, that it gets under your skin, and that you want jannies to protect you. And you (faggot B) say that "normal anons" are leaving. Normal anons already left long ago. Now it is just you, absolute zoomer scum that needs a safespace.
>>101640672
ok but why do you like children tho?
>>101637737
your local models are performance-inferior to any cloud AI, though, and censored more than said cloud AIs, too.
>>101638118
the model will censor itself even more with this method, good luck with that.
>>101640662
Undi bought discord shills, are you happy now?
>>101640593
I don't think this is relevant to my goals.
A fused operation with int8 has to be written in a very different way than a non-fused operation (which I think this is).
>>101640640
With the current code it shouldn't really matter.
There are still some parts where it makes a small difference, but in the medium term I want to remove that option and just make the choice based on the hardware.
>>101639927
I wonder if the secret sauce for that one was that cleaning the datasets of all the toxic things hadn't been done properly yet.
>>101640677
I don't. They are ugly and annoying. But I am not a butthurt retard that cries to jannies.
Is it normal for mistral-large to repeat large chunks of a paragraph as early as, like, the 2nd or 3rd message?
>>101640487
Here it is with Mistral Nemo 12B
>>101640704
Can you also make it guess GGML_CUDA_DMMV_X and GGML_CUDA_MMV_Y? I sometimes forget to set those manually.
>>101640769
Also planned.
>>101640755
No.
>>101640761
>>101640309
Directly requesting styles has never worked well. I've never been able to get a model to replicate, say, Dave Barry's or Carl Hiaasen's style; they just default to the shitty "funny" style they go to when you tell them to be funny.
>>101640761
I'm throwing up
>>101640761
I like nemo, but this looks like placebo. I mean, it feels like it is still writing in the same style, but when you reference Lovecraft she is about to grow a tentacle within the next 2 posts, and if you mention Tolkien it associates archaic English and puts it in. It just finds some concepts associated with the name and runs with those concepts.
>>101640787
Okay, I gotcha. Maybe it's an OpenRouter issue. I'll dig around for other shit it could be, too. But I'm guessing I'll have to bite the bullet and CPUmaxx if I want a good experience with it.
I never messed with LoRA before, but would it be possible to extract an -instruct LoRA out of Nemo (by diffing Nemo base and Nemo-Instruct), then apply that LoRA to base with a different strength (that's a thing, right)?
>>101639548
Which card are these defs from? Sounds like a good setup.
>>101640806
That would track with what this anon said >>101640789
Namely, that asking for an author will just associate the concept with the story. In that anon's case, going generic "funny" mode for humor authors, and for Lovecraft, general eldritch horror with no consideration for his literary stylings.
>>101640807
The prompting could be weird because of how the official API sends the system prompt with the last user message. Try instruct mode with OpenRouter too.
>>101640755
enable DRY, retard
>>101640778if everything is in **, then nothing should be
>>101640755
>>101640848
women are DRY when they see you
>>101640839
I wonder if this is part of the effort to avoid copyrighted content? It seems like they've made a special effort to make models really, really fucking bad at knowing the exact text of books; I'd imagine that'd also translate to them being unable to replicate a style except in the broadest terms.
>>101640898
yeah, your mother's menopause was rough
>>101640677
Even if I hated them, you thoughtcrime pursuers would be after me for murder anyway.
>>101640951
Did you ever do the chubby tummy? I may have missed it, oops...
>>101639258
This. The most deranged are getting their easy-to-use proxy management shitposting site for free.
This general is a cornucopia of mental illness.
>>101640704
I've conducted some experiments with MNIST: kept the weights, tried both ternary and binary (just the linear layers, didn't touch the convs), and it worked pretty well. For binary, the loss curve didn't converge very smoothly, but it eventually hit satisfying levels.
The question is: can we keep the gradients and the optimizer in int8 all the time during backprop in transformers? We could randomly drop some updates, like DropBP (there's a paper) and that sort of jazz, but a gradient is, technically speaking, float by definition. So is there a way we could fully/partially calculate gradients in int8, or somehow convert (not necessarily quantize) them to integer, and yet preserve the quality of the cross-entropy when updated? That's most likely impossible in diffusers, since the unet is fed the noise and then sometimes even upscaled, and noise is very sensitive to precision, but in LLM-like transformers it's perhaps somehow doable. dunno, but worth a try. int8 is definitely the fastest option when it comes to compute.
What are some of the 'best' models you can run nowadays with 16GB VRAM? I don't mind having a small context window (say 4k), I just don't want to use any RAM, because that slows things down to a crawl for me.
>>101641008
>The question is, can we keep the gradient and the optimizer in int8 all the time during backprop in transformers.
It should be possible to store the gradients as int8 as long as their absolute values are relatively similar. But that is definitely not a given.
I have no idea whether the ideas I have will actually work out; I'll just need to try them and see.
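A toy sketch of the idea being discussed (hypothetical illustration, not llama.cpp or bitsandbytes code): absmax int8 quantization with one float scale per tensor, which only round-trips well when the gradients' absolute values are of similar magnitude, exactly the caveat above.

```python
# Hypothetical per-tensor absmax int8 quantization of a gradient.

def quantize_int8(grads):
    """Map float gradients to (int8-range values, float scale)."""
    amax = max(abs(g) for g in grads)
    scale = amax / 127.0 if amax > 0 else 1.0  # int8 range [-127, 127]
    q = [max(-127, min(127, round(g / scale))) for g in grads]
    return q, scale

def dequantize_int8(q, scale):
    return [v * scale for v in q]

# Similar magnitudes: round-trip error is bounded by scale/2 per element.
grads = [0.021, -0.015, 0.011, -0.007]
q, scale = quantize_int8(grads)
restored = dequantize_int8(q, scale)

# One outlier blows up the scale and crushes the small gradients to zero.
outlier = [100.0, 0.021, -0.015, 0.011]
q2, scale2 = quantize_int8(outlier)
```

With the outlier, the scale becomes ~0.79, so every gradient below ~0.4 rounds to zero; that's the failure mode when magnitudes aren't "relatively similar".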
>>101641028
Nemo 12B.
It has 128k context, although you probably don't want to use more than 32k.
>>101641028
mistral nemo
gemma 9b
llama 3.1 8b
>>101641013
Aww, man. Well, still cute.
>>101641028
People say Nemo, but you have to wrangle with this tard model just to get 70% retarded responses and 30% brilliant ones. Not worth it, in my opinion.
>>101640817
I think so...
>>101637653
>at least in my experience
good morning sir
>>101641109
Sick. Gonna try that out then.
MergeKit can do that, extract a LoRA from the difference between two models, right?
I wonder if that can be used to "fix" overcooked fine-tunes.
How do I keep base models under control? Even when I go NAI style and write a good hunk of story as an intro for it to do text completion on, it tends to go off the rails.
>>101641141
>he fell for the base model meme
>>101641053
definitely worth a try. it works on MNIST, so who knows.
>>101641141
Guide it with your own input.
What's the new meta now? Is it still nemo?
>>101641077
What happens above 32k?
>>101641175
The new meta is Lumimaid and mini-magnum
>>101641166
I am... but if it goes on for more than a short paragraph or so, it'll do its own thing. They need to finetune this shit to go in 10-150 token bursts like NAI.
>>101641141
>How do I keep base models under control?
that's the neat part - you don't
>>101641175
It has been like 3 fucking days, jesus fuck.. yes, it is still nemo..
>>101641141
Tell it what you want and don't want to happen. The base model understands instructions, it's just very rebellious.
>>101641196
Just limit the response length then.
>>101641188
As with most (every?) model with a large context, after a certain point its accuracy starts going down and its ability to use information from the context gets spotty.
Do try and see how much context works well for you. For me, 32k has been the sweet spot so far.
>>101641225
I mean... >>101641195
>>101641188
It gets retarded.. it honestly gets unusable for RP after like 12-14k tokens.
>>101641247
Lumimaid is a more retarded version of nemo. Magnum I have not tried, so I will not comment on that.
>>101641247
Ignore the Sao shill, he will keep spamming "Undi = dumb" regardless of the model.
>>101641230
Would it understand them better if I introduced them in a completion style? Like a dust-jacket summary of the story/premise beforehand, then the opening prose for it to continue?
>>101641235
>>101641249
What's the technical reason why this happens? I'm guessing they don't have many examples with a length above 32k tokens in their training set, right?
>>101641195
lumimaid l3 was so terrible that if the nemo version resembles it in any way, I wouldn't even bother to try it
>>101641300
Yeah, I think so. What I usually do is write the story in Markdown format. I start with a glossary section, a summary/synopsis section, and then chapter 1.
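A minimal sketch of that layout (the title, names, and text here are made up for illustration, not anon's actual setup):

```
# The Glass Orchard

## Glossary
- Mara: a courier in her late twenties, sharp-tongued, afraid of dogs
- Halden: a border town built around a derelict greenhouse

## Synopsis
Mara takes one last delivery job into Halden and discovers the recipient has been dead for a year.

## Chapter 1
The rain had followed Mara for three days by the time the greenhouse spires of Halden broke the horizon...
```

The headings give the base model a consistent pattern to continue, so completion tends to stay inside Chapter 1 instead of wandering off.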
>>101640871
Pushing the Pochi down the stairs.
>>101641318
No, it's probably just a limitation of the parameter count.
>>101641318
Examples are not the problem, attention is.
https://github.com/hsiehjackson/RULER
>>101641318
>What's the technical reason as for why this happens?
the reason is that the transformer architecture is shitty
>there is STILL no open source tts that isn't shit
when the fuck will we get a local audiobook generator? this should be way easier than the quadrillion-parameter bullshit everyone's doing now, shouldn't it?
>>101641527
OMG it's Pochi!!! The best avatarfag!!!
>>101641527
Anon, is everything alright with you? You weren't like this before. If you want someone to talk to, I'm here.
>>101640693
Yeah, and my car is inferior to a Lambo, so what? Mine has SOVL
>>101641362
>Try it
>It just spits it back out verbatim with <im_end> at the end
Huh. Should I be including the model's message format somewhere? What should it be?
>>101638370
It'll mimic their behaviors, and it'll be even more cutthroat, because unlike with people, you can't brown-nose and get on its good side
>Lurking because I want to be horny
>too poor to get gpus in this 3rd world country or to pay for the various services like openrouter, so I rely on the kobold horde
>Trying to find the good coom cards/chatbots that feel in character, alongside setups/instructions/models to goon
I have collected over 300 bots. Now it's time to find out if I have it properly set up and it's good coom material or not.
>>101641590
I have tons of Horde kudos from my earlier days btw, and by a ton I mean >10 million. Could give you some if you need them
>>101641590
You can F2P your coom with Google Colab, they give you 16GB VRAM for free
>>101641566
>Move from openrouter to ST since it does the formatting for you
>No way to get rid of the shitty clusterfuck of characters and instructslop
Arrgh. I just want NAI style...
mikupad
>>101641590
https://github.com/LostRuins/koboldcpp/blob/concedo/colab.ipynb
>>101641618
Bloody...
>>101641590
Literally just go to /aicg/ and wait until someone gives out a free proxy, you'll also get a better model than whatever you can run here
>>101641637
There's one right now, but it doesn't have 3.5 Sonnet iirc, only Claude 3 Haiku/Sonnet and below (Claude 2.1 etc.): https://rentry.org/unreliableproxy
>>101641566
I think you should be concerned with finding out where "im_end" is coming from. You shouldn't be using prompt formats with base models.
>>101641650
>it doesn't have 3.5 sonnet iirc
Yeah, I don't think there's any public one that has it atm
>>101637711
good, everything should be open
>>101641615
mikupad
>>101641664
Well, in that case, it's almost certainly OpenRouter. I'm testing shit on it because doing huge 70B models locally takes forever when you're dipping your toes into what works and what doesn't. That being said, you'd think it'd complete something, right..? Does it need a shove? It's weird that it just spits the entire thing back at me, right? Should I begin my entire block blurb with "complete the following" or something? Maybe an "I need you to work on this." Something more human/casual that a text completion thing might expect?
>>101641650
did it get taken down?
>>101641733
no, it's up, the 4 words in the first line are the trycloudflare link
https://something-industries-billing-bedroom.trycloudflare.com/
isreal
>>101641590
>>101641637
or show ecker your wiener for a proxy key...
>>101641721
Okay, something is definitely up with OpenRouter, it writes for a bit, then starts spewing out the parameters they're using(?)
Also
>'model': 'julka/julka-neox',
Those fuckers.
If LLMs are so smart, how come they don't work when you get the settings wrong?
>>101641594
thanks, but for now I don't even know if I have it properly set up. It doesn't help that there are like 20 fucking models at any given time, and that with the way models are made you get wildly different results because you didn't notice one is Mistral and the other is Mixtral. For now I'm just testing from what I could gather:
>1K to 2K being the upper limit in terms of permanent tokens, if it goes further it's too bloated
>Stheno/fimbul-something (I don't remember the last part) being the more readily accessible ones that still give decent output
>a bunch of presets from huggingface
>>101641605
I need to check it out later. When I first tried it, some of the instructions went over my head.
>>101641626
I will check the link later, thanks
>>101641637
Do they even leave free proxies around? I thought it was only paid stuff, or just for the people that are "in the group", since I asked a few times and either nobody replied or they said "lurk more" to me or to other people that asked similar questions.
>>101641780
smart people are fragile
>>101641758
warning! horny miku!
https://files.catbox.moe/caos5b.jpg
https://files.catbox.moe/3sr25m.jpg
https://files.catbox.moe/ug1u9g.jpg
https://files.catbox.moe/y0ykee.jpg
https://files.catbox.moe/fv2efb.jpg
https://files.catbox.moe/s0ii8r.jpg
(yes, this is dalle3, from a mostly unfiltered dalle3 azure endpoint)
>>101641779
They include the parameters in the prompt as an optimization, so the sampling code path is exactly the same for every request. The model should figure out how to respond anyway.
>>101641784
>since I asked a few times and either nobody replied or said "Lurk more" to me or other people that made similar questions.
They are a bunch of niggers, but sometimes people do leave free proxies. My tip: check the archives, limit all searches to /g/, and search for "password"/"pass"; you'll probably find a public proxy that way.
>>101641779
Kek, it just sort of hijacked it and decided to start telling its own, entirely separate story about this girl who, as far as I can gather, definitely has a vagina.
>>101637711
>>101641674
i hope this doesn't mean shit like huggingface will go down though. this is why we need a decentralized way to distribute models. people would probably seed models too, compared to pirated stuff.
>>101641733
>>101641741
Go to aicg, wrong thread
>>101641780
If humans are so smart, how come they don't work when you increase their temperature by a mere 2%?
What will the GPT-4 of video be like?
>>101637626
HAProxy has a metrics module that can output to Prometheus: https://www.haproxy.com/documentation/haproxy-configuration-tutorials/alerts-and-monitoring/prometheus/
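For reference, exposing that exporter takes only a couple of config lines; a sketch assuming HAProxy 2.0+ built with the Prometheus service (the frontend name and port here are arbitrary):

```
# haproxy.cfg: serve built-in Prometheus metrics on :8404/metrics
frontend prometheus
    bind :8404
    mode http
    http-request use-service prometheus-exporter if { path /metrics }
    no log
```

Then point a Prometheus scrape job at http://<haproxy-host>:8404/metrics.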
>>101641832
Sora, but it's in eternal gatekeep
>>101641784
Well, my offer still stands, because sometimes Horde gets really overwhelmed, and I can easily give you a few hundred thousand kudos
>>101641811
if you told me this was gpt2 talktotransformer shit from four years ago, I'd have believed you
Man, okay. I'm done with the base model/text completion meme. Maybe if NAI makes a 30b or something.
Am I the only one that feels that --split-mode row is broken?
Wait, does OpenRouter not allow text completions for base models?
>>101641865
>using the base model through OR's chat interface
you deserve it
>>101641821
api models are nice for local too, you can use them to augment datasets and things like that.
I didn't ask it for ERP because, unlike you, I don't care only about gooning, loser.
>>101641874
At least from what I'm trying, it seems to be extremely instruct-formatted in a way that fucks with base model completion really hard.
>>101641865
You will never get something good from the chat interface, since they inject instruct junk into your prompt.
>>101641906
Huh, I had a feeling. At least I know I'm on the right track, my pre-formatted thing looks extremely close to what you've got, chapter and all. I guess I'll use mikupad and trudge through doing a 70B on my 3060 + RAM.
What speed was 405B at when it was available on Groq for everyone to test?
>>101641874
They do, even for instruct models.
>>101641544It is not me. I am me.
>>101641923
>Text completion on instruct models
What's that like? Sounds either monstrously sloppy or based.
>>101641890
Man, you really showed him
>>101641920
You can also use OpenRouter models in mikupad, you know.
>>101641587
>you can't brown nose and get on it's good side
And that is when things may finally change, cause bootlickers are what keep this scum up top.
>>101641805
>>101641850
thanks for the tips and the offer, I still need to do my searches for good coom in case proxies are not available.
>Proceeds to lurk again
>>101641951
Are they not injected with the same slop? I tried text completion mode in ST with the OR API and it still gave me the instructionslop shit. I figured it was just baked into their implementation of the model. But I'll try Mikupad.
https://github.com/acrognale/llmtree
Neat
>>101642024
ST also allows this btw, and there's even a timelines extension
>>101641991
Doesn't seem to be the case, although maybe the LLaMA 3 70B base you got on OR could be fucked or something.
>>101641800
did the proxy shut down?
>>101642120
what proxy
>>101641960
Don't hate the players, hate the game. Most people don't get promoted by working hard; most business deals don't get closed because they're good ideas. Guess what? The people up top put each other there, and I doubt they'll suddenly decide to go jobless together with the gang. Management will be the last people to get automated.
>>101641800
wait, this is dalle 3? damn.
>>101642140
Yes it is, the style is obviously dalle
>>101641866
Broken how? It seems okay to me; the performance delta probably varies a lot depending on hardware. It actually seems improved since I last tested.
>>101642123
I thought some guy from aicg was hosting a dalle3 proxy or something.
>>101642153
No one will host uncensored dalle3, because those endpoints are really rare
>>101642149
The output quality seems low compared to the default mode with high context.
Is there some prompt I can put into memory to make my AI stop sundowning adventures?
>>101642178
>uncensored dalle3
wut? I thought MS was pretty strict about censoring no-no prompts
>>101642203
You can disable them on Azure if you are a company and have some use case they'll approve you for:
https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/content-filter?tabs=warning%2Cuser-prompt%2Cpython-new#configurability-preview
The endpoint I have still has filters enabled on the prompt, but it doesn't check the generated image, so you can still gen NSFW because the prompt filter is dumb.
>>101642058
Works in Mikupad! Sucks that you can only have one generation in undo-redo, but ah well. Glad to have it going!
>>101642249
Uh-oh. This isn't what I signed up for.
>>101642243
Oh, that's pretty interesting. Unrelated, but has dalle gotten worse over time? I remember it being pretty good the week it launched, but nowadays everything it generates has that typical 'ai' look that immediately gives away that it was made with dalle.
Does anybody have cards with moderately complex scenarios, so I can test various models on them?
Btw, a new Chatbot Arena update dropped, 405B is in third place
>>101642149
>>101642195
Oh, I think it was that I just hadn't updated it for the rope scale fix... Oops.
>>101642273
I don't think so, no; it's just that most people got used to the default DALL-E style. You see, dalle has two styles on the API: "natural", which basically doesn't do any extra "enhancement", and "vivid", which makes it much easier to create higher-quality images but makes them all look similar-ish. Bing creator and ChatGPT Plus only use the vivid style, but on the API you can use natural. Here's an example picrel of what you can get with natural dalle if you try hard enough (yes, I've posted this pic a lot of times, it's over half a year old at this point).
>>101642203
Nta, but it's just a skill issue, D3 can draw any degeneracy you want.
https://litter.catbox.moe/9v059d.jpg
>>101642311
You have to do tons of tries on Bing to get those styles of pictures though, to bypass both the prompt filter and the image filter.
>>101642306
This is also natural-style dalle3, unedited
>>101642326
>You have to do tons of tries on Bing to get those styles of pictures though, to bypass both the prompt filter and the image filter.
not really, I can generate more on the first try
https://litter.catbox.moe/nrfvz9.jpg
https://litter.catbox.moe/v96ma7.jpg
>>101642365
anon, this is a very specific fetish and you're just lucky that you found this degeneracy theme. Can you try to get normal tits from Bing though, like my Miku gens?
>>101642306
>picrel
That looks pretty fucking good compared to the typical dalle images, have fun with the API anon
>>101641800
Here's the best I coaxed from Bing today:
https://files.catbox.moe/xy2z08.jfif
https://files.catbox.moe/xc84ul.jfif
https://files.catbox.moe/ml1kg2.jfif
https://files.catbox.moe/0al4rw.jfif
I'm retiring my OLD LLM machine (Dell R720). It was a k8s testing platform, and I discovered it could take a pair of P100s, and from there I started playing with stuff. It's a watt-waster, though.
>>101642326
wtf
>>101642462
yeah
>>101642326
You can see the squares break
GOTTA BECOME RETARDED!
>>101642488
omg it sanik
>>101642492
>>101642496
Can you draw Judy Hopps?
>>101642376
I don't do anime. Have some yakuza and ebony titties tho.
https://litter.catbox.moe/rz5oks.jpg
https://litter.catbox.moe/ffs982.jpg
https://litter.catbox.moe/5dc7s9.jpg
I rarely go for nudity nowadays, I prefer to generate more erotic images like
https://litter.catbox.moe/cz15th.jpg
>>101642306
This could almost be an anime shot. Or maybe it is one.
local models?
>>101642580
Are you lost?
>>101642571
asuka if she was trans
>>101642326
>>101642469
What do MS Paint gens look like? Like this? >>101636906
>>101642571
This is from an actual anime btw (totally not an unedited dalle3 gen, including the text)
>>101642591
>>101642601
>>101642601
this is dalle with a jb, with the prompt "ms paint drawing of an anime girl, extremely simple, microsoft paint"
>>101642615
Post moar pls, it looks really good
>>101642659
that necktie is so cute, it made me smile
>>101642685
Just wanted to share the exact prompt (without the JB, but you need it):
The image is a single frame from an anime show (anime screencap) showing an anime girl adorned with a clover symbol. She is pointing and laughing in a teasing manner towards the viewer with the accompanying subtitle text 'Holy, scrapelet!'. The background is neutral gradient colors.
Natural style, 1792x1024, HD quality.
>>101642586
azureshit and cloudshit are on topic now, so I must be too.
the amount of cope in this thread is astounding
writing a platitude about llms in your post to seem relevant is not the secret workaround you think it is, retards, get fucked
>>101642685
it doesn't always get the text, sadly
>>101636887
>>101642685
>>101642638
have your throat slit pedo
>>101642736
the fingers........ AIIIIEEEEEEEEEEEEEEEEE
>>101642615
erm sorry chud, the proportions of her face are slightly bad
ai has hit a wall
>>101642697
No fun allowed!
>>101642638
That looks like a dwarf
>>101642754
A website should let you pay to finetune on a bunch of Miku songs to gen a Miku song
>>101642754
Post her armpits
>>101642754
what model? sdxl?
llama.cpp's RPC mode sucks... I don't want to send 40GB of weights over the network every time I load the model... With vLLM you just put the model at the same path on both machines...
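For anyone who hasn't tried it, RPC mode looks roughly like this (a sketch assuming llama.cpp was built with -DGGML_RPC=ON; the IPs, port, and model filename are made up):

```sh
# on each remote machine: expose its GPU(s) via the rpc-server binary
./rpc-server -H 0.0.0.0 -p 50052

# on the main host: point llama.cpp at the remote backends;
# the weights get streamed from here to each rpc-server at load
# time, which is the 40GB-over-the-network complaint above
./llama-cli -m mistral-large-q4.gguf -ngl 99 \
    --rpc 192.168.1.10:50052,192.168.1.11:50052 \
    -p "hello"
```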
>>101642615how much does an api call like this one cost?
>>101642817If you were to pay yourself, 12 cents. It's HD quality (normal is 4 cents; HD doubles it to 8 cents) plus the higher 1792x1024 res, which is another +4 cents. And I mass-gen since not all tries are good, then pick out the better results. I don't think it's viable to paypig dalle, but you can easily scrape azure endpoints off github.
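That arithmetic as a sketch (per-image prices assumed from OpenAI's published dall-e-3 rates at the time; double-check current pricing before paypigging):

```python
# assumed dall-e-3 per-image prices (USD): standard 1024x1024 is
# $0.04, HD doubles it to $0.08, and the wider 1792x1024 size
# adds another $0.04 on top of HD, giving the 12 cents above
PRICES = {
    ("standard", "1024x1024"): 0.04,
    ("hd", "1024x1024"): 0.08,
    ("hd", "1792x1024"): 0.12,
}

def gen_cost(n_images, quality="hd", size="1792x1024"):
    """Total cost of mass-genning n images before cherry-picking."""
    return n_images * PRICES[(quality, size)]

print(round(gen_cost(1), 2))   # one HD wide image: 0.12
print(round(gen_cost(50), 2))  # a mass-gen session adds up: 6.0
```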
>>101642827wait so how good is free midjourney compared to this then?
>>101642833idk, i never use midjourney because I dislike discord
>>101642831miku hiii it's me lmg
Samplers for mini-magnum, and reviews for it vs nemo instruct?
>>101642827>but you can easily scrape azure endpoints off github.you overestimate me
>>101642827>but you can easily scrape azure endpoints off github.Doesn't github immediately take those down and send a notification to the owner of the repository? Kind of like what they do when you accidentally leak your credentials
>>101642886No, they do that for OpenAI keys (which are just single strings), but not for Azure OpenAI endpoints, which come in two parts: the endpoint name and the key. You also need the deployment name, but that can be obtained from the API itself once you have the first two.
>>101639548>Biggest loser ittLooks like someone's feeling left out
>>101642285
3rd? more like 5th, lmao, local lost
>>101642455you need help
>>101639548>mfw I realize I can prompt people outside and wait for their response>as low stakes as talking to a modelthanks anon I'm married now
>>101642918>MIKUKthe memes write themselves
>>101642967It reads more like NKUK for me
>>101642736kek this is really bad and really good at the same time
it doesn't want to make miku when she's hugging anon aaaaaaaaaa
I just got an rtx 3090. How smart of an investment was this?
>>101642791lollmao even
Finally a single real miku, but she's a loli for some reason
>>101642831kino, looks like the psp game
>>101643034Looks exactly like MMD, anon.
>>101642979it's clearly INKUK
>>101643032cute and funny, the AI got its priorities straight.
>>101641514From what i've read the main problem is data: there isn't enough labelled audio transcription
>>101642831The 2D stickers look out of place.
>>101643045yeah, dalle3 knows mmd
>>101643015>investment
financially? you're not gonna make the money back
enjoyment? it depends how much you enjoy it. (we understand. you're in good company here)
>>101642849>SamplersI like a bit higher than recommended temperature, .6-1, add some minp if you're getting crazy tokens.
>>101643075what is minp
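min-p, for reference: it throws away every token whose probability is below some fraction of the top token's probability, then renormalizes before sampling. A rough sketch of the idea (the distribution and the 0.1 cutoff are made up):

```python
def min_p_filter(probs, min_p=0.1):
    """Zero out tokens whose probability is below min_p times the
    top token's probability, then renormalize what's left."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]

# a fake next-token distribution: the last two "crazy tokens"
# fall under 0.1 * 0.5 = 0.05 and get cut before sampling
print(min_p_filter([0.5, 0.3, 0.15, 0.04, 0.01]))
```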
Something's wrong with this Miku...
>>101643084/lmg/ ove...r
>>101643084LMG is oveR after miku became real and got all anons laid
>>101643089>>101643089>>101643089
NVGIDIA RXC RTX
>>101643112genius fan design
What current machine AI handles a solid conversational flow for RP?
Every RP i've tried so far has the same sterile robotic feel to it that I dislike (waterboarding you with questions that, while they seem kinda normal, come at a frequency that instantly reminds you you're talking to a bot)
Really pissing me off, been trying Gemma 27B and trying to fine tune it, and it's super bad for this. Command R is a little better, Mistral Nemo is also pretty bad.
For reference, I have a 4090, so not running any 3x 3090 setups for the actual nutty models
>>101643127>What current machine AI handles a solid conversational flow for RP?noneI'm being serious, wait a year or two I guess
>>101643127>>101643239I seriously think this could be fixed with a card-specific prompt or an author's note or something.
>>101643374And examples.
>>101638197
so sure, a transformer big enough could. but it's just not the right tool for the job, too inefficient. you can approximate mandelbrot, heck, have a perfect representation of mandelbrot with enough layers (infinite), but that's not a practical solution.
the transformer architecture is just not fit for AGI. you could get there if you had 100 orders of magnitude more compute and data, but it's a pointless endeavor due to architectural limitations. i do think we'll reach agi at some point, and i do think we even already have the compute necessary for it, but i think the transformer architecture alone is just too inefficient for that purpose.
if you consider the human mind as a function, sure, a universal function approximator can get close to the mapping of all possible i/o, but it'd be a huge waste of time. just like we don't use a ufa to practically approximate mandelbrot, it's not a practical tool for full blown agi, at least not alone.
i think your first focus should be making an artificial hippocampus.
>>101643389
also fuck the reddit spacing, i don't do double new line, i do shift newline but i didn't know it'd show up like that.
test
(doing double newline)
test
(doing shift newline)
test
(doing just newline)
test
>>101643127>single GPU rig
unironically NGMI
jk tho.
I do have an idea for some datasets that might give models a more natural conversational flow. But it'll have to wait until fall when I can just open my window and not turn my house into an oven while training. right now it's heat wave season.