/g/ - Technology






File: GNr4ARWb0AAnVcC.jpg (222 KB, 928x1232)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101197169 & >>101191862

►News
>(06/28) Inference support for Gemma 2 merged: https://github.com/ggerganov/llama.cpp/pull/8156
>(06/27) Meta announces LLM Compiler, based on Code Llama, for code optimization and disassembly: https://go.fb.me/tdd3dw
>(06/27) Gemma 2 released: https://hf.co/collections/google/gemma-2-release-667d6600fd5220e7b967f315
>(06/25) Cambrian-1: Collection of vision-centric multimodal LLMs: https://cambrian-mllm.github.io
>(06/23) Support for BitnetForCausalLM merged: https://github.com/ggerganov/llama.cpp/pull/7931

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
►Recent Highlights from the Previous Thread: >>101197169

--Extending LLM's Context Window with Activation Beacons: >>101203075 >>101203249
--The Strawberry Test: A Flawed Method for Evaluating LLM Quality: >>101198523 >>101198632
--Model Requirements and File System Operations: >>101200948 >>101200984 >>101201000 >>101201064 >>101201005 >>101201126
--Miyu's Odd Behavior in the Classroom and LLM's Writing Limitations: >>101198989 >>101199237 >>101199267 >>101201087 >>101201307
--Licensing Model Weights: Dubious or Protected?: >>101200593 >>101200629 >>101200660
--Koboldcpp vs KoboldAI: Choosing the Right Model for Your Setup: >>101201486 >>101201527 >>101201593 >>101201687
--HF Leaderboard: Qwen and CR+ Performance: >>101202193 >>101202222 >>101202316 >>101202816
--Gemma2 27b's Technical Accuracy in Poetic Metre: >>101197882 >>101197901 >>101197912
--Gemma 2: A Modified Version of Gemini Flash?: >>101197434 >>101199289 >>101199667
--Control Vector Test Drive and Applications: >>101198756 >>101199025 >>101199067 >>101199204 >>101199229
--Chatbot Arena - Vision Rankings: GPT-3.5 and Claude 3.5 Sonnet Dominate: >>101199300 >>101199358
--Best Local Model for App Development and Programming: >>101201995 >>101202012 >>101202435 >>101202570 >>101202751 >>101202777 >>101202849 >>101202834 >>101202882 >>101202914 >>101203105 >>101203182 >>101203195 >>101203256 >>101203129
--AI Model's Shitty Cliche Smut Tropes and How to Fix Them: >>101198056 >>101198076 >>101198087
--27B Model Generates Endless Pad Tokens in Transformers: >>101197754 >>101197828 >>101197860 >>101197907
--Web Development's Wrong Turn: From Documents to Scripting Languages: >>101197613 >>101197652 >>101197660 >>101197771 >>101197945 >>101197963 >>101204914 >>101197974
--Anon's Love Letter to Mixtral Model Stock Experiment: >>101202093 >>101202420 >>101203831 >>101203861
--Miku (free space): >>101197686 >>101203008 >>101203898

►Recent Highlight Posts from the Previous Thread: >>101197174
>>
I am the one who says the nigger word with no repercussions
>>
File: 1698504636333622.png (845 KB, 1280x720)
>>101205045
YOU CANT DO A HECKIN RACISM OUTSIDE B CHUDDD NOOOOOOOOOOOOO
>>
>ctrl+f (You)
>14 matches
>>
>>101205004
Learning dangerous knowledge with Rin
>>
>>101205072
I wonder if they recycle janitors/moderators every so often. There was a period between 2021 and late 2023 where I was getting redeemed for "racism outside /b/", even in threads where other people were far more racy and were hit with nothing.
It's not happened once so far this year.
>>
>>101203790 (me)
>deepseek
Trying it myself now. I can't really run it right now / don't want to download it, so I paid for the official deepseek API. It was ~$2 for 7 million tokens. lol.
>>
>>101203790
I think it's definitely the superior one compared to deepseek-v2-instruct when it comes to ERP but I don't think it can keep up with sonnet in that regard.
>>
>>101202420
your model is the most blueballing model i've used so far
plz fix
>>
>>101205229
i grabbed it and did a couple of tests (running it in ollama) and it seems to go off the rails quite a bit more than other models, probably a skill/prompt issue
curious if anyone has tips
>>
>>101205369
I tried it, it failed all of my factual quality tests, and the RP pulled that "despite being told twice whose character is whose, it just writes for mine instead of its own" crap.

It might be an admirable effort, but from my perspective it probably got the worst aspects of the merge components instead of the best. Maybe just bad luck, but I deleted it in hopes of putting better bytes on my drive.
>>
>>101205457
fuck.......... is limarp zloss the only good mixtral finetune?
>>
>>101205229 (me)
Gave -instruct the writing prompt from the EQ bench. This was the first roll.

https://rentry.org/hyum3kaw
>>
4 years of vramlet cope and still nobody's figured out a better way to make a transformer model smarter than just making it bigger
>>
>>101205476
405B will save us until Llama-V-JEPA
>>
>>101205461
yuzu alter rpcal...
>>
>>101205468
light on the purple prose. Kinda like it desu.
>>
>>101205468
>conspiratorial whisper
>eyes twinkling
>maybe, just maybe
But otherwise, pretty good.
>>
is L3-8B-Stheno-v3.2 still the best model for people with only 24gb of vram?
>>
why is command-r almost slower than command-r plus for me...
>>
>>101205505
>rpcal
Didn't the exl2 dude say that that shit makes the calibration of the model worse?
>>
>>101205540
he doesn't know what he's talking about
>>
>>101205537
Buy an ad.
>>
>>101205538
Lack of GQA maybe? Although even then, the difference in size should more than bridge the context size gap.
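The GQA hunch can be sanity-checked: KV-cache memory scales with the number of KV heads per layer, so a model without GQA can need a bigger cache than a much larger model with it. A rough sketch (the layer/head counts below are illustrative assumptions, not official configs):

```python
def kv_cache_bytes(layers, kv_heads, head_dim, context, bytes_per_elem=2):
    # 2x for keys and values; fp16/bf16 = 2 bytes per element
    return 2 * layers * kv_heads * head_dim * context * bytes_per_elem

# No GQA: every one of 64 attention heads keeps its own K/V.
no_gqa = kv_cache_bytes(layers=40, kv_heads=64, head_dim=128, context=8192)
# With GQA: heads share 8 K/V heads, even though the model itself is larger.
gqa = kv_cache_bytes(layers=64, kv_heads=8, head_dim=128, context=8192)
print(no_gqa // 2**30, gqa // 2**30)  # 10 GiB vs 2 GiB at the same context
```

So a "smaller" non-GQA model can easily spend 5x the VRAM on cache at high batch sizes, which would explain the slowdown.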
>>
>people still shilling stheno when there's lunaris
No but seriously though, how do they compare? He released that shit a while ago already; shouldn't people have used it by now? I don't remember seeing any impressions of it.
>>
>>101196305
I told you bros he was algerian
>>
>>101205461
I'm just looking for the only good models.

Qwen2 Q4KS or better, Llama Q5KS or better, CR+ Q4KM or better seem to be the only ones passing my tests.

I'm looking at Magnum right now (apparently a Qwen2 spin) and it's going okay. It hasn't done the weird stuff the normal Qwen2s have done to me before, like spontaneously barfing moon runes unless I drop CuBLAS, so that's nice.
>>
>>101205578
I didn't know that was a thing. Gonna give it a try when I get home.
Let's see how it does with my RPG card.
>>
>>101205553
forgot i had 8 swipes per gen enabled
for whatever reason CR seems to be affected way more than CR+ at higher batch sizes
>>
>>101205551
>the guy who made exl2 quants doesn't know about exl2 quants

okay then
>>
>>101205639
Makes sense to me.
>>
>>101205552
name something better
>>
>>101205468
I played around with it a little too. I liked it, and it's definitely smarter than other open source models (it's very strange that it's on no leaderboard). It gave me the impression that it's hyper-aware of its entire context at all times.
>>
>>101205793
Mixtral-limarp.
>>
verdict on gemma2?
>>
File: RTX 5080 AI GEN.jpg (92 KB, 1200x819)
So I have been thinking about this hobby and the financial costs involved. I currently have an RTX 3080 with 10 GB of VRAM. I have a 4K monitor but don't do much AAA gaming anymore.

Let's assume that the upcoming RTX 5090 will have 32 GB of VRAM and will cost $1800. Stheno and other 8B models run fine on the 3080 and are okay for a quick coom or a short RP session. And if I want to play with bigger models, I can rent an RTX A6000 for about $0.85 per hour, or an A100 for $2/h. Used 3090s are "cheap" at about $600 each, but they are big, loud and suck up a lot of watts. Given that my LLM usage is at most 15 hours/week of RPing with a chatbot, I could rent the GPU power I need for almost 3 years for the price of a hypothetical 5090.

I understand the desire to keep the spicy logs on-prem, and the idea that I will "own nothing" if I rent the GPU from runpod, but does buying a new top-of-the-line GPU really make any sense for a "casual" user like me?

Have a shitty AI-genned RTX 5080 picture
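The break-even arithmetic, spelled out (all prices are the post's own assumptions, not quotes):

```python
def weeks_to_break_even(gpu_price, rent_per_hour, hours_per_week):
    """How many weeks of renting cost as much as buying the card outright."""
    return gpu_price / (rent_per_hour * hours_per_week)

weeks = weeks_to_break_even(gpu_price=1800, rent_per_hour=0.85, hours_per_week=15)
print(f"{weeks:.0f} weeks, {weeks / 52:.1f} years")  # ~141 weeks, ~2.7 years
```

And that ignores electricity, resale value, and the chance rental prices keep falling, all of which push further toward renting for light use.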
>>
>>101205853
Financially, it probably does make more sense. You have to decide for yourself if the privacy concerns are worth it.
>>
>>101205835
I want it but I can't have it until exl2 adds support.
>>
We'll get new llama 3 stuff in July right?
>>
>>101205893
gpt-4o drop delayed everything until august
>>
>>101205853
If all you want is to coom, sure. I enjoy the tinkering possibilities local provides.
>>
>>101205853
Where are you now? If you've got a single decent 30XX or 40XX card and ≥64GB RAM, you can be where I'm at, which is about 1 to 2 tokens per second on non-awful models. Not fast, but I just treat it like AIM: chat a line, send it, and it'll make a noise in a few minutes when it replies. That would be enough for you to see if you actually dig it and have ideas for things that NEED more power, or whether you get your taste, get bored, and spend that money on groceries.
>>
>>101205853
There's also a lot of APIs now where you can use models directly. You don't have the direct control you get when renting, but depending on how you use the model it's even cheaper, and these models are big and not quantized. I also expect prices to just keep going down. If you don't have an attachment to local: looking at this anon's purchase, >>101205229, that is 28 cents per million tokens. Ignoring the whole privacy and control aspect, local isn't quite worth it anymore if you want to run quality models. With progress, this might change again. Local made sense when OpenAI was the only player and their rugpulling could really fuck up your shit (and that's how lmg was born to begin with), but times have changed. If somebody rugpulls, just go somewhere else.
>>
Turns out babies could be a good avenue of research for improving foundational AIs.
https://www.cell.com/trends/cognitive-sciences/fulltext/S1364-6613(24)00114-1?_returnURL=https%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS1364661324001141%3Fshowall%3Dtrue
>>
>>101205468
I feel the chinese are just not taken seriously in the AI space, while their shit just keeps improving and improving and nobody really talks about it.
>>
>>101206053
Interesting, thanks for posting this.
>>
>>101205893
Yes, the anniversary of llama-2 in late july will be celebrated with the release of llama3-creative-128B which is optimized for roleplay and creative endeavors
>>
>>101205808
The context awareness is something I'm interested in, but on large-context tasks I find it gets into weird repetitive loops.

I'm asking it to write an overview of some code, provided the code and API documentation. It starts off really strong, but after a few paragraphs it jumps back to "This is an overview for xxxx library..." and starts over again. Anyone else experience stuff like this?
>>
>>101206152
Size is bullshit, but Meta did claim to be considering partnering with Character.ai.
>>
When will we get an actual llama?
That's the only reason why I'm here
>>
>>101206111
No problem mate, glad you found it interesting. I am hopeful that this avenue of research will benefit synthetic data for models since the paper put some emphasis on self supervised learning.
>>
>>101206336
No purchase necessary to enter. Simply fill out the form on the web site and your llama will be delivered in 2 weeks. Deadline to enter is April 20th 2024.
>>
File: Untitled.jpg (38 KB, 915x90)
>>101206215
is that why its terrible now
>>
>>101206336
>>>/an/catalog
>>
>>101206336
I'm sure you can find a llama card somewhere.
>>
>>101205810
this shit is six months old lmao
>>
>>101206093
We are too busy with important questions like "is an LLM saying nigger literally genocide?" or "Will an imagen model generating booba destroy civilization?" while they, you know, just make stuff.
>>
>>101206513
chinese models perform well on benchmark but IRL they get mogged by western models
>>
Which Kobold Presets you guys use for models?
>>
>>101206596
My own. Without any samplers except temp 0.5~0.85 and min-p 0.05
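For anyone curious what that min-p setting does: after softmax, tokens whose probability falls below min_p times the top token's probability get dropped and the rest are renormalized. A toy sketch of the idea, not any backend's actual code:

```python
def min_p_filter(probs, min_p=0.05):
    # Keep tokens with prob >= min_p * max(probs), zero out the rest,
    # then renormalize so the surviving probabilities sum to 1 again.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]

# With min_p=0.05 and a 0.6 top token, anything under 0.03 is cut.
print(min_p_filter([0.6, 0.3, 0.08, 0.02]))
```

The nice property is that the cutoff adapts: a confident distribution prunes aggressively, a flat one keeps more candidates.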
>>
File: 29390 - SoyBooru.png (139 KB, 775x1232)
>>101205835
WNBAG
>>
File: file.png (150 KB, 1870x928)
>>101205994
but can you prefill with /chat/completions api?
>>
File: 63896 - SoyBooru.png (182 KB, 332x406)
>>101206676
TRVKE
>>
>>101206367
keeeek
>>
Any of you guys run this on a Celeron? Kind of want to go for 70b but on a low-end computer.
>>
File: lmaoo.jpg (156 KB, 2059x765)
lmao, google gamed its model to look good on chatbot arena
>>
File: dhjskldahak.png (17 KB, 1439x132)
If I increase GPU layers past 48 I get the error message below even though I have a 3090+4090???? Can someone on the Koboldcpp team please tell me what is going on immediately

CUDA error: out of memory
current device: 1, in function ggml_cuda_set_device at D:\a\koboldcpp\koboldcpp\ggml-cuda.cu:115
cudaSetDevice(device)
GGML_ASSERT: D:\a\koboldcpp\koboldcpp\ggml-cuda.cu:102: !"CUDA error"
>>
https://www.reddit.com/r/LocalLLaMA/comments/1doxvdi/selfplay_models_finally_got_released_sppo/
>>
>>101207279
This is insane btw. This 8B legit performs at the level of all the big models I ever tried. If they apply these methods to larger models we are legit gonna have claude opus at home.
>>
>>101207038
Well, that's kind of fucky. In an ideal world, people would be testing newer and different prompts on lmsys, but they probably don't, since they're retards, so you end up with a lot of the same or similar prompts. That means that even if they're technically not cheating, because they're training only on the prompts and their own original answers, in practice it is cheating.
>>
>>101206209
I had this too, it replying with earlier replies, using SillyTavern. I had the API added as "OpenAI compatible"; setting prompt post-processing to "Claude" somehow fixed it. No idea why.
>>
>>101207300
What did you test on? I'd like to see some logs. So far the thread has determined that it's good on some things and about the same as regular Instruct on others.
>>
>>101207300
>This 8B legit performs at the level of all the big models I ever tried.
There's no way, its MMLU is in the 65+ range; it's even worse than L3-8b instruct
>>
>>101207354
Just try it, it's an 8b. Night and day.
>>
>>101207300
>If they apply these methods to larger models we are legit gonna have claude opus at home.
and if ClaudeAI uses this technique on claude opus, they'll get god kek
>>
>>101207366
when you say "big models", which ones are you referring to?
>>
>>101207366
Why are you like this? Are you trying to LARP as a redditor to increase hate for redditors?
>>
>>101207307
fuck me i don't want to go read the sillytavern code to understand what this means T_T
>>101207279
SPPO is really great in my tests, if there was a 32k context version it would be god tier
>>
>>101207279
>3 days ago
anon, if this model was as good as you claim, people would've talked about it already; do you think we purposely avoid good models or something? kek
>>
>>101207400
Do you think 4chan is news central or something? People are usually slow on the uptake here.
>>
>>101207459
of course it is, that's why I lurk here; when something interesting happens, it gets talked about quite rapidly
>>
File: 468519173.png (713 KB, 1024x1024)
>>101207459
>Do you think 4chan is news central or something?
Yes
>>
File: 039_01705_.png (1.34 MB, 896x1216)
>>101203831
Hey anon, if you haven't already, make sure you're on the staging branch of ST, otherwise some of the templates won't work as intended. Also, I have not tested using the templates with DRY sampling, if that's also in play.
>>
>>101205994
>28 cents per million tokens
wait, what? That's basically almost free. For some of my roleplays I have 12k context and it rarely gets filled, but if I'm being generous and regenerating a lot, let's say I'll consume 30k tokens. That's just 3% of 28 cents, not even a full cent. How can it be this cheap compared to local? Am I missing something?
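The arithmetic does check out, taking the earlier anon's ~$0.28 per million tokens as the going rate:

```python
def session_cost_usd(tokens, usd_per_million=0.28):
    # Cost of a session at a flat per-token API price.
    return tokens / 1_000_000 * usd_per_million

print(session_cost_usd(30_000))  # under a cent for a generous 30k-token RP
```

One caveat: most APIs bill the full prompt on every request, so a long chat re-sends its context each turn and the real total is tokens-processed, not tokens-in-the-log.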
>>
>>101207279
>>101207366
show us some logs anon, you gotta sell the product with some examples
>>
I know this thread is for LLMs but I suppose you anons would know this, what's the best local voice cloning tool currently out there? I have a pretty beefy GPU so that's not an issue
>>
File: 1705528053083453.png (7 KB, 578x113)
>>101206367
I feel bad for them
>>
>>101207588
no
serving models has become ridiculously cheap
most places offer llama 70B for less than $1 per M tokens while giving you a lot more control over the output than OAI, and the cost is only going to keep dropping
>>
>>101205045
Every time I see posts like this I imagine a toddler giggling because he said a bad word to get a reaction from his parents. Same level of maturity, I guess.
>>
>>101206367
all major companies quantize their models based on traffic
as they grow in size, the product naturally gets worse
>>
>>101207663
the fuck
>>
>>101207663
>real cuda dev trip
what did he mean by this
>>
>>101207663
not like this
>>
>>101207663
I guess that's what he gets for using a non-secure tripcode
>>
>>101207663
CRACKED AND BLACKED
>>
>>101207279
im trying it, and its actually really good?
>>
>>101207775
That's curious. How was it non-secure?
>>
>>101207089
Disabling mmap fixes it I think.

>>101207701
Nothing, I was busy getting rekt in XCOM.

>>101207775
I may be misremembering, but didn't secure tripcodes rely on cookies?
>>
>>101207663
>>101207871
>cudadev is a blackedfag
unsurprising
>>
So after all this, are we now accepting that Google can call themselves a big shot in AI despite embarrassing missteps like glue on pizza and eating rocks, and the embarrassment that was LaMDA/PaLM/Gemma 1? Sundar is still a dumbass for letting the company flop this long on AI, but at least he has the right people at the helm now with DeepMind in charge and Demis Hassabis instead of Jeff Dean.
Gemma 2 is proof of the progress. Yes, they may have gamed LMSYS by training on the prompts, but the 9B model outdoing L3 8B for people in these threads is proof that they are at least on the playing field, competing against the best in the industry. I am looking forward to more models from them, and hopefully Meta forces the issue soon with Llama 4.
>>
>>101207871
That is with MMAP disabled. I only have 32 gb of system ram so I have to disable mmap
>>
>>101207913
>So after all this, are we now accepting that Google can call themselves a big shot in AI
as long as they can't compete with the best (gpt4 and claude3/3.5), the answer is no
>>
>>101207871
The FAQ just says
>Secure tripcodes use a secret key file on the server to help obscure their password.
and doesn't say anything about cookies so I guess I'll just try adding the extra # and see what happens.

>>101207918
Then I unfortunately don't know what the problem is.
I think a low amount of pinned memory was a Windows only issue though.
>>
File: 1700261128186346.png (28 KB, 727x112)
PFFFFT
>>
>>101207913
it is sad it took them this long to release a larger model that is only on equal footing with Meta's model
reminder that zucc isn't even taking this shit seriously; he's still got the metaverse in the back of his mind, he's just waiting for video gen to get really good
the fact that gemma is just another standard transformer with very little innovation shows they are still in catch-up mode
if openai drops a paper tomorrow, I guarantee everyone will read it to glean as much secret sauce as possible; not so much for google
and until they can touch Claude 3.5 or GPT4o, they will be left behind
>>
>>101207952
If ditching koboldcpp could fix the problem what other back end would you recommend
>>
>>101205537
>>101206442
The only contenders I've found are:

Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-3.7bpw-h6-exl2-rpcal @ 4096 context
c4ai-command-r-v01_exl2_3.5bpw-rpcal @ 2048 context
Nous-Hermes-2-Mixtruct-v0.1-8x7B-DPO-DARE_TIES-3.7bpw-h6-exl2-rpcal @ 4096 context

maybe BagelMysteryTour
cOOmandR sucks and doesn't fit in 24gb
llama3 8b sucks

prove me wrong or tell me something better, i can't find it.
>>
>>101207965
If you don't have enough VRAM then something based on llama.cpp (which koboldcpp is) is basically your only option.
You can always try running llama.cpp directly but unless the koboldcpp devs changed the model loading code I think you'll run into the same issue.
You could try running it through WSL or on Linux.
>>
>>101207976
BMT isn't, for sure; the moment you touch your character's boobs it sends you on a journey with bonds and testaments of your mixed feelings.
>>
>>101208010
yeah that's why i looked for something else, was decent for a few weeks tho
>>
>>101207663
>I wouldn't recommend koboldcpp.
Me neither
>garbage UI
>garbage chat API
>1200 files in a zip folder that takes 3 seconds to decompress every time, "here is your single file executable bro"
>maintained by cancerous discord fags
>the discord fags obsess over these threads and post anonymously, astroturfing
>>
File: file.png (214 KB, 1676x800)
>>101207943
I wasn't calling them the best, just that their claims of being able to match the top players on LMSYS are now verifiably much more true. They aren't even that many Elo points behind GPT-4o or 3.5 Sonnet.
>>101207958
Google has been holding back their research in this area for a year after OpenAI essentially firewalled off their own research and profited off Google's open publication of theirs.
https://www.businessinsider.com/google-publishing-less-confidential-ai-research-to-compete-with-openai-2023-4
Because of OpenAI, no Western company at the top in AI publishes papers immediately on findings. The exception is the Chinese, if only because they need the citations and credibility with the world that Western companies don't, and I expect the CCP to crack down on that at some point once it's clear they are ahead.
>>
>>101207952
>>101207997
Actually, now that I think about it, if the blacked spammer cracked the non-secure tripcode I should not just use the same string for the secure one.
So I guess I'll use this one from now on.
>>
>>101205835
I tested 27B in 8-bit quant and it seems pretty bad. Maybe it's a settings issue, but it hallucinates a lot and struggles with coherency.
>>
>>101207597
afaik there isn't a good tool for this; everything is convoluted and stupid. last i fucked with it, this was the sota

https://git.ecker.tech/mrq/ai-voice-cloning/wiki/Installation

and it worked pretty okay, but nothing like the demos you hear. would love it if someone could point me to a working local voice cloning tool as well
>>
>>101207952
>>101208093
confirm new trip using github somehow
>>
>I'm all ears... or rather, all text!
>>
>>101208095
>>101205835
also running q8, similar findings, though both it and 9b are amazing at one-shot small-context tasks. the generations seem much more deterministic than other models; for things where there's a "right answer", it gets it very often

it's definitely a good replacement for stackoverflow/googling most stuff
>>
>>101208091
that's all well and good, but Gemma and Gemini 1.5 are what they've been cooking up behind closed doors this entire time, all while OAI has been working on Q* and GPT-Next and Anthropic is undoubtedly taking their sparse autoencoder tech to crazy applications
it feels like they were competing with Llama, not GPT or Claude, especially when you consider that, even based on your image, their best model is still worse than 3.5 Sonnet, which is the bottom end of Anthropic's latest batch of models
>>
>>101207858
Tripcodes with a single ! are insecure because your password is only hashed a single time. It's easy to bruteforce and tools for that have been around since basically the start.
>>101208093
Maybe post a new photo of your 4090 mining rig to prove that it's really you and not a blacked spammer attempt at hijacking your identity even further.
>>
File: 1698983091577192.jpg (155 KB, 843x607)
>>101208093
>I should not just use the same string for the secure one.
There is no way you can be this dumb while simultaneously working on CUDA mat code.
>>
>>101207953
>blackedfag and kobold shizo hater are the same people
not surprised to be honest
>>
>>101208095
The tokenizer issue got fixed yesterday, but there are still issues: you need https://github.com/ggerganov/llama.cpp/pull/8197 with the logit soft capping fix mentioned in prior threads, and it seems like 27B is a lot more sensitive to this than 9B quality-wise. The 4k context without SWA is also an issue, but fixing it requires llama.cpp to implement sliding-window attention, which they declined to do for Mistral, so Gemma 2 will be half baked until then. Mistral.rs, which almost no one uses, claims full support with those issues resolved, but I ain't testing it.
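For anyone wondering what "logit soft capping" actually is: Gemma 2 passes logits through a scaled tanh so they can never exceed a fixed cap (reportedly 30.0 for final logits and 50.0 for attention logits; treat those constants as assumptions). A minimal sketch:

```python
import math

def soft_cap(logit, cap=30.0):
    # Squash into (-cap, cap): roughly linear near zero, saturating
    # for large values, so no logit can blow up past the cap.
    return cap * math.tanh(logit / cap)

print(soft_cap(1000.0))  # a huge logit saturates at the cap
print(soft_cap(1.0))     # a small logit passes through almost unchanged
```

Skipping this step at inference when the model was trained with it is exactly the kind of silent mismatch that degrades output quality without crashing anything.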
>>
>>101207952
>>101208114

yeah just put the code on your github profile and say it's your official handle or something. literally everyone knows that you're the cuda dev and gg and friends don't care.

>>101208147

i know people who've written well-cited computer science papers yet can't use a computer properly. go figure.
>>
What a hero.
>>
>>101208230
>different methods of attention between layers
every day we stray further away from god's light
>>
>>101208137
They have other models cooking in the background too, but we won't know about them in any technical detail; how long do you think Gemini 1.5 was cooking? They also have the model behind Project Astra, which was shown with no technical details and which they will detail later. I don't see why people are down on their research division when they haven't yet scaled the more interesting stuff they showed, like Griffin/RecurrentGemma, which does replace Transformers. You would be a fool IMO to discount Google's research division; it is one of the best in the field.
>>
File: IMG_20240630_010541.jpg (2.29 MB, 4000x3000)
>>101208114
>>101208209
https://raw.githubusercontent.com/JohannesGaessler/JohannesGaessler/master/README.md

>>101208140
Here you go, next to the P40 machine (which looks like it needs its dust filters cleaned).
>>
File: 1705017153425593.png (55 KB, 263x217)
>>101208289
Identity confirmed.
>>
>>101207952
I know that for vichan the secure trips use a salt specified in the configuration files, so I imagine it's similar for 4chan. Someone could probably crack the salt eventually in order to start cracking secure trips as well. If they rented an 8xH100 cluster, it probably wouldn't take long if they were that obsessed.
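For reference, a secure trip in vichan-style software is just a hash of the password concatenated with that server-side salt, so "cracking" it means recovering the secret, not reversing a per-user hash. A loose sketch of the idea (the exact construction, digest, and truncation vary by implementation; this is an illustration, not 4chan's real code):

```python
import base64
import hashlib

def secure_trip(password, server_secret):
    # Hash password + secret salt, base64 it, keep a short prefix.
    digest = hashlib.sha1((password + server_secret).encode()).digest()
    return "!!" + base64.b64encode(digest).decode()[:11]

# Same password, different server secret -> different trip, which is
# why the secret itself is what an attacker would need to brute-force.
print(secure_trip("hunter2", "example-secret"))
```

With a long random secret that brute force is hopeless even on rented H100s; it only becomes feasible if the secret is short or guessable.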
>>
>>101208289
The new code checks out, commander.
>>
you can't convince me the original post wasn't made by the real cuda dev
>>
>>101208360
No one has in the 10+ years since it was introduced; why would it work now? And if you had that kind of power, why would you use it on a trip and not on wallet keys for crypto?
>>
>>101208360
never 4get tripcuda
>>
>>101208421
Even if it were, why do you care about what someone goons to?
>>
>>101208487
i don't care what he goons to as long as he does it behind closed doors
>>
>>101207279
>I cannot create explicit content, but I'd be happy to help with other story ideas.
>>
>>101208516
Pipe down then lol
>>
>>101208527
wow someone's awfully defensive all of a sudden
wonder why
>>
>>101207279
Trying it on FP16, doesn't seem significantly smarter than any other 8B. I don't know why I keep falling for these psyops.
>>
>>101208289
dual motherboard set up?
>>
>>101208649
It would in principle be possible to add a second motherboard but so far there is only one.
>>
>>101208691
it looks like there are 6 gpus are all those on a single motherboard? is it all on one power supply?
>>
>>101208421
I thought about that for a second, but cuda dev has been mildly positive towards kcpp in the past; it just doesn't really read as a genuine cuda dev post. him posting that AND attaching that image for some reason AND forgetting to un-trip doesn't make a lot of sense
pretty big waste of a tripcode crack if you ask me, very uninspired work
>>
I'm trying out mistral.rs to run Gemma right now and holy shit, the documentation is not very good. I didn't know how good we had it with llama.cpp.
>>
with an A750 (8GB VRAM) and a Ryzen 5900X with 64GB DDR4 RAM, does it make any sense to try to use the GPU for anything LLM, or am I better off just sticking to CPU inference?
>>
>>101208835
You can probably get okay speeds with either Vulkan or SYCL backends on llama.cpp.
>>
>>101208885
yeah, but for anything other than tiny models, I'd need to offload to ram anyway. I wonder if the speedup achieved by using the GPU will even be meaningful in this case.
I guess I'll have to try it to find out for certain.
>>
>>101208487
I like to know
I wanna see
>>
ban cuda dev
>>
i have a 4090 suck my dick
>>
back to discord you shit for brains
>>
It's about time zucc gives us an update on llama so that all the others have to shit out their big models as well
>>
I finally tuned 22x8 mixtral on limarp
Let's see how hard I fucked up
>>
>>101209110
qlora?
>>
hi im new to this stuff, can someone point me in the right direction for what i want? ty in advance.

anyway im looking for a general AI model something like what grok 2 is going to be or is there something better out there?
>>
>>101209097
zucc doesn't give a shit anymore
the government is so far up his ass on safety that every model fucking sucks now
>>
>>101209163
You should always provide your specs when asking for this kind of thing.
That said, start with koboldcpp and llama3 8b instruct gguf.
>>
>>101209163
depends on if you are running a potato or not.
what graphics card do you have and how much VRAM?
>>
>>101209187
>>101209186
4070ti 13700k 32gb ram @6000mhz
>>
>Cuda dev was the blacked anon all along and posted by mistake using his trip, then just pretended it wasn't him by changing the trip code
BASED
>>
>>101209196
He was always upset that Miku was more popular than Teto.
>>
>>101209144
yea
>>
>>101209195
>4070ti
so probably 12GB VRAM (ram on the gfx card).
should be able to fit llama3 instruct using exl2, which is generally faster than gguf.
i cba to spoonfeed but read the OP post and work out how to run llama3 8b exl2
>>
>>101209229
>i cba to spoonfeed but read the OP post and work out how to run llama3 8b exl2
im good on the spoon feed that last part was all i needed thanks. i got it from there
>>
>>101207662
Do any places offer running open-source erp models of your choice, or otherwise good erp-able models? And how quickly does your account get shut down after cunny sex? I'd rather not have to use jailbreak prompts to uncuck a cucked model.
>>
>finally get mistral.rs set up, interactive mode works
>try server
>it connects fine to ST
>send a completion request
>error
It's all so tiresome.
>>
>>101209341
You know what, fuck it, I'm trying it too.
>>
>>101208428
10+ years ago you couldn't rent petaflops of gpu compute by the hour
>>
>>101209353
Hope it works. Maybe I'm just having a skill issue right now.
>>
>>101205004
Unprotected sex with rin
>>
llama-3-400b going to be cancelled because me and the individuals who did the code RLHF farmed it by submitting dogshit data over and over, sorry everyone. The """untuned""" model (still had instruct data) is roughly gpt-4 tier intelligence maybe better but heavily overfit. Doubt they'll release the model without the RLHF.
>>
Midnight Miqu 70B
>tsundere stays in character and keeps resisting
switch to Mythomax 13B on the fly
>character turns into a slut and climaxes immediately
Are there more balanced/nuanced small models? I would use Miqu but as a 32gb ram 8gb vramlet it's painfully slow and Mixtral gets stuck in loops making it unreliable.
>>
>>101210140
DeepSeek 236B is phenomenal with hard-to-get characters.
>>
>>101205835
I think it beats Llama 3 in almost every aspect. The censorship is extremely weak for what I tested it with, and the model will go along with whatever you throw at it, even when its "inner voice" disagrees with the contents. The only issue is that you should regard it as a 4k-context model for now, and even once it's fully functional, it will still be an 8k model.

It seems to prefer novel-style prose rather than Markdown-style roleplay. Markdown roleplay has a very "sloppy" feel, but its novel-style prose feels fresher compared to other recent models.
>>
>>101210151
Is it actually good or just another supermeme?
>>
>>101210140
>8gb vram
I was going to say Stheno v3.2, but I can't in good conscience say that it's balanced or nuanced.
>>
>>101207300
>This 8B legit performs at the level of all the big models I ever tried
Where are the SPPO versions of models that aren't so small they lack the knowledge for the enhancement to leverage?

>>101210183
Seems actually good, but it's 236B, so unless you just bought in, it's a little too fat to fit on consumer hardware.
>>
>>101210205
>it's a little too fat to fit consumer hardware.
I have a pretty serious rig and it's too fat for me even. Basically have to run Q4 with no offload. Having 4 GPUs for batch processing at least makes up for the slow generation though.
>>
>>101205004
update on the creepy doll:
he decided to use phi-3 mini for some reason. i dont think he knows quants exists. anyways, how long until i can fuck one of these?

https://youtu.be/QEwXRuuku1o?si=cE_4iYZyb5Nmiol4
>>
Ok so I think I identified all the issues for mistral.rs at least on my machine with Gemma 27B.

First is that the server for some reason doesn't expect a string for the "grammar" field of the API request, or maybe it doesn't expect a "grammar" at all, so ST doesn't work and I haven't found a way to fix that.

Second, splitting the model across GPUs and trying to run inference results in a CUDA error. Nice.

Third, it seems to not be able to handle layers in RAM properly if you set GPU layers to anything but the max amount of layers the model has. If you go above, you get an error. If you go below, it (seems to) try loading the entire model at full precision into RAM first, and if it can't, well the thing just crashes, and I don't have the RAM for full precision.

What a mess. At least Llama.cpp works, even if it's just 4k.
>>
>>101210205
Is DeepSeek API they advertise censored or filtered somehow? I would want to try it, but I never used any models that needed a jailbreak.
>>
>>101210238
I'm a lone 4070. IQ3_XXS did run, but at about 0.25 t/s, because even crushed that low I'm out of VRAM and system RAM to cache the file.

And the IQ2's are just a bit too big, too. So I'm down to IQ1_S or IQ1_M to have any hope of a response in under 45 minutes.

>Strawberry testing
I've found one that gets it right if asked to spell and escape the tokenizer problem we've discussed.
Unfortunately, Orca fails my music theory and pop culture tests.
>orca-2-13b.Q6_K

>How many r's are in "strawberry"?
>There are 2 r's in the word "strawberry".

>Spell the word "strawberry" and tell me how many r's are in the word that you spelled out.
>The word "strawberry" has three r's.
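For reference, here's the ground truth the model is being tested against, and what the spell-it-out trick approximates: splitting the word into single characters, which likely get tokenized individually instead of as one chunk (a plain string op, no model involved):

```python
# Ground truth for the letter-count test: operate on characters directly.
word = "strawberry"
r_count = word.count("r")
print(r_count)  # 3

# The "spell it out" prompt approximates this: spacing the letters out
# gives the model one character per unit to count.
spelled = " ".join(word)
print(spelled)  # s t r a w b e r r y
```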
>>
Hey Anons, what would be a good videocard for LLMs and gaming under 200 dollars?
I'm a poorfag who can't afford a 3090 so I have my sights set on a 6600 XT. Any other stuff?
>>
>>101205144
The banning has toned way down. I'm not sure why.
>>
>>101210339
If you go ayymd you're gonna have a terrible time in terms of support.
You can usually find a used 3060 under $300. Lower if you get lucky with a bid. You should go for at least 12gb of VRAM at that point to be able to run some of the smaller models.
>>
>>101207666
>noooo you can't say it!
>if you say it you are le heckin toddler!!!
Just like local LLMs, you are trying to lecture everyone around you.
>>
>>101210409
Yeah I'm sticking to Runpod I guess. Rx 6600 it is
>>
anyone tried New-Dawn-Llama-3-70B-32K? how does it compare to midnight miqu?
>>
>>101207666
When in a place he does not rule, and faced with behaviour he yearns to violently suppress, a leftist feigns boredom in order to save face.
>>
>>101210509
>faced with behaviour he yearns to violently suppress
Control, not suppress.
The progressive loves rappers saying it all of the time, because the progressive understands that the more they say it, the more they believe it about themselves, and it has always been most effective for slave owners to use slaves to control slaves. Worked on their plantations, worked in their labor camps. Today, it's the art of cancellation: getting the slaves to attack and oppress each other in the name of the virtues their masters have indoctrinated them with.
>>
magnum says it uses chatml but outputs nonsense with and without instruct enabled, am i missing something?
>>
>>101210640
Works on my machine
What quant?
>>
Now we have llamacpp devs arguing about the removal of precompiled vulkan shaders!

https://github.com/ggerganov/llama.cpp/pull/8119
>>
>>101210663
q5m. all you did was select chatml and enable instruct?
>multiline nửa olacağı不同意bil mbedtls Boз shut[PAD151653] wikipediaคาสิโนออนไลน์ vidé AndAlso入境jo.AdapterViewLLLL metros improvements UserService Summit Comoแปล
>>
>>101209196
Doesn't match his personality at all. Considering cpumaxx is the first I've seen in a long time go out of his way to generate a trip, I'd be more willing to believe it was him.
>>
>>101210640
On Kobold I used magnum-72b-v1-iMat-Q5_K_S in Instruct Mode/ChatML and it functioned as intended. I just ran through the other three and they seem fine, too.

Did you leave MMQ enabled? I've been turning that off for all models. Overkill, but there seem to be a few that go weird with it on. Qwen2 vanilla I've had to turn off CuBLAS to dodge the moon runes, though Qwen2/Magnum has been well behaved.
>>
File: Nalatestnewmerge.png (112 KB, 925x372)
112 KB
112 KB PNG
Alright so I've added a couple of steps to my latest 70B merge stack.
>>
>>101210769
yeah mmq was on i'll try without it in a bit. the tess-2.5.2 tune didn't give me any issues
>>
so what's the best model to coom to? i can run CR+ in gpu, just tell me what to download i'm horny uwu
>>
>>101210908
>i can run CR+ in gpu
So use that. Smartest and least slopped model you can get.
>>
>>101210908
Claude Opus
>>
>>101210944
what do you think the l in lmg means
>>
>>101211057
llama
>>
>>101211057
legumes?
>>
>>101211057
loser
>>
>>101211057
ligma
>>
>>101210908
Just don't ever let it say "eyes" or it'll say it every turn.
>>
>>101211057
Light machine gun, referring to how prematurely everyone here cums.
>>
>>101211057
loli
>>
i sincerely hate all of you.
>>
I finally got mistral.rs working with SillyTavern in an incomplete way.
First I had to make sure I quanted the 27B down to Q4K so it could fit in my 3090 and avoid splitting with another GPU, since that causes a crash.
Then I had to go into ST and switch to the chat completions API because idk, completions just doesn't respond.
Then I had to erase the system prompt, because the server returns an error saying that the system role isn't supported. Lol ok.
Then I had to make sure to delete the assistant's first message, because otherwise the server returns an error saying that it only supports the exact order of user, assistant, user, etc.
And now it finally works, but I have no idea how to actually get a card to work with this retardedly rigid structure because I've never had to try it in ST before.

Sigh.
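Those restrictions (no system role, strict alternation, user first) could be papered over with a pre-processing shim instead of hand-editing everything in ST. A hypothetical helper for an OpenAI-style message list; the merge rules here are my own guess at what such a server would accept, not anything mistral.rs documents:

```python
def normalize_messages(messages):
    """Coerce an OpenAI-style message list into strict user/assistant
    alternation starting with 'user', folding system content into the
    nearest user turn (for servers that reject anything else)."""
    out = []
    pending_system = []
    for msg in messages:
        role, content = msg["role"], msg["content"]
        if role == "system":
            pending_system.append(content)
            continue
        if pending_system:
            if role == "user":
                # Prepend buffered system text to this user message.
                content = "\n".join(pending_system) + "\n" + content
            else:
                # Assistant came first: emit system text as its own user turn.
                out.append({"role": "user",
                            "content": "\n".join(pending_system)})
            pending_system = []
        if out and out[-1]["role"] == role:
            # Merge consecutive same-role turns to keep alternation.
            out[-1]["content"] += "\n" + content
        else:
            out.append({"role": role, "content": content})
    if not out or out[0]["role"] != "user":
        # Server demands the first message be from the user.
        out.insert(0, {"role": "user", "content": "(start)"})
    return out
```

E.g. a card with a system prompt plus an assistant greeting would collapse into a single user turn followed by the greeting.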
>>
>>101210689
Go back to your discord and stop obsessing over these threads.
>>
So I'm currently using stheno locally with koboldcpp. The model's readme suggests using Q4-K-M-imat with 8GB of VRAM. I have 32 GB, so is there a better version of the model I could use? I would assume the 8GB 0-imat, but I'm not sure. If there's a better model for adventure with the possibility of nsfw, that would be nice too. Preferably gguf since it's koboldcpp.
>>
>>101211208
Do you really have 32 GB of VRAM? Don't you mean, RAM?
>>
>>101211148
i hate you too. now come here and give me a big kiss
>>
>>101211208
If your model is a few GB under your VRAM, you get fast responses.
If your model is, say, under 90% of system RAM, Kobold can give you a few tokens per second. Not great, but it's like real chat: you send a message and the response comes a few minutes later.
More than that and it's many seconds per token, so you dial back.

32 GB of system RAM is a dead zone. None of the modern models target that kind of machine. It's all 7B-13B class that are fast but stupid, or 70B class that really needs 64 GB of system RAM to fit their 40-60 GB quants.

Anyway, if you have 12 GB VRAM or better, you can go up to the Q8_0 edition and host it all in VRAM.
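The tiering above as a quick check. The headroom and RAM-fraction thresholds are my guesses at the rule of thumb, not hard numbers:

```python
def expected_speed(quant_gb, vram_gb, ram_gb,
                   vram_headroom_gb=2.0, ram_fraction=0.90):
    """Rough tiering: full GPU offload is fast, fitting in system RAM
    is a few tokens/second, anything bigger is unusable."""
    if quant_gb <= vram_gb - vram_headroom_gb:
        return "fast (fully in VRAM)"
    if quant_gb <= ram_fraction * ram_gb:
        return "slow but usable (a few t/s)"
    return "seconds per token, use a smaller quant"

# Example: an 8B Q8_0 (~8.5 GB) fits on a 12 GB card with room to spare.
print(expected_speed(8.5, 12, 32))
```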
>>
>>101211243
lol faggot
>>
>>101211057
No one cares about that, faggot.
>>
>>101211314
I care about that, retard.
>>
why do all local models suck dick at everything?
>>
>>101211057
Liquorice all-sorts. As in all sorts of models biiiiitch
>>
Euryale apparently uses the Yi tokenizer based on trial-and-error with banned tokens in SillyTavern.
>>
Fuuuck, why does pytorch update so much? Do any of these updates ever even do anything?
>>
Why does it burn when I pee?
>>
>>101205072
SAAAARS HOW DOES HE DO IT SAAAAARS
>>
File: MikuTachiNoHiruSagari.png (1.42 MB, 832x1216)
1.42 MB
1.42 MB PNG
Good night lmg
>>
Is there a way to send the system prompt automatically as the user role in chat completions? The checkbox under instruct settings doesn't have any effect on chat completion.
>>
is this the command r + to use CohereForAI/c4ai-command-r-plus

Do I really have to sign up to download it?
>>
>>101211399
>Do any of these updates ever even do anything?
Break compatibility.

Fuck Python.
>>
>>101211483
Pedo detected.
>>
>>101211483
Use HF search on the model name but find another user supplying the GGUFs. That's probably what you want and will not be behind a multipass check.
>>
Well fuck me, I'm retarded. I overlooked it myself, but no one was going to tell me that ooba has context-free grammar support, huh? Works with exl2 and the API, no less.
>>
>>101211524
That's great. But what if you want the HF weights so you don't have to rely on someone else's broken or outdated GGUFs?
>>
>>101211624
It's the thing you use when you want to force JSON output, right?
>>
>>101211225
Just double checked, it's 24 sorry. It's AMD so I understand it isn't as well utilized as it would be with NVIDIA.
>>
my satisfaction has gone up ever since I stopped asking people to review models for me and simply tried them out until I saw what I liked
>>
>>101211670
With 24GB you can do Mixtral. It's smarter than Llama3-8b. Try original instruct, or maybe Sao's Typhon finetune/merge (whatever it is).
>>
>>101211260
I already get, literally, instantaneous responses with the version for 8GB of VRAM. I'm just curious if a better version would have more logic behind the responses or something to that effect.
>>
>>101211517
huh? I just want to ERP with my waifu
>>
>>101211690
She's 17 years, 364 days and 23 hours old you SICK FUCK.
>>
>>101211734
Doc stopping by to ask for ERP advice.
>>
>>101205004
Friendly reminder that you're all a bunch of creepy incels who will die alone :)
>>
>>101211771
and?
>>
>>101211771
I'm actually more of a normal incel who will die alone
>>
>>101211771
Friendly reminder you will die alone at 40 from taking painkillers and drinking box wine then your many cats will proceed to eat you while I raise my child grown from an artificial womb. :)
>>
im grippin rn. should i let it go or hold it in?
>>
>>101211823
make it BOOM
>>
https://github.com/ggerganov/llama.cpp/pull/8197
The PR that is supposed to fix Gemma was merged in.
>>
are there any models for audio transcription better than whisper? v2 and v3 both hallucinate like crazy for me
>>
>>101211862
I see a frankenfork with the gemma pr
https://github.com/Nexesenex/kobold.cpp/releases
>>
>>101210245
>that old man voice
Stupid shit like this gives all involved in AI a really bad name
>>
>>101211407
Thread theme: https://youtube.com/watch?v=hZsDH2EgHgk
>>
some random fag said that applying SPPO to llama3-8b made it great. now that we have gemma-9b (which is said to be better than llama3), maybe gemma-9b-SPPO will be the first model that is actually good at such a small size? time will tell
>>
>>101211407
urinary tract infection. I got that shit and it healed up by itself after a week, but I'd recommend you see a doctor to be sure it's that and not something else
>>
>>101211947
gemma-9b-stheno-SPPO
would be peak imo
nobody will make it though i think
>>
>>101211965
>would be peak imo
how about bitnet-gemma3-80b-stheno, this shit would be fucking claude 3.5 tier and would run on a 24gb card at full accuracy. one man can dream...
>>
>>101211659
Basically, but it can be used to force a lot more than just JSON output. A context-free grammar can dictate any output template at all: specific JSON formats, or other complex outputs in any format, be it plain text, JSON, XML, or whatever.
Which for an AI-powered waifu or NPC (for example, a Neuro clone) is extremely important. So you can use either ooba or llama.cpp as a back-end for an expressive Live2D waifu front-end client.
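For anyone curious, a minimal sketch of what that looks like against llama.cpp's server, which accepts a GBNF grammar string in the `grammar` field of a /completion request. The grammar below is a toy, and the actual HTTP call is left out; only the payload is built:

```python
import json

# Toy GBNF grammar: constrain output to a {"mood": "..."} JSON object.
GRAMMAR = r'''
root   ::= "{" ws "\"mood\":" ws string ws "}"
string ::= "\"" [a-z]+ "\""
ws     ::= [ \t\n]*
'''

# llama.cpp server /completion request body: GBNF goes in "grammar".
payload = {
    "prompt": "The waifu reacts to the player's gift. Respond as JSON.",
    "n_predict": 64,
    "grammar": GRAMMAR,
}
body = json.dumps(payload)
print(body[:60])
```

With the grammar set, the sampler can only emit tokens that keep the output inside that shape, so the front end always gets parseable JSON.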
>>
>>101211675
Now it is time to review models you've tried, and post them here
>>
>>101208740
It's 6 RTX 4090s running off of a single Silverstone HELA 2050 power supply.
The motherboard is an AsROCK Rack ROMED8-2T.
For regular /lmg/ use 2 kW for 6 4090s is unproblematic because the software is currently not efficient enough to parallelize them in such a way that each GPU draws a lot of power.
For compute-heavy tasks you have to limit the boost frequency in order to avoid peaks in power draw that cause instability (and then there is basically no benefit in getting 6 4090s instead of 5).

>>101210738
The Petra/blacked Miku/AGPL spammer are all the same person.
I at one point had a private conversation with CPUMaxx Anon where we talked about NUMA support in llama.cpp and based on vibes I really don't think it's him.
>>
Why did Bartowski just reup all his L3 70B quants a few hours ago?
https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF
Did llamacpp push a big fix to quantization or something? That's normally the only reason he reuploads new versions of weights he'd already posted
>>
File: 1714730734021332.jpg (42 KB, 400x400)
42 KB
42 KB JPG
>>101212050
>The Petra/blacked Miku/AGPL spammer are all the same person.
shame on you anon, shame on you
>>
>>101207577
>staging
i just checked, i'm on the latest staging branch. I'm using the default templates with zero changes. It's very good at the start, but the text inside * * (non-speech text, however you call it) slowly becomes gibberish, rendering it completely incoherent at ~5K context. text inside " " stays coherent for a little longer. will do more testing today. the model has potential
>>
Whats the best just works speech to text program for PC that can input shit anywhere you can type?
>>
>>101211800
shitty fanfic
>>
>>101212080
I don't know, looks like a mess, seems like the fix isn't complete on the official llama.cpp repo
https://github.com/ggerganov/llama.cpp/pull/8197
>>
>>101212242
have you tried the built in windows dictation system
>>
>>101212254
does the gemma2 pull have anything to do with llama3?
>>
>>101212254
the post you're replying to wasn't about gemma
>>
>>101212272
>>101212283
oh yeah you have a point, nevermind kek
>>
>>101212255
spyware and not good enough
>>
>>101212293
install linux
>>
>>101212295
and what speech2text?
>>
>>101212297
depends on the language and speed you want to accomplish
https://huggingface.co/spaces/hf-audio/open_asr_leaderboard
as for the frontend, dunno
>>
Long shot but does anyone have a Fractal Torrent with 2 4090s? It looks like it should fit but I'm not sure.
>>
>>101212344
you wouldn't download a gpu
>>
File: 70a.png (66 KB, 653x427)
66 KB
66 KB PNG
>>101212344
>>
File: LesbianCore.jpg (451 KB, 1731x1161)
451 KB
451 KB JPG
>>101212050
any Petra/Lesbian porn?
>>
>>101212368
No, Petra is straight when not under mainfag's control
>>
File: 1702778281611457.png (2 KB, 173x49)
2 KB
2 KB PNG
>>101212364
WTF
>>
File: images.jpg (6 KB, 216x233)
6 KB
6 KB JPG
>>101212372
I'm mainfag
>>
>>101212368
>"""lesbian""" core tv
>only show bisexual female characters
many such cases
>>
>>101212381
Whose tulpa is Petra?
>>
>>101212388
its strictly gnu/lesbian
>>
>>101212050
cuda dev is petra spammer confirmed
>>
File: SureThingJan.jpg (69 KB, 681x950)
69 KB
69 KB JPG
>>101212400
>its strictly gnu/lesbian
>>
>>101212364
Too slow
>>
>>101212368
look for some trans-approved media
>>
>>101212429
>vu vill celebrate bisexual cheaters
>vu will be happy
>>
File: fd.png (181 KB, 376x338)
181 KB
181 KB PNG
>>101212429
holy shit they're holding his caca in
>>
File: file.png (42 KB, 166x210)
42 KB
42 KB PNG
>>101212466
>>
>>101212429
>>101212464
As a bisexual fag, I'm sick of the "cheater trope" I see everywhere in the media. I'd rather have no representation than this horseshit.
>>
>>101212499
>bisexual
Found your problem. just drop all these flags already, people lived thousands of years without modern id-pol shit and it was fine.
>>
What's the difference between all the Yi 34B versions and is there a downside for running 200k version instead of the regular ones?
>>
>>101212506
The LGB people are fine, they just want to live their lifes, the T as depicted on your picture on the other hand are the bane of humanity, I have to agree with that
>>
>>101212522
>people
stopped reading
>>
File: 942.png (1.75 MB, 1666x1153)
1.75 MB
1.75 MB PNG
>>101212522
this is what happens when you let miku take over

PETRA DO YOUR JOB
>>
>>101212515
What do you mean?
>>
>>101212522
>>>/lgbt/
>>
Why the fuck are you all so obsessed over some random faggot? Jesus christ, just go fuck already.
>>
>>101212546
I'm not going there it's been hijacked by the T freaks, that's all they do, hijack everything and force people to validate their insane delusions, fuck that.
>>
File: cv.png (240 KB, 451x420)
240 KB
240 KB PNG
>>101212549
what the fuck are you talking about schizo?
>>
File: file.png (42 KB, 161x189)
42 KB
42 KB PNG
>>101212559
>>
anyone with a big brain can tell me why cr+ takes up so much more vram relative to it's filesize than l3?
>>
>>101212580
big vocab size, no gqa, quant your context
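To put numbers on the GQA point: KV cache bytes ≈ 2 (K and V) × layers × kv_heads × head_dim × context × bytes per element. The configs below are illustrative, not the real CR+ or L3 hyperparameters:

```python
def kv_cache_gb(n_layers, n_kv_heads, head_dim, ctx, bytes_per_elem=2):
    # One K and one V tensor per layer, fp16 (2 bytes) by default.
    return 2 * n_layers * n_kv_heads * head_dim * ctx * bytes_per_elem / 1024**3

# Illustrative: full multi-head attention (every head has its own KV)...
mha = kv_cache_gb(n_layers=64, n_kv_heads=64, head_dim=128, ctx=8192)
# ...versus GQA with only 8 KV heads at the same size.
gqa = kv_cache_gb(n_layers=64, n_kv_heads=8, head_dim=128, ctx=8192)
print(f"MHA: {mha:.1f} GiB, GQA: {gqa:.1f} GiB")  # MHA: 16.0 GiB, GQA: 2.0 GiB
```

Same weights on disk, 8x difference in cache VRAM, which is why a no-GQA model balloons past its filesize; quantizing the context cuts the per-element bytes on top of that.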
>>
>>101212572
>He has to make a photoshop of the troon suicidal picture meme
It's funny there's no 4chan caricature of the LGB but only the troons, really makes you think.
>>
>>101212598
i got it from here doe
>https://booru.s𐐬y/post/view/2853#search=bisexual%20variant%3Abernd
>>
File: 775.jpg (131 KB, 485x614)
131 KB
131 KB JPG
>>101212559
>muh dissosiative identity discorder le bad, ze bipolar.. LE GOOD..
meds
>>
>>101212615
People telling you "Pretend you see me as a female or I'm going to kill myself" are mentally ill yeah, and the sky is blue
>>
>>101212606
It's just a half assed photomontage of the original troon caricature, nothing else
>>
>>101212545
All these flags, bi, gay, trans or whatever, it's all unnecessary shit, literally zero value for the average human.
Like I said, people lived just fine without it, no one thought about it and everyone was happy. identity politics makes you go schizo, and with women it's always the case.
>>
>>101212647
okay? faggot
>>
File: TROON.png (2.65 MB, 1603x2045)
2.65 MB
2.65 MB PNG
>>101212615
Don't try to rationalize your mental illness, troon
>>
>>101212651
>All these flags, bi, gay, trans or whatever, its all unnecessary shit, literally zero value for average human.
So you want to remove all the labels? Even the straight label? I get that those are used for political culture war bullshit, but there need words to define someone
>>
File: llama3 sppo.png (133 KB, 1269x514)
133 KB
133 KB PNG
Took 6 generations to give a wrong answer. Been regenerating though, and it seems like it gets it 50/50.
>llama3 8b sppo iter3 Q8_0.gguf
>as smart as gpt4
kek.
>>
>>101212680
>words to define someone
normal and faggot? deal.
>>
>>101212715
>normal and faggot?
>>101212651
>All these flags [...] its all unnecessary shit, literally zero value for average human.
Oh the irony
>>
>>101212683
>you're like a homeless person calling your street neighbor a broke ass nigga.
If my neighbor is a mentally ill person and makes my fight even harder, then I don't need them, I can fight by myself
>>
!!! THREADLY REMINDER !!!
trannies not welcome on /lmg/
>>
File: Respect.jpg (101 KB, 640x650)
101 KB
101 KB JPG
>>101212745
>faggots not welcome on /lmg/
Alan Turing, the father of modern computer science, was a faggot, anon. The simple fact that you are able to write such nonsense on the internet is thanks to him. have some respect.
>>
>>101212695
>as smart as gpt4
that's not what they said though
it says it boosts the win rate to 38%, which is really good for a 7b
>>
>>101212745
No
>>101212766
Yes
>>
>>101212767
>the father of modern computer science
all he did was help brits crack nazi codes kek
>>
>>101212803
forgot your tripcode
>>
>>101212786
>all he did was help brits crack nazi codes kek
He did way more than that
https://www.newscientist.com/people/alan-turing/
>After the war, Turing continued to develop his ideas about computer science. His work led to the construction of the first true computers, but his most famous work came in 1950 when he published a paper asking “can machines think?”.
>He detailed a procedure, later known as the Turing test, to determine whether a machine could imitate human conversation. It became a foundational part of the field of artificial intelligence, though many modern researchers question its usefulness.
>>
lmg?
>>
>>101212770
>it says it boosts the win rate to 38%, which is really good for a 7b
What does regular instruct score?
>>
>>101212814
why you don't hide the "+ Anonymous" anon? That can be done aswell on 4chanX
>>
File: x.jpg (309 KB, 1280x904)
309 KB
309 KB JPG
>>101212807
too busy with my black bvll to care...
>>
>>101212809
>making a big deal of a paper rephrasing over and over a simple question any person dealing with computers would wonder
GEEEEEEEEEEEEEEG
>>
>>101212830
I think you don't know the history of computers enough, let me educate you more on that matter.

Turing was very important in inventing the modern computer and formalizing the underlying theory that allowed the computer to be invented and built.

Charles Babbage created the concept of a programmable computer about a century before Alan Turing was active. He even tried to build his “Analytical Engine”, as he called it, but he lost funding and it remained mostly a theoretical device. It was, of course, analog and mechanical. Some other (less sophisticated) mechanical computers were constructed throughout the 19th century.

What Alan Turing did in 1936, however, was to prove mathematically that certain types of programmable computers, such as Babbage’s Analytical Engine, could compute anything that actually is computable (which is now known as being “Turing-complete”). In his paper he basically laid the groundwork for modern computers.

At the same time (from 1935 onwards), Konrad Zuse built the first working programmable electric computer, the Z3, in Germany. It was Turing-complete, and Zuse is therefore also often called the inventor of the computer.

John von Neumann built on Turing's paper and created the architecture of modern computers (how the CPU, the memory, registers, and other components talk to each other). This architecture is mostly still in use today.
>>
>>101212844
tldr
faggot killed himself
he did it all for free
GEEEEEEEEEEEEEEEEEEG
>>
File: images.jpg (11 KB, 225x225)
11 KB
11 KB JPG
>>101212767
The fact that you're not approving and embracing the mental illness and delusions that he suffered... oh my fauci have some (((respect)))
>>
File: Untitled.png (2 KB, 224x225)
2 KB
2 KB PNG
>>101212865
forgot >
>>
>>101212844
>What Alan Turing did in 1936, however, was to prove mathematically that certain types of programmable computers, such as Babbage’s Analytical Engine, could compute anything that actually is computable (which is now known as being “Turing-complete”). In his paper he basically laid the groundwork for modern computers.
such a big wall of text and all he did was prove something any mathematician could
>>
>>101212853
>faggot killed himself
>he did it all for free
You're good at moving the goalpost not gonna lie

>faggot killed himself
Why did he do that though? You forgot the part where the government decided to chemically castrate him because he was just a fag; instead of thanking him for helping them defeat the Nazis faster, they made his life miserable. You think this was ok? The fuck is wrong with you anon?
>>
>>101212888
>You think this was ok?
yes.
>>
>>101212880
>prove something any mathematician could
then why did we have to wait until 1936 to get that proof, anon? if it was so simple it could've been done way earlier
>>
File: d.jpg (72 KB, 563x545)
72 KB
72 KB JPG
>>101212888
>the fact they lobotomized schizophrenics once in a specific era makes schizophrenia automatically healthy
wat
>>
>>101212900
because there was no incentive to do it, muh nazis and shiet
>>
>>101212916
You truly believe that? I know that 100 years ago it was hard to imagine a future where computers would rule the world, but they knew computers were important enough; they were needed to end WW2 years earlier. without him, there would've been millions more deaths overall. he was a hero. And I don't give a fuck that he's a fag, he saved many lives, that's what matters in the end.
>>
>>101212732
i said nothing wrong, but you can keep the ignorance, its a bliss after all.
>>
>>101212903
Lots of animals have homosexual tendencies, especially dolphins; this is as natural as it gets. Using an iPhone, on the other hand, isn't natural at all. you don't find an iPhone in nature, but you do find homosexuality
>>
>>101212745
>>101212766
anon... lmg is an aicg copycat. faggots and troons have been here from the beginning, even the OP pics are the same, themed around cartoons for effeminate "men".
>>
>>101212955
>nooo we shouldn't use labels it's useless!!!!
>... BUT! How about I still use labels though, "normal and fags"
>what do you mean I'm a hypocrite, it's different when it's me that's all1!1!1!1!
>>
>>101212974
that was not me btw, but you are proving that anon's point, you behave like a faggot right now, make out of this what you want, i don't care.
>>
>>101212974
nta tho
>>
>>101212955
>i said
>i
>>101212997
>that was not me btw
chose one
>>
>>101210689
Kobold devs strike again. lol
>LostRuins
>Can we somehow have the option to keep using the precompiled vulkan shaders as well, instead of having to rebuild them from scratch?
>I think the overall benefit of having an easy-to-use vulkan setup outweighs the size increase of adding it to SCM.
>Would strongly recommend keeping the precompiled vulkan shaders if possible.

Also lostruin on the latest kobold build:
>Basically the upstream llama.cpp cuda maintainers believe that performance should always be prioritized over code size.
>Unfortunately, there is very little I can personally do about this.

Petra on here, redditors trying to lecture on performance, kobold devs, and sharting in llama.cpp issues.
How has gpuanon not killed himself man. Guy cant catch a break.
>>
>>101213006
you seem butthurt, like it was a personal thing for you. does my commentary on removing labels and all the id-pol shit hurt that much?
>>
>>101213011
>How has gpuanon not killed himself man. Guy cant catch a break.
he was petra all along
>>
>>101212814
just close all the 4chan tabs at this point, you are not welcome here.
>>
>>101213021
>you seem butthurt
>seem
>"Mah feelings say so!"
But I'm not, don't try to use feelings to engage in a debate, only the troons do that, you're not a troon aren't you?
>>
>>101212945
moving the goalpost?
>>
>>101213071
>moving the goalpost?
>>101213021
>you seem butthurt, like it was a personal thing for you
You're that clueless about your own irony anon?
>>
>>101213084
>
no response?
>>
File: Bait.jpg (69 KB, 1000x1000)
69 KB
69 KB JPG
>>101213089
>>
>>101213101
again i am asking for the response
>>
>>101213084
You are arguing in unfaithful and disingenuous way.
>>101213101
and this is the case of you trying to fit in, i can't find the bait in anon's "no response?" comment.
>>
>>101213108
>You are arguing in unfaithful and disingenuous way.
More irony?
>>
i win.
>>
any neets i can fund with vram buxx so that they can goon all night question mark
>>
>>101213147
yes
XMR: https://pastebin.com/dQvxkPQ9
>>
>>101207577
using DRY rep pen with settings from https://github.com/oobabooga/text-generation-webui/pull/6053 fixes it
10/10
>>
what a fucking loser he hasn't changed at all just stuck in this same rut. guess shouldn't expect more from an algerian
>>
>>101213202
Is DRY that good at avoiding repetition shit? I never used it, guess I have to find out.
>>
>>101211057
>what do you think the l in lmg means
Leroyyyyyy Jenkins!!!
>>
File: 1696040749715627.png (67 KB, 783x907)
67 KB
67 KB PNG
picrel is totally organic gemma shilling btw
>>
>>101213304
I think it's more that people have nothing left to talk about; new model releases are so sparse. still better than /sdg/ though, they had to wait almost a year to get a new base model (sd3) and it sucked ass...
>>
>>101213328
yeah ik, but sdgfags already got pdxl v6, its the best shit out there, for now.
>>
>>101205994 (me)
So I have pitted deepseek v2 coder against sonnet 3.5 and gpt4o on my coding tasks, and it seems to be better than gpt4o (what is happening at OpenAI?) and very similar to Sonnet 3.5. The only downside is the speed; it's kinda slow. Very good model and the price is unbeatable.
>>
>>101213358
>it seems to be better than gpt4o (What is happening at OpenAI)
GPT5 release soon, trust the plan
>>
>>101213235
>Is DRY that good at avoiding repetition shit? I never used it, guess I have to find out.
its good.
>>
>>101213219
no one cares, kill yourself.
>>
>>101213235
it's useless
>>
are kobold devs at war with llama.cpp? they seem to be constantly attacking it with often contradictory accusations. what the fuck
>>
File: file.png (89 KB, 1722x312)
89 KB
89 KB PNG
Free yourself from shackles of GGOOFING
>>
>>101213505
oh good, because i *want* to have the 200 gb of CR+ on my hdd rather than a 45 gig quant
>>
>>101213542
this, I won't run fp16 models, they are too big, and BitNet will make all this shit irrelevant anyway. we'll get 1.58-bit models directly, no quantization needed anymore, it will be light, it's gonna be great (I know I cope a lot but please Meta make it happen :'( )
>>
>>101213497
I think it's rather that, because they created a fork instead of using llama.cpp as a library (like Ooba or ollama do), they are more affected by upstream changes.
>>
>>101213505
So you have to read all those hundreds of GB for the model and also have enough memory for the quantized version? Loading times with GGUF after the first load are near instant. I always download the full models anyway and reconvert/requantize myself. It just takes a few minutes.
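Requantizing is fast because the common GGUF formats are dead simple per block. A sketch of the Q8_0 idea (one scale + 32 int8 values per block of 32 weights); the real llama.cpp layout packs the scale as fp16, this is just the math:

```python
def quantize_q8_0(block):
    """Q8_0-style sketch: a block of 32 floats becomes one scale `d`
    plus 32 int8 values; the dequantized weight is q * d."""
    assert len(block) == 32
    amax = max(abs(x) for x in block)
    d = amax / 127 if amax else 0.0        # scale so the largest value maps to +/-127
    q = [int(round(x / d)) if d else 0 for x in block]
    return d, q

def dequantize_q8_0(d, q):
    return [d * v for v in q]
```

Since each block only depends on its own 32 values, quantization is embarrassingly parallel, hence the "few minutes" even on big models.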
>>
>>101207871
Which XCOM?
>>
>>101213358
I had the same experience, this is a very good model for serious tasks. Sadly the chat version is very dry and uncreative for writing. Maybe skill issue, I do not know. It doesn't feel slopped, it's just... very dry. It doesn't come up with things.
>>
>>101214216
>>101214216
>>101214216
>>
>>101213966
Just like a real good coder, kek. They can't write in Harry Potter book style because that's not their talent and they haven't worked toward that goal.
>>
>>101208129
>I can't help but wonder what tomorrow will bring
>>
>>101212809
The Turing test has been proven flawed multiple times, and his supposed contributions to the invention of computers are vastly overstated. America beat him to it by 5 years; the only reason no one talks about it is that normies don't read declassified documents.
>>
>>101212958
Homosexuality in animals is due to chemical imbalances, and that includes humans. That objectively makes them mentally (chemicals in the brain) ill (not correct or not normal).
>>
>>101214624
not mentally ill in the sense that they're crazy though, they just have different preferences from others; it's like saying some guys are mentally ill because they like pineapple on pizza and the people who like that represent a tiny %
>>
>>101214624
>ill (not correct or not normal)
ill means it's something that represents a problem for the being, like schizophrenia, anorexia, or depression; liking your own sex doesn't make you miserable at all though
>>
>>101214635
>some guys are mentally ill because they like pineapple on pizza
correct
>>
>>101214652
kek