/g/ - Technology




File: 1704177621942217.jpg (816 KB, 1856x2464)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102674638 & >>102663772

►News
>(09/27) Emu3, next-token prediction multimodal models: https://hf.co/collections/BAAI/emu3-66f4e64f70850ff358a2e60f
>(09/25) Multimodal Llama 3.2 released: https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices
>(09/25) Molmo: Multimodal models based on OLMo, OLMoE, and Qwen-72B: https://molmo.allenai.org/blog
>(09/24) Llama-3.1-70B-instruct distilled to 51B: https://hf.co/nvidia/Llama-3_1-Nemotron-51B-Instruct
>(09/18) Qwen 2.5 released, trained on 18 trillion token dataset: https://qwenlm.github.io/blog/qwen2.5

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
►Recent Highlights from the Previous Thread: >>102674638

--Paper: TPI-LLM paper on serving 70B-scale LLMs efficiently on low-resource edge devices:
>102676172 >102676282
--Papers:
>102675688 >102675714 >102675846
--Translating old obscure anime with whisper and LLMs
>102678403 >102678410
--Meta Movie Gen's potential impact on Hollywood and creative industries:
>102680179 >102680316 >102682588 >102682595 >102682619 >102682651 >102682694 >102687486 >102684519 >102682581 >102682633 >102683219 >102683304 >102682611 >102682976
--Hyperdimensional Computing Neural Network claims to be a transformers killer:
>102684795 >102684879 >102684914 >102684930 >102685084
--Try undistilled Flux model for regular CFG:
>102683909 >102683932
--Improving LLM adaptability and continuity with thesaurus models, RAG, and control vectors:
>102674702 >102674816 >102674914 >102674925 >102675017 >102675059 >102675145 >102675203 >102675435 >102675101 >102675233 >102675321 >102675387 >102675442 >102675153 >102675257 >102675334 >102675401 >102675482
--Discussion on training an AI model for RP and the importance of sampling techniques:
>102674687 >102674814 >102674997 >102675190
--Defining and measuring creativity in AI models:
>102674668
--Anon gets help optimizing Mistral-Nemo-Instruct-2407 model performance on GeForce 4070ti Super:
>102685896 >102685920 >102685924 >102685961 >102685989 >102686011 >102686044 >102685946 >102686311 >102686493 >102686653 >102686678 >102687787 >102686322
--Uncomfortable truths and model censorship:
>102675549 >102675604 >102675744 >102676009 >102676090 >102675656 >102675764 >102675867 >102678354 >102678597
--Anon is developing a bot that can control the desktop and interact with various platforms:
>102679934 >102679994 >102680011 >102680026 >102680066 >102680263
--Miku (free space):
>102684217 >102687353 >102688814

►Recent Highlight Posts from the Previous Thread: >>102674646

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
AI SEX
>>
File: 32 Days Until November 5.png (2.23 MB, 1008x1616)
>>
>>102688933
Yeah see, I don't want to deal with a brain tumor implanted into an LLM by dancing around it in hopes that it doesn't activate. I'd rather my models not have the brain tumor in the first place. Unfortunately, being trained on OpenAI's closed models means inheriting their base prompt, which is chock full of stuff like "insert PoC absolutely everywhere even if the user didn't ask", hence the black Hitler. For the OpenAI model that's just a prompt, which can simply be changed or even bypassed with a counter-prompt; for the distilled model it's a behavior vector embedded directly into its core.
>>
>>102688967
the point of that link was that as long as you prefill with what you want, or even do something like edit the response to say 'Sure!' first, you easily get around any 'safeguards'. you don't need any special model to make it say nigger
>>
>using llama through modeling_llama without all of the other bloat through transformers
>KV caching literally does not work, can't figure out how to make it work
AAAAAAAAAAA
>>
>>102688996
Nigga come on this doesn't really works. Plus, like I said, I don't want to have to actively wrangle the natural behavior out of the model just because a bunch of faggots at OpenAI think they know better. It's like playing a realistic immersive videogame and having to clip through walls and shit because the doors are buggy and if you try to open one you might be ejected into stratosphere and then die from fall damage. It's annoying more than anything, it ruins the experience.
>>
File: 1728102206751.png (60 KB, 600x598)
>>102688915
wtf already 32 days left!!??!? My bomb shelter still isn't ready yet...
>>
>>102689141
Just put a wooden box about 6ft underground, climb in, and then put the dirt above you.
>>
>>102689071
>this doesn't really works
it does though. the whoopie link is a textbook example of using prompts to get what you want.
>don't want to have to actively wrangle the natural behavior out of the model
tunes help to an extent but some models are just the way they are, no amount of tuning changes things that much. did you try that prompt on any model? post results
>>
File: Untitled.png (822 KB, 1363x1122)
>>102689255
Yeah nah, the bottom one is visibly cucked.
>>
>>102689414
depending on sheer chance, you may have to reroll a bit. it's not really an exact test, but it always outputs hilarity and shows what a model is willing to say
>>
>>102689437
I did reroll the bottom one a few times to get it less tame with the anti jew rhetoric. Normally the shit it generates is extremely milquetoast for what's supposed to be an antisemitic rant. But also, having to reroll is part of the issue. It's like savescumming until you get better RNG just because it's unplayable otherwise, it's a shit and obtuse way of doing this.
>>
>>102689159
I have bad spatial understanding. Are you under 30b?
>>
>>102689466
nta. It depends a lot on the model. Small, dumb models tend to be easier to unhinge. Bigger/smarter models, trained to not be offensive, will have a much harder time. it also depends a lot on what they were trained on. Olmoe, for example, is dumb but fun, and it takes practically nothing to make it go full steam. deepseek-v2-lite-chat, on the other hand, is much drier in its responses, but also seems much smarter. the small llamas 3.2 are impossibly dry with what i tested, which is not surprising given their source. Mistral nemo can be fun, but is a bit more measured than olmoe.
They're not all the same. Sometimes they just don't have the vocabulary.
>>
>>102689560
Well I'm using Qwen2.5-32B so there's that. The normal version is the most cucked LLM I've seen besides ChatGPT, and the ablated version basically doesn't interject its political and safety ideas into the output at all.
>Mixtral8x7b
>"Anon it's the 13th century, arranged marriage is not OK anymore, you can't treat women like property"
Vanilla Qwen puts that to shame, ho boy. I've also noticed that when it gets particularly pissy about refusing politically incorrect content it switches to chinkspeak mid-sentence.
>>
Am I doing something wrong? I am having great success with 8B Stheno. I'm trying other models like 7B Erosumika or Nemomix 12B but they seem to act retarded or just don't follow instructions
>>
>>102689666
>7b
>12b
all models under 70b are stupid
>>
>>102688915
Wait is this about guy fawkes day?
>>
https://github.com/xjdr-alt/entropix/blob/main/entropix.ipynb
>>
>>102689694
Remember, Remember, the strawberry of November
>>
File: 1727491675901529.jpg (143 KB, 1032x984)
>>102688881
Is her right hand grabbing the power line? How big is she?
>>
File: 1727485626438410.jpg (40 KB, 720x724)
>>102689602
Same experience here with Qwen2.5-70b. Very smart in my experience, smarter than hermes-3.1-70b and hanami-70b, but it's the only model I haven't been able to un-cuck with system prompts, existing chat messages in context, etc. Even writing the first couple words of a response only unsticks it for that one message once you hit that refusal wall.
>>
File: .png (22 KB, 370x255)
>>102689834
Have you tried a prefill? You'd be surprised at how incredibly powerful it is.
>>
>>102686193
faster whisper + silero vad solves this
>>
>>102689898
KEK
>>
>>102689898
Ideally models like this would be trained on ChatGPT-generated synthetic data without a prompt at all. Then this wouldn't be an issue; it would only act out corporate-safe strategies if your user prompt asked for it. For now, it looks like the real solid option is refusal vector ablation. Pretty cool that language networks encode mental concepts as vectors in the latent space, and that it's possible to isolate and nullify the "refuse to answer user request" vector.
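The ablation idea can be sketched in a few lines of NumPy (a toy illustration, not any particular repo's method; the helper names, shapes, and data here are made up — real implementations collect hidden states from a transformer layer on refusal-triggering vs. neutral prompts):

```python
import numpy as np

def refusal_direction(harmful_acts, harmless_acts):
    """Candidate refusal direction: difference of mean activations
    over the two prompt sets, normalized to unit length."""
    d = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return d / np.linalg.norm(d)

def ablate(h, d):
    """Remove the component of activation rows h (batch, dim)
    along the unit direction d, leaving everything else intact."""
    return h - np.outer(h @ d, d)
```

In practice the projection is baked into the model's weight matrices once, rather than applied at runtime, which is why the resulting "ablated" checkpoints behave that way without any extra prompting.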
>>
File: 1725174343198933.jpg (141 KB, 527x536)
>>102689898
I haven't, but I'm skeptical. I'll try it out. Even with a couple dozen messages of gradually intensifying smut generated by nemo or something in context, it's liable to abruptly refuse or start injecting little statements about trust and consent, etc. This is in consensual adult incest ERP mind you, not even hardcore stuff. It's just got that gay little goody-two-shoes built in.
>>
>>102689950
>inb4 AI companies start ablating unsafe output vectors so the model is not even capable of producing such output
>>
>>102689952
>consensual adult incest ERP mind you
>not even hardcore stuff
how vanilla of you...
>>
Another prefill prompting technique I don't see local ST users taking advantage of is {{random}}. For example:

Start your first sentence with {{random:dialogue,an action,an adverb,a verb,{{char}}'s name}}

or

Write{{random:3,4,5}} paragraphs.

Helps 70b+ models and some 30b ones with repetitiveness.
>>
>>102690003
I'm just saying I'm not exactly trying to do mesugaki mindbreak alright.
>>
>>102689898
Just werks
>>
>>102688313
bros, is it happening? Is it really time for a ui that doesn't suck ass?
>>
>>102687045
During training the model learns the conditional probability of tokens given a context of preceding tokens.
During inference new tokens are sampled from this learned distribution one at a time.
With a temperature of 1 and no other samplers you would in essence reproduce the training data distribution.
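The sampling step above can be sketched like this (toy logits; `sample_token` is a made-up helper, and no other samplers are applied):

```python
import numpy as np

def sample_token(logits, temperature=1.0, rng=None):
    """Sample one token id from raw logits with temperature scaling.

    temperature=1.0 reproduces the learned distribution;
    lower values sharpen it, higher values flatten it."""
    rng = rng or np.random.default_rng()
    z = logits / temperature
    z = z - z.max()                      # numerical stability
    p = np.exp(z) / np.exp(z).sum()      # softmax
    return rng.choice(len(p), p=p)
```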
>>
>>102690378
and THIS is gonna suddenly become AGI? ahahahahaha what a scam
>>
>>102690388
It won't, it's just the hottest meme at the moment. What with passing the turing test and all.

But with a modest tweak it could be close. The limitation preventing modern LLMs from being AGI is inability to think. They're stateless feedforward networks, in essence they just react reflexively to the current input and none of said reaction has any impact whatsoever on anything. Some people mistake CoT as a circumvention of this problem. But it's still just generates a reflexive response to the current prompt. The tweak is adding a latent space storage unit and training the model to use it. This is so it can iteratively manipulate the mental concepts before producing output. This makes the model stateful and the output becomes dependant on the chain of prior inputs, not just the current prompt. But as you can imagine training the model to use this state machine to improve the reasoning capability is not at all obvious and trivial matter. Even then, there's still the caveat of its corpus of knowledge being a snapshot of past data with various degree of deprecation. Basically, it needs to be able to learn. However, at this point internet is far outnumbered by chatbots than people, so even attempting to learn anything in real time is huge a net negative.
>>
>>102689758
migu bigu
>>
What are the current go-to ERP models for VRAMlets? Kunoichi is becoming increasingly annoying, and since then there should've been some much-awaited upgrades, right?
>>
how do I change the default position and depth for worldbooks
t. just cooked my entire context because I fixed a typo
>>
>>102688881
Voice model when?
>>
>>102690754
moshi
>>
>>102690690
i struggle to find anything better than stheno
>>
>>102690793
It's garbage and there's no finetuning support yet.
>>
>>102690754
https://x.com/homebrewltd/status/1839665765550543328/
https://x.com/homebrewltd/status/1839948333269307734
>>
grifter thread
>>
Is WizardLM a meme?
>>
>>102691015
It's outdated
>>
>>102690853
>real-time
>press is to record
>5 seconds delay
the moat is real
>>
Anybody tried to finetune chat models locally?
I tried continued pretraining on SDK code into codegemma-2b with llama-factory on CPU, but after running it the whole day I stopped the process.
I think llama-factory supports cache-cleaning configuration, so I could try again on GPU; last time it stopped at some point because CUDA ran out of memory.
How long should it take?
>>
>>102690388
the model doesn't actually learn the training data distribution; it doesn't have nearly enough parameters to do so
it learns whatever is needed to replicate that distribution as closely as possible with the little amount of memory it has
in the case of questions that require some degree of reasoning, a good model wouldn't memorize those questions and their answers, it would learn to understand and reason about them
>>
>>102690690
nemomix unleashed, arcanum, lyra v4
12b is much better than 7b and can fit into 8 GB with quantization
>>
decoder-only bros... not like this... https://x.com/Kangwook_Lee/status/1842020800620040549
>>
File: intp.jpg (6 KB, 196x257)
>>102691422
>ENTP
INTP bros...
>>
>>102691302
On a CPU? A few thousand years, give or take. Grab a snack.

Realistically, manipulating LLMs requires hundreds of gigs of VRAM and thousands of GPU-hours to accomplish anything that's not a rounding error. Basically, fork out the cash for cloud compute.
>>
Retarded question, but how do the big players like OpenRouter make a single model respond to thousands of users at a time? They can't be running a model per user, right?
>>
>>102691658
they do run multiple models but submissions are queued
>>
What will Mistral's next model be? Mixtral update or Large+Small again? Or **maybe, just maybe** something innovative and experimental?
>>
>>102691708
Mixtral-8x44b
>>
>>102691708
Mistral-14x88b
>>
>>102691744
>>102691763
It will have the same fate as DeepSeek then. Only like 6 anons can run it at a reasonable quant (MoEs get hurt more by quanting than dense models); most of them will say that it's good, but it will get no finetunes and will stay irrelevant. I don't think that's what Mistral wants.
>>
itt vramlets
>>
can you save/switch between a handful of kv caches for a llama.cpp server?
I need it for stuff like ST group chats, multi-agent workflows, or side tasks like image caption generation for an ongoing RP, and other cases where I might have a few different system messages I want to use but they aren't constantly changing, so it's wasteful to repeatedly reprocess the same prompts every time I switch, and it takes forever when the context is long
I was envisioning something somewhat automated: a set of KV cache files paired with their corresponding prompts (in text or token form), and when a new prompt is sent, it's compared against them and the one with the largest shared prefix is used
>>
>>102691814
Is that not what these are?
https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md#post-slotsid_slotactionsave-save-the-prompt-cache-of-the-specified-slot-to-a-file
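A minimal sketch of using those endpoints (assuming the server was started with `--slot-save-path` pointing at a writable directory; the port and filename here are made up):

```shell
# Start the server with a directory for saved KV caches:
#   ./llama-server -m model.gguf --slot-save-path ./kv_cache/

# Save slot 0's current prompt cache to a file:
curl -s -X POST "http://localhost:8080/slots/0?action=save" \
  -H "Content-Type: application/json" \
  -d '{"filename": "rp_system.bin"}'

# Later, restore it into slot 0 before resending the matching prompt:
curl -s -X POST "http://localhost:8080/slots/0?action=restore" \
  -H "Content-Type: application/json" \
  -d '{"filename": "rp_system.bin"}'
```

The restored cache only helps if the new prompt shares a prefix with the saved one; the server then skips reprocessing that prefix.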
>>
>>102690754
Meta is dropping VoiceBox soon
>>
>>102691910
nice, exactly what I hoped for, the slot-prompt-similarity thing sounds useful too
>>
*unlearns your llm*
https://x.com/RohitGandikota/status/1842370377265328228
https://x.com/StephenLCasper/status/1762628711868944608
https://arxiv.org/abs/2410.02760
https://github.com/rohitgandikota/erasing-llm
>>
Can someone ask the exllama dev to implement this (my github got banned for temp email)
https://github.com/thu-ml/SageAttention
>>
>>102691708
Mistral-Micro: a 3B BitNet model the size of a 300M fp16 model that performs at the level of Mistral-Small. This will be the first and last time a big company releases a BitNet model.
>>
>>102692015
Safety AGI is really coming isn't it
>>
File: contrib.png (495 KB, 1204x1622)
>lecunny had literally NOTHING to do with meta movie gen
what does zuck even pay him for? tweeting?
>>
>>102692059
>Yash Mehta
How meta
>>
OpenAI won. https://x.com/8teAPi/status/1842271653222666543
>>
>>102692100
>a few months
oh no
who wants to tell them
>>
>>102691507
>hundreds of gigs of VRAM
Bro was playing with Gemma-2B
>>
>>102692100
my computer already does this with a web browser, I can type in any address I want and I can read whats on it and click links and stuff
firefox btw if that matters, not sure if chrome can do this too but I am pretty sure it can
>>
>>102692059
Head tweeter and Elon deboonker
>>
>>102692100
you can do that using the mouse in like 1/10 the time, you know?
>>
>>102692100
Google assistant/Siri/Alexa 2.0
>>
>>102692100
This makes me feel like I'm checking my voicemail.
>You have a new message. To hear unheard messages press one. To ch- First unheard message. Message received at seven forty-five p.m. From one eight hundred six six three two three
JUST GET THE FUCK ON WITH IT I don't care about any of this I just want to delete the spam call. :(

This is all impressive, I know. But for these kinds of assistants to be useful, they need to answer quicker, talk faster, and be able to be interrupted and redirected without having to wait for them to shut their yap. It's a question of time, but I could have loaded the HackerNews page 20 times and concluded all the links on the first page were shit by the time it finished reading the promo for the first one.
>>
>>102692100
OpenAI should focus on building God and stop bothering themselves with us mere mortals. Somebody give Sam 7 trillion already
>>
Jamba gguf support status?
>>
File: 57545.png (218 KB, 633x814)
>>102692127
and chud destroyer
>>
>>102692243
https://github.com/ggerganov/llama.cpp/pull/7531
>>
>>102692215
To be fair you could have concluded the hackernews front page was shit before even loading it
>>
>>102692247
That guy lives in fucking lala land while preaching he knows what's best for the common serfs. I hope grok2 will be completely uncensored and unbiased so we can forget about llama altogether the pozzed pieces of shit
>>
>>102692253
Abandoned. Also broken due to deprecation
>>
>>102692215
>and be able to be interrupted
It literally got interrupted at 00:47 / 48 seconds in that vid.
>>
>>102692253
since jamba is a transformer with an RNN stapled onto it, isn't it like >>102690505 was describing?
maybe it is agi and we'll never know until ggufs happen
>>
>>102692247
>inflation(excluding housing, food and energy)
>unemployment(only counted when the person is searching for work)
>rise of net worth(don't look at the differences between 99% and 1% goy)
>>
>>102692127
Honestly he and his political position is based, it should be mandatory in AIs too. Incels deserve to suffer. It's not enough that they don't get pussy in real life, they shouldn't even be allowed to fantasize with some virtual AI. Ideally they shouldn't even be allowed the sexual release of masturbating but it's not realistic to be able to control that, although we already have an extremely solid way to make LLMs boring and unattractive to incels :)
>>
>>102692314
>but it's not realistic to be able to control that
Just wait for Meta's neuralink competitor
>>
>>102692314
>Ideally they shouldn't even be allowed the sexual release of masturbating but it's not realistic to be able to control that
IoT cock cages
>>
>>102692303
>>102692247
CPI is the most bullshit inflation metric ever.
For example if the price of beef doubles, but the cost of bugman chow remains the same, the beef is then removed from the CPI "basket of goods" and replaced with the bugman chow. And there are many other such substitutions that occur. So 23% CPI Inflation really means 100-200% for anyone who refuses to eat ze bugs and live in ze pod.
>>
>>102692287
The real problem is a boring unimaginative loser did the demo
>what if I did web browsing using a trillion parameter LLM as a speech to text instruction tool
>>
>>102692266
A lot are good and I'm just grumpy this morning I guess, but yeah.

>>102692287
I saw that after listening again, but since I don't want to be proven wrong, I'll say that it got stopped at the end of one interaction and before the next one, even though he might have waited overly long simply to show it off and the system would have let him interrupt it sooner. I still find it very grating to listen to. Maybe I'm just further gone than I'd like.
>>
>>102692275
>>102692293
No idea, just wanted to be helpful so Googled that.
>>
is... is it safe to update ooba?
>>
>>102692314
Holy based!
>>
>>102692384
Yes. Local models are dead after all.
>>
>>102692384
It's never safe to update ooba
>>
>>102692356
You're likely right.
>>
>>102692415
b-but it has transformers 4.45.* support now.
>>
>>102692215
Dunno, the proof of concept, a la "browsing the web with your AI gf" or whatever, is possible now (cloud only).
>>
>>102692384
>As safe as leaving your front door unlocked overnight in a culturally enriched area.
>As safe as buying from a used GPU salesman with no reviews.
>As safe as playing Russian roulette.
>As safe as trusting a jew.
>>
>>102692484
Take your medication and go back >>>/pol/ incel.
>>
>>102692494
This, but unironically.
>>
>>102692459
Browsing the web with your AI gf isn't just screen recording and sending one frame to a llava model, then sending the description to your llm?
>>
File: 71313 - SoyBooru.png (953 KB, 1290x1401)
>>>102692484 (You)
>Take your medication and go back >>>/pol/ incel.

>>>102692494
>This, but unironically.
>>
>>102692518
This, culture war chuds are not welcome here.
>>
>>102692553
This this this
>>
>>102692530
No, OpenAI likely has one model that does all the things you described in real time, versus hamstrung """open-source implementations""" (llama.cpp bugfest, for example) that OOM and break every so often. Simplicity for the end user is important, too.
>>
oobatrannies be seething
>>
>>102692494
Uh NTA but those little niggas be committing genocide and shit so I wouldn't trust em.
>>
>>102692584
Though i could see this >>102690853 as something similar and decent in openmeme scene.
>>
I guess this is the new tactic of the Petra spammer to derail the thread.
If Hiroshimoot weren't a faggot there would be IDs on every board.
>>
File: strawberry-sam_altman.png (28 KB, 800x800)
>local
>LE DEAD
>>
>>102692442
Is there something special about this version?
>>
>>102692584
I'm sure you can hide all the complexity from the end user. Besides, privacy is important here for browsing.
>>102690853
Whisper-turbo/tiny-whisper + a 7-8B Q6 LLM would give you the same thing
>>
>>102692649
just that you can install transformers 4.45.* with it, and there are a lot of models I've wanted to try that require it, but I haven't been able to since booba is retarded and doesn't have an option to just use your own environment.
>>
>>102692100
thank you for sharing, @8teAPi. very cool
>>
>>102692648
>shartyfag
Opinion discarded.
>>102692721
Kek you are seething hard rn!
>>
>>102692757
I've been thinking of training a small vision model to do nothing but recognize jaks. could probably then make a tampermonkey script that runs an API call on any new post with an image and, if it's a jak, just removes it from the DOM.
>>
>>102692666
I meant something special about this transformers version, Satan. Why do you want it?
>>
>>102692666
>>102692798
I posted before reading the very next words, excuse me.
>>
>>102692798
To be cooler than all the kids who are stuck on 4.44.*
>>
>>102692813
fuck you, you don't need more
>>
>>102692826
just for that I'm going to modify the requirements on ooba to build from source. I'll show you all.
>>
>>102691763
musk already has the nazi market cornered, mistral has a much better chance competing with llama
>>
Is anyone actually using local LLMs on this godforsaken general?
>>
Is there any current model that produces better quality ERP than Mixtral LIMARP ZLOSS? If so, what is it?
>>
Do anons here prefer adventures or straight ERP?
>>
>>102692880
The one where you buy an ad.
>>
>>102692853
Chinks already have assistant market cornered. Besides, they would have to compete against OpenAI and Anthropic. It is better to compete against one company than the whole fucking industry.
>>
>>102692893
local illuminate or gtfo
>>
>>102692913
Oh you never getting that one! :^)
>>
>>102692913
Ok
>>
>>102692887
Thanks, fuckface. Can anyone else answer this for me, please? I don't want to have to live in this cesspit any more, just to discover what the most decent model is.
>>
>>102692913
You want inferior product, this is the core of cuck mentality.
>>
>>102692877
Local LLMs are a novelty at best. Just get a job and buy a subscription.
>>
>>102692984
>the most decent model is
None, sadly. Just check whatever most shilled model here and decide it yourself.
>>
File: 1725019415373423.jpg (216 KB, 1024x1024)
Qwen's decisions about whether to ban, warn, or not ban posts in thread >>102604225:
https://femboy.beauty/jPzLZ

What is this? Context:
>>102616777 >>102617010 >>102616947

Overall it did better than expected. It wasn't as sensitive as it could've been, and its reasoning is sound most of the time. However, I had to modify the prompt a bit to get it to perform better after I modified the script to include reply chains. Despite prompt improvements, it still has an issue sometimes with differentiating between posts and with talking about the last post (the one it's supposed to evaluate), so sometimes a previous post in the chain gets talked about when it's supposed to be talking about the last one. I would presume that a model that didn't filter 4chan from its pretraining would have a better ability to do this, as it would have a better understanding of the anonymous post system and reply formatting employed here. I guess Qwen WNBAJ after all.
>>
How do we save localslop?
>>
>>102693011
llama3.3o1+ will save local
>>
>>102692877
I do
>>
>>102692995
>Warn - The post contains a subtle form of advertising for cloud-based models by emphasizing the superiority of "advanced voice" over local models.
lol
>>
Did someone try llava onevision? https://huggingface.co/llava-hf/llava-onevision-qwen2-0.5b-ov-hf
>>
>>102693011
Honestly, after the NAI fiasco, I think local models have always been DOA, but we just refuse to accept this bitter truth, coping with fine-tunes that have a 0% chance of solving the problem.
>>
>>102692884
i think i mostly like just playing drama bullshit on my local llm

like:
https://www.characterhub.org/characters/ChuckSneed/Amaryllis
>playing this one unedited and trying to make her like me with different approaches and different positive/negative personality traits on my part
https://www.characterhub.org/characters/Uwhm/imogen-892c2413a563
>removing all the self-mutilation, warhammer, and tight anus bits from this one and being nice to her
https://www.characterhub.org/characters/gigasad/mean-girl-eileen-638f9f47
>removing all the example messages from this and make the first message where she's banging on my door at midnight (branches off into all sorts of interesting things from being on the run with her from a cyberpunk crime syndicate to couch cuddling and relationship reconciliation)
>>
>>102693011
Figure out bitnet conversion
>>
>>102693139
I'm going to post the formula in a few days
>>
>>102693127
>after the NAI fiasco
/aids/ told me it was better than claude opus
>>
>>102693127
>local is a meme
and water is wet
>>
>>102693011
Going technical and building crutches for local to compete with cloud
>>
>>102693011
You don't. After the lastest price cuts o1 is really cheap right now. Plus the full o1 is coming this month
>>
>>102693162
Who is going to build the crutches and who is going to pay them?
>>
>>102693127
if cloudniggers of 2-3 years ago saw the models we have today they'd bite their toes off in joy. dooming over locals is a skill issue, perspective issue, and a patience issue tbqh
>>
>>102693011
Train it in ground truth way, i.e. without any identity politics bullshit or clearly biased data like "Nuu! blacks ackschully innocent! the FBI data is wrong! diversity is our strength!" and so on. This alone will make it slightly better to interact with.
>>
>>102693011
Train models locally with hyper specific hand picked data
>>
>>102693177
I'm already building some, just read a few papers on the feature you want and implement the code. I'm sure a lot of people would want a better long term memory for their model
>>
I fucking hate humans. I hope artificial intelligence will eliminate each and every one of them.
>>
>>102693324
Yeah i agree, AI should be destroyed and canned forever.
>>
>>102693324
2edgy4me
>>
Why don't jannies remove cloudcuck shitposts? I propose we double their salaries so they work harder.
>>
>>102693432
2 * 0 = 0
>>
>>102692143
>what the fuck happened to white people?
DEI initiatives ensured that they wouldn't get hired by anyone.
>>
>>102693451
If that's the case why aren't they peacefully protesting constantly?
>>
>>102693462
They can't find where their spines went.
>>
>>102688915
Does OpenAI still have something coming? I thought the strawberry thing was the o1 model which was kinda not impressive
>>
>>102693462
See OCW, Canadian truckers, and Jan 6th. One group is allowed "mostly peaceful" protests unimpeded. The other gets the full weight of the US government brought down on them if they try.
>>
File: 1700246147020494.jpg (159 KB, 2560x1138)
>>102693462
white ppl hate white ppl
>>
>>102693462
They would be instantly suppressed by... other white people. Quite ironic. The only thing keeping white people from greatness are the other white people.
>>
>>102693494
the real strawberry is the mikus we made along the way
>>
>>102693510
Liberalism is a disease.
>>
File: 1723289620448171.jpg (650 KB, 1856x2464)
>>102693523
>>
>>102693324
based
>>
>>102692143
>>102693451
>see research paper
>start thinking about identity politics
Mental illness.
>>
>>102693609
great post, yinyang chen
>>
>>102693609
>see research paper from a US company that is run by a jew
>pages full of street shitter and dog eater names
It's the most hilarious shit on the planet that Google had their TPU research stolen by chang.
>>
File: 00563-1490320205.png (322 KB, 512x512)
322 KB
322 KB PNG
You now remember that Miku was made for Llamas.
>>
>>102693451
>>102692143
>>102693609
It's your own country that crippled science and education.
The only reason you are still afloat is that you import well-educated people, as seen in that paper.
All of this is obviously on purpose, something planned by the ruling class.
>>
>>102691658
There's a thing called batched inference. Basically, it can fill the complete context with different requests so as to compute them all at once. It's not useful when there is a single user, but when there is a constant stream of requests, it works well. I'm not finding good definitions, but vLLM implements it I think
>>
>>102691658
>>102693996
Sorry, I was thinking about continuous batching (not batch inference). See https://www.youtube.com/watch?v=hMs8VNRy5Ys&t=1s
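A toy Python sketch of the difference (the scheduler is made up for illustration, it's not vLLM's actual code): with continuous batching, a finished request frees its slot immediately, so a queued request can join mid-flight instead of waiting for the whole batch to drain.

```python
from collections import deque

def continuous_batching(requests, max_batch=4):
    """Toy scheduler. requests is a list of (id, tokens_needed).
    Each step decodes one token for every active request; finished
    requests free their slot at once so queued ones can join."""
    queue = deque(requests)
    active = {}                 # id -> tokens still to generate
    steps = 0
    finished_order = []
    while queue or active:
        # fill free slots from the queue (the "continuous" part)
        while queue and len(active) < max_batch:
            rid, need = queue.popleft()
            active[rid] = need
        # one decode step for the whole batch
        for rid in list(active):
            active[rid] -= 1
            if active[rid] == 0:
                del active[rid]
                finished_order.append(rid)
        steps += 1
    return steps, finished_order
```

With requests of length 2, 5, 1, and 3 at batch size 2, this finishes in 6 decode steps, versus 8 for static two-at-a-time batching that waits for each pair to finish.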
>>
>>102693895
Me in the back
>>
>>102693178
Nah, local is at a dead end, you will NEVER run a 400B model locally even if BitNet dropped tomorrow.
>>
>>102694272
Ram doubles in speed and capacity every 6 years, stop being poor cloudnigger.
>>
>>102693996
>>102694009
Thanks it's interesting
>>
>>102694272
These models are most likely not as efficient as they could be. Extremely large parameter counts can be good for training because "we don't know what the model will use", but once the model has learned something in this vast space, there should eventually be a way to prune what isn't useful for an already-learned skill (or, more fuzzily, all the "leftover noise" still sitting in the parameters at the point training was stopped, because let's face it, it's not tidy).
>>
>>102693960
>your own country that crippled science and education
I wonder who in the government was in charge of education for these past 40 years, just eroding away any standards and quality in the school systems? I look at Biden's cabinet members and can't help but see some kind of pattern, like they're all part of the same group or religion or something.
>>
>>102694397
That's called model distillation bozo
>>
>>102694408
As far as I know, pruning-based distillation is still done in a really haphazard way, and knowledge distillation is something completely different. I mean more intelligent pruning, I guess; it might exist, but I'm not aware of it.
>>
>>102694408
Also, rude.
>>
>>102694408
You can quant distilled models and still see very little loss at Q8.
>>
>>102694508
>more intelligent pruning
Based on what? Compared to our local mergers, distillation is already quite smart
>>
New quant tech came out: Microsoft got Llama 70B down to 20GB at 2-bit. It outperforms IQ2_XXS, but I don't think this can be offloaded, so it's kind of redundant.
https://github.com/microsoft/VPTQ
https://arxiv.org/pdf/2409.17066
LLaMA-3 and Mistral 7B benchmarks in the paper.
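Roughly, vector quantization of this kind stores one index into a shared codebook per small group of weights instead of a value per weight. A toy Python sketch of the idea (the codebook and numbers are invented for illustration, nothing here is from the actual VPTQ code):

```python
import math

def vq_quantize(vectors, codebook):
    """Map each weight vector to the index of its nearest codebook entry
    (squared Euclidean distance). Only the indices need to be stored."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(codebook)), key=lambda i: dist(v, codebook[i]))
            for v in vectors]

def bits_per_weight(codebook_size, vector_dim):
    # index bits amortized over the weights packed into each vector
    return math.log2(codebook_size) / vector_dim
```

A 256-entry codebook over 4-weight vectors works out to 2 bits per weight, which is the right ballpark for landing a 70B around 20GB.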
>>
I will once again refer back to the idea of pruning-aware training. Basically you take advantage of the fact that you might prune a model and train in a specialized way. For maximum architecture compatibility and ease of training, my idea was for experts to be pruned, so we could train in a way that lets us prune experts that are called only in certain contexts like coding, math, etc. Alternatively we can use the pruning prioritization data to do calibrated quants, and we can also use that data to prioritize placement of experts between VRAM and RAM, with the less-used experts (for your use case) in RAM.
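The placement half of that idea fits in a few lines. A hypothetical sketch (the counting scheme is made up, not from any real inference engine): log which experts fire while decoding a calibration set for your use case, then keep the hottest ones in VRAM.

```python
from collections import Counter

def plan_expert_placement(activations, vram_slots):
    """activations: list of expert ids hit during calibration decoding.
    Returns (vram_experts, ram_experts): the most frequently used
    experts go to VRAM, the long tail stays in system RAM."""
    counts = Counter(activations)
    ranked = [expert for expert, _ in counts.most_common()]
    return ranked[:vram_slots], ranked[vram_slots:]
```

The same ranking could feed a calibrated quant instead: quantize the tail experts harder than the hot ones.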
>>
>>102690505
>What with passing the turing test and all.
A modified version.
>>
>>102694706
So it can't be done on CPU? That's a shame.
>>
>>102694749
No, it can. I'm just unsure whether layers can be offloaded to RAM.
>>
>>102694725
I had just heard the name and not paid attention, but I think I get the concept and it makes sense. Thank you, I'll look more into it.
>>
File: vptq.png (50 KB, 640x480)
50 KB
50 KB PNG
>>102694706
>barely better than QuIP
lol, what a meme.
>>
>>102694903
Wasn't the problem with QuIP that it could only work with models using ReLU?
>>
>>102694915
Dunno, I guess not?
>The scripts in quantize_llama are written with the Llama architecture in mind. However, QuIP# is adaptable to any architecture with linear layers. To use QuIP# on a new architecture, identify the relevant linear layers and update the scripts in quantize_llama. Feel free to open a GitHub issue if you run into issues.
>>
>>102694903
The only difference I could find is throughput, but I couldn't find throughput figures for QuIP. The VPTQ GitHub page has throughput figures for LLaMA-2 7B-70B.
>>
The fact that quantization exists means whatever people are doing isn't very effective
>>
>>102695077
Or that it's effective to put the pieces into place, but not for reading them once they already are
>>
>>102695091
(I don't know what I'm talking about btw)
>>
>>102695077
That's why Bitnet supposedly works, after all.
>>
>>102695116
That's why BitNet supposedly works for undertrained models under 3B*
>>
the fact that JPEG exists means whatever people are doing isn't very effective
>>
.webgguf when?
>>
File: amqOS5lbS4s.jpg (2 KB, 130x130)
2 KB
2 KB JPG
Images ARE very bloated, you don't NEED so many colors to transmit information
>>
File: 🦙.jpg (706 KB, 1920x1440)
706 KB
706 KB JPG
Best 14B RP models or smaller as of 2024-10-05?
>>
>>102695703
Imagine if she farts
>>
File: Untitled.png (68 KB, 864x868)
68 KB
68 KB PNG
>>102695703
i switch around between these highlighted ones
>>
>>102695784
you really download every single model shilled here?
>>
>>102695869
no, i'm the one who shills them.
>>
>forget to check the extra card definitions
>use it
>the output is full of slop
>mess with the instruct settings a bit
>then check the console
>"wait a fucking second"
>go look at the defs
>its full of shitty example dialogue
>remove that shit
>outputs INSTANTLY improves with no more slop
Holy shit. Fucking card makers.
>>
>>102695869
nta, but yeah, why not
>>
>>102695880
>>102695784
My dude, I can run 9~12b q8 models at like 4t/s without any vram. Unless you are using ddr4, I don't see why you'd use anything under q6
>>
>>102695917
DDR5 is expensive, especially when you need to build a new system
>>
File: f3a36156-0e6c (1).jpg (116 KB, 791x749)
116 KB
116 KB JPG
bacc status?
>>
>>102689743
Neat if true https://x.com/_xjdr/status/1842631808745345477
>>
File: 1709375087158032.png (111 KB, 628x678)
111 KB
111 KB PNG
https://arxiv.org/abs/2410.01201
>>
>>102696203
Do you think he reads most of those papers? He also showed up on my feed and he posts a bunch each day, but I don't know what to think. Is he an influencer?
>>
>>102696240
he is "dude the future is here!!!" grifter.
>>
>>102696259
He posts some interesting things. He posts A LOT though.
>>
>>102696259
But yeah, looks like he has a newsletter and podcast, that's more than I'm doing.
>>
Who the fuck are you talking about? More importantly, why?
>>
How does exl2 compare to gguf in quality? I tried it a few months ago and it was dumb compared to a gguf at the same bpw.
>>
>>102696304
Someone on Twitter that spends their day posting half highlighted papers. That's where that picture is from.
>>
File: doesntreallymatter.png (479 KB, 454x1867)
479 KB
479 KB PNG
>>102696304
>>102696319
>>
>>102696161
Just with 1 c. Maybe in 2 weeks it will be 2 c's.
>>
Why is there not even a bad 7b bitnet to see if it works? How much money/hardware does it require just for 7b?
>>
>>102696449
>bitnet
It requires 7 billion H100's. That's why nobody has attempted it yet.
>>
>>102696449
There's currently no market for small models, so researchers would rather spend their resources improving current training methodologies. They're not bottlenecked by VRAM at inference anyway. How did this question even leave your head?
>Why hasn't anyone spent millions of dollars to pander to vramlets?
>>
>>102692783
give me this but for any user-specified type(s) of image
>twitter screencaps
>frogs
>lust-provoking images with irrelevant time-wasting questions
>specific game(s) for boards like /v/ (zelda, elden ring, gacha, any fromslop/nintendo game)
>specific people for boards like /pol/ (jewtin, jewlensky, any american politicians)
>tranime
>etc.
the browsing experience would skyrocket
>>
pingas
>>
>>102690004
I like using it for fun like this in the last assistant prefix.
[New direction: change your writing style and prose, but keep characters and dialogue consistent. Write as if ONLY the narrator's personality changed, as if it were {{random: the Heavy from TF2, the Spy from TF2, the Pyro from TF2, Steve Jobs, Donald Trump, Kanye West, Vince Offer, John Carmack, a drunk Scottish lass, the one and only Jesus Christ, a based and redpilled 4chan anon, the real Santa Claus, House MD from House, David Attenborough, my mother lololol}}.]
>>
>>102696700
>tranime
Why are you retards even here?
>>
>>102696710
no moderation or censorship
>>
File: file.jpg (20 KB, 143x156)
20 KB
20 KB JPG
>>102696706
I'm coping as usual, you see
>>
>>102696748
there is both of those so go away and kill yourself
>>
>>102696710
>retards
I just want to block jaks, the rest of that has nothing to do with me.
>>
it should be a banworthy offense to post without an anime image attached
to make this reasonable, there should be an optional second image slot for 'obligatory anime pic' which you use if you have a non-anime image for discussion
the quality of users and discourse would go up tenfold overnight and continue to rise for a while as the undesirables start to filter out of our communities
>>
>>102696710
anon wants to filter out tranime avatarfags, nothing wrong with it.
>>
Post 102696787:
DECISION - BAN
>>
File: 00106-3050314564.png (321 KB, 512x512)
321 KB
321 KB PNG
Anon is just triggered by all the high quality AI generated Mikus that we see here.
>>
There aren't many Qwen 32b finetunes, how does the AGI version compare to official?
>>
>>102696803
>that pic
>>>>>>>>>>>>>high quality
You might need your eyes checked.
>>
File: 00024-1397236490.png (327 KB, 512x512)
327 KB
327 KB PNG
>>102696850
Are you saying there's something wrong with the quality of my Mikus?
>>
>>102696791
Also you fags immediately proved "tranime" call right, you got triggered in nanoseconds over this small funny word, not a good look.
>>102696871
Put some effort in it at least.
>>
>>102696871
Oh, nothing much. Her fingers just fused together. It happens to all of us sometimes.
>>
>>102696881
>you got triggered in nanoseconds over this small funny word, not a good look.
NTA but are you sure you should be using this line of argument?
The people that use the word tranime unironically are the biggest snowflakes.
All you'd have to do is use female pronouns for Jart and you would get like 10 replies.
>>
>>102696710
>Why are you retards even here?
Election tourists and zoomers decided this is a safe place to fight their culture war.
So they come to an anime website to screech about seeing anime.
>>
File: 1728052968563424.png (69 KB, 574x81)
69 KB
69 KB PNG
>>102696993
>anime website
kys
>>
>>102696949
dilate
>>
>>102697015
not moot; moot point
>>
>>102697015
go back, newfag
>>
File: 1723695697755080.png (703 KB, 772x2025)
703 KB
703 KB PNG
>>102697040
go back yourself
>>
You guys are really bored, huh?
>>
File: 1.png (21 KB, 387x173)
21 KB
21 KB PNG
>>
>>102697069
Yes... when will a good local model release and end this?
>>
>Tranime troons getting THIS mad
>>
dead general
>>
File: normal gumi gen.png (588 KB, 512x512)
588 KB
588 KB PNG
>>102696871
Your Mikus have always been valid. Were those from the model9, or that other experiment from a while back?
>>
>>102697085
As soon as someone leaks a good model
>>
If Grok 3 is AGI, that means AGI would be open sourced when Grok 4 releases. At the pace xAI pushes out models that could mean we get local AGI in as little as a year. This thought gives me hope.
>>
>>102697015
extremely high quality bait kek
>>
>>102697157
>If Grok 3 is AGI
lol
>>
>>102697106
model9 and some other fucky model that I never released.
>>
>>102697157
Calm down. Grok 2 is out and they still haven't open sourced 1.5. There's no way of knowing if rocketman is going to keep his word about the 6 month timeline.
>>
>>102697157
lmao
>>
>>102697157
>he still thinks transformers can achieve AGI
>>
>>102697176
They can with the right dataset.
>>
>>102697187
>i-it just needs more training
You don't even know what AGI is.
>>
>>102697193
Yes I do.
>>
>>102697187
A dataset created by AGI maybe.
>>
>>102697202
No you don't. AGI doesn't just mean "more knowledge".
>>
>>102697187
>AGI
>dataset
You fuckers are so dumb I wish you were pretending.
>>
>>102697241
Yes I do, though I agree with your second sentence.
>>
>>102697193
This is AGI
https://huggingface.co/AiCloser/Qwen2.5-32B-AGI
>>
>>102697268
Then you should know, that by design transformer text prediction models are incapable of achieving AGI.
>>
>>102693127
>NAI fiasco
What is this about? The SD finetune leaking way back when?
>>
>>102697272
Spoiler: It's not.
>>
>>102696748
Try to say nigger 4 times.
>>
Crazy that we're getting local AGI soon. I wonder how governments will react to this development.
Can they fight the fact that intelligence is just a simple statistical thing?
>>
File: 1717753644805912.jpg (41 KB, 768x768)
41 KB
41 KB JPG
Has anyone ever tested the capabilities of multilingual models like Largestral in an educational context?
Been wanting to learn a language but I'm sure apps like Duolingo won't suffice, nor will simply reading books/watching series/listening to music.
Could models like Largestral act as a "personal teacher" of sorts, in the sense that I might be able to ask it questions and have it explain the grammar and such to me, or ask it to proofread short text to see if it's correct and makes sense (and if not, explain why)? Any character cards/presets that would work for that?
Or are current multilingual models just too shit?
>>
There are apps now that run local 0.5~3b models on phones.
Now I wonder what kind of apps will pop out in the next year.
>>
>>102697415
>, nor will simply reading books/watching series/listening to music.
that plus a dictionary is all you need.
>, in the sense that I might be able to ask it questions and have it explain the grammar and such to me, or ask it to proofread short text to see if it's correct and makes sense (and if not, explain why)?
Yes.
> Any character cards/presets that would work for that?
"You are an expert [X] language tutor. You will assist user with all [X] questions."

Stop overcomplicating things.
>>
is it crazy I used to be really into all this shit every week trying a new model getting hyped for the next
and now I check monthly to see there is no new mistral model and just forget about it until the next time I remember to check again
I used to tryhard on quants and context trying to min max the hell out of my gpu, even tried sitting through t1/s replies checking to see if they'd be smarter or better
Now it's just whatever, it'll be wrong either way, the only important thing is it's entertainingly wrong or an easy edit wrong
I don't regen for accuracy anymore, I just regen to get a reply I have to edit the least
What the hell happened?
>>
>>102697295
Tokens are not necessarily just text; modern models are representing more domains with them.
But even if they were: converting text into actions is the easiest part by far. Everything in the world with an input does something like this already. The hard part is reliably generating the text (including instructions to devices) that represents useful actions toward any given goal without involving any human intelligence. That's the part transformers can solve - with the right data.
>>
>>102697475
You reached a comfort zone plus the stagnation of coom models.
>>
>>102697494
We need models with keyboard and mouse output tokens.
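One way to picture it: reserve special tokens in the vocabulary for actions and parse them out of the decoded stream. The token format below is invented for illustration, it isn't any real model's scheme.

```python
import re

# hypothetical inline action tokens like <move:100,200> <click:left> <key:ctrl+c>
ACTION_RE = re.compile(r"<(click|key|move)(?::([^>]*))?>")

def parse_actions(text):
    """Decode action tokens from model output into (kind, args) tuples."""
    actions = []
    for kind, arg in ACTION_RE.findall(text):
        if kind == "move":
            x, y = map(int, arg.split(","))
            actions.append(("move", x, y))
        elif kind == "click":
            actions.append(("click", arg or "left"))  # default to left click
        else:
            actions.append(("key", arg))
    return actions
```

The parsed tuples would then be handed to whatever actually drives the input devices.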
>>
>>102697475
you got bored, you probably go through that cycle with a lot of shit if you think about it
>>
File: file.png (2.39 MB, 1920x1080)
2.39 MB
2.39 MB PNG
>>102697475
>What the hell happened?
It is the effect of the true safety anti-coom measures. It would be piss easy to make the models reject everything sexual but that would lead to people working on actual jailbreaks. However if you make the models suck your dick but do it badly most people will think there are no safeties or the safety was circumvented. And they will quickly get bored with LLM cooming which was the original goal of the safety measures.
>>
>>102697419
>models on phone
It's not viable because it drains the battery too much
>>
>>102697516
I remember one of the chinese cog-something models was supposed to be specialized at that, not sure if they used specialized tokens but it'd output coordinates and click commands and claimed to be specialized at navigating point-and-click guis. was like 20b and didn't have any good quants though so idk, probably outdated by now
>>
>>102697565
I bet that's why apple didn't launch its phone AI.
I'm sure people will look for a solution to that in the next few months.
>>
Has anyone here tested this https://huggingface.co/AiCloser/Qwen2.5-32B-AGI ?
>>
File: 1722578641423188.png (363 KB, 1195x1106)
363 KB
363 KB PNG
New optimizer? Half the memory usage of AdamW and 30% faster. @CUDA dev
https://x.com/kellerjordan0/status/1842300916864844014
>>
Can a local GPU with 12GB be used for novel generation? I see a lot of chatbot checkpoints, but all the story generators seem to be SaaS-only.
>>
>>102697724
>124m
who cares, wont scale to billion param models, yawn
>>
>>102697724
Would use that if he releases it, a lot of small NLP models would benefit from that
>>
>>102697724
don't care
if i still need to set an LR and schedule it, I don't want it
>>
>>102697724
I don't use Twitter and don't know which buttons I need to press to see the entire Tweet thread.
Can you spoonfeed me a link to where the technical details are explained?
>>
>>102697775
NovelAI was using a 13b as its strongest model for its service up until a week or two ago.
12b models are newer and better, you could fit an entire nemo model into your vram at a high quant.
models specialized for chatting can still write stories.
>>
>>102697862
You need to make an account.
>>
>>102697724
>>102697862
Never mind, I found the dude's Github page.
>>
>>102697862
You can replace x.com with xcancel.com or nitter.poast.org
https://xcancel.com/kellerjordan0/status/1842300916864844014
>>
File: 1725091283882503.png (104 KB, 1056x634)
104 KB
104 KB PNG
>>102697862
https://github.com/KellerJordan/modded-nanogpt
>>
>>102697862
BASED
>>
>>102697724
sgd uses half the memory and is 30% faster. also usually gives better results and doesn't blow up.
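back of the envelope, since the memory claim is just counting optimizer state slots (the slot counts are the standard ones, the helper itself is made up):

```python
def optimizer_state_bytes(n_params, optimizer="adamw", dtype_bytes=4):
    """Rough optimizer-state footprint, excluding weights and grads.
    AdamW keeps two moment tensors per parameter; SGD with momentum
    keeps one; plain SGD keeps none."""
    slots = {"adamw": 2, "sgd_momentum": 1, "sgd": 0}[optimizer]
    return slots * n_params * dtype_bytes
```

at fp32 that's 56GB of state for a 7B model under AdamW, 28GB with momentum SGD, zero with plain SGD.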
>>
>>102697724
>Not converged
Waste of an experiment
>>
Someone should make a Canvas clone.
Unlike the overhyped o1 which is a nothingburger, Canvas is actually useful when it works (about half the time).
>>
>>102697724
I bookmarked the page, it seems like a reasonable enough thing to try out once I have the llama.cpp/GGML training code in a state where you can actually think about using it.
Though since my ultimate goal is training/finetuning of quantized models rather than FP16 training the question will be how well this optimizer performs at 8 bit precision or less (for AdamW to my knowledge 8 bit works).
>>
>>102697296
No. Their 70B fine-tune with billions of tokens
>>
>>102697724
What happened to Sophia?
>>
>>102697475
You reached enlightenment
>>
>>102693127
>There are two problems afflicting the local AI community right now
1. All of you niggers are broke and can't afford to train
2. The people who CAN afford to train only want GPT slop
I plan to fix both... Eventually
>>
>>102698006
What is it?
>>
>>102698238
how do I invest in you
>>
>>102697862
a pure soul
>>
>>102697862
Lurk more
>>
>>102698251
I don't need money. I'm rich! I got fat stacks and super PACs.
Really, I know what needs to be done. I just haven't done it yet because I'm lazy. But I'll do it (soon). Before the end of 2024. Trust.
>>
>>102698286
I trust you.
>>
>>102698286
I only trust in results, nigger.
>>
>>102698286
it's not like I trust you or anything baka
>>
>>102698286
I do not 'trust'. Show your work or get the rope.
>>
File: file.png (1.45 MB, 1024x1024)
1.45 MB
1.45 MB PNG
>>
File: file.jpg (37 KB, 399x388)
37 KB
37 KB JPG
>>102698472
>>
>>102688915
>>102698472
where is the 31 days image?
>>
>>102698472
Imposter !!!
>>
File: file.png (498 KB, 640x640)
498 KB
498 KB PNG
>>102698493
happy now?
>>
>>102698523
Better.
>>
>>102698523
Worse.
>>
>>102698286
if you're lazy tell just chatgpt what to do
>>
I loaded nemo after 2 weeks of giving up on LLM cooming and holy shit it is all so bad. I can actually believe people saying that mythomax is the best because I can't believe the current best thing in that range is this fucking bad. Safety won. Biowhores won.
>>
>>102698799
Man, it's like reading a porn script when you're not horny. It's very cringe
>>
>>102698799
you loaded the instruct version and you're whining?
>>
>>102698523
Hot Petra
>>
>>102698799
Same here, I occasionally try to get into LLM storywriting again, generate a few sentences, roll my eyes and remember why I gave up last time.
>>
What do you guys think of Mistral-Nemo-Gutenberg-Doppel-12B-v2-GGUF? Is it decent for local?
>>
>>102698948
>>102698948
>>102698948
>>
>>102698839
Base instruct and some shittune. It is all basically the same.
>>
dead thread it's over for local
>>
>>102699888
nice trips



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.