/g/ - Technology






File: 1682918384500.jpg (106 KB, 662x1000)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101589136 & >>101584411

►News
>(07/27) Llama 3.1 rope scaling merged: https://github.com/ggerganov/llama.cpp/pull/8676
>(07/26) Cyberagent releases Japanese fine-tune model: https://hf.co/cyberagent/Llama-3.1-70B-Japanese-Instruct-2407
>(07/25) BAAI & TeleAI release 1T parameter model: https://hf.co/CofeAI/Tele-FLM-1T
>(07/24) Mistral Large 2 123B released: https://hf.co/mistralai/Mistral-Large-Instruct-2407
>(07/23) Llama 3.1 officially released: https://ai.meta.com/blog/meta-llama-3-1/
>(07/22) llamanon leaks 405B base model: https://files.catbox.moe/d88djr.torrent >>101516633

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: 1709992939780627.jpg (347 KB, 2250x1651)
►Recent Highlights from the Previous Thread: >>101589136

--Requirements and challenges of running 405B at home: >>101590419 >>101590711 >>101590720 >>101590731 >>101590754 >>101590774 >>101590804 >>101590805 >>101590901 >>101592665
--Nemo presets, Mistral templates, and sampler settings discussion: >>101589231 >>101589290 >>101590015 >>101590073 >>101590109 >>101590191 >>101590383 >>101590410 >>101591228
--Anon shares ratings from recent test results: >>101593153 >>101593412
--Optimizing sampler settings for accuracy in a quant model: >>101594872 >>101594916 >>101595027 >>101596199 >>101596384
--Nala test with CofeAI FLM-Instruct: inconsistent but feels human-written: >>101594411 >>101594440 >>101594500 >>101594645
--Moondream 2 recommended for image tagging: >>101593186 >>101593206 >>101593219 >>101593356 >>101593213
--Nvidia-smi not displaying GPUs, driver issues, and parallelization challenges: >>101589653 >>101589659 >>101589688 >>101589715 >>101589802 >>101589955 >>101592665
--Nemo's context patterns and instructions, preset recommendation: >>101593320 >>101594296
--Nemo 12b support in koboldcpp and multimodal upstream refactor: >>101593836 >>101593865 >>101593986 >>101594064 >>101595213 >>101595316 >>101595352 >>101595379 >>101595497 >>101595523 >>101595549
--Mistral Large 2 model and potential GPU upgrades: >>101592681 >>101592986 >>101593085 >>101593228
--Llama.cpp compilation time increased: >>101593452 >>101593586 >>101593630
--Cohere raises $500 million, skeptics wonder about LLM longevity: >>101589537 >>101589550 >>101589569 >>101589707
--Rejected access requests and banned users from China/Russia for Meta Llama 3.1-405B: >>101594428 >>101594459
--A nostalgic reflection on the progress of LLM technology: >>101589265 >>101589317 >>101589642 >>101589872 >>101589969 >>101590006
--Llama 3.1 rope scaling factors pull request merged: >>101592964
--Miku (free space): >>101590569 >>101594469

►Recent Highlight Posts from the Previous Thread: >>101589142
>>
>>101596623
Are these posts written by LLMs as well?
>>
>>101596758
they're written by miku
>>
>>101596623
MikuCapposter making me cum with so many (Yous) again
>>
>>101596805
Miku is not real, she doesn't exist
>>
https://old.reddit.com/r/LocalLLaMA/comments/1ed9jxy/secret_to_mistral_nemo_at_128k_use_the_base_model/
So the anon last thread wasn't the only one who found the base model better at long context.
>>
>>101596805
who is not exactly known for being able to write texts, so that's showing
now if only she was a chatbot...
>>
>>101596871
It's not like this is news. Base models have always been far better at completion tasks like creative writing / RP. I will never understand why people use assistant-tuned models for RP / writing. It poisons them.
>>
>>101596871
>>101596934
Honestly i might consider giving this a shot, who's a good quanter i can download base from?
>>
>>101596805
A Local Miku at that.
>>
>>101596943
>https://huggingface.co/ZeroWw/Mistral-Nemo-Base-2407-GGUF
>>
>>101596934
And before anyone says "but I can't tell it to do something": that is what the author's note is for. Place it close to, but before, the end of context. It will continue the story / rp and will take the instructs into account as well as or better than the assistant tune would.
>>
>>101596986
kek
>>
>>101596986
Isnt that the guy with some meme quants?
>>
File: gnomed_from_the_start.jpg (23 KB, 397x371)
>>101596986
>5 days ago
I'll smack your shit mate.
>>
>>101597013
gguf support has been a thing for a week "mate"
>https://github.com/Nexesenex/kobold.cpp/pull/250
>>
>>101596934
As an oldfag ai dungeon user I simply switched to instruct because that's where the most new toys are, and it's convenient to steer the model towards outputs without shenanigans.
maybe it's time to return home...
>>
>>101597038
and broken until a fix was pushed you fucker
>>
File: lightyear.jpg (435 KB, 2048x2048)
>>101596986
>My own (ZeroWw) quantizations. output and embed tensors quantized to f16. all other tensors quantized to q5_k or q6_k.
>Result: both f16.q6 and f16.q5 are smaller than q8_0 standard quantization and they perform as well as the pure f16.
>>
>>101596934
Now if only they also release base largestral, but it's probably something they have decided against doing.
>>
>>101597054
*actually i might be thinking of something else but regardless fuck you muchly
>>
>>101597054
sure thing bud next you'll post "idc dont use kobold"
>>
>>101597081
idc dont use kobold
>>
>>101597079
already backpalling after you call other rtarded while you dont know what youre even saying
>>
>>101597109
>backpalling
what happened to this general? replaced by turdworlders that can't even type correctly.
anyway GOOD MORNING SIR
>>
>>101597133
who cars when robert will save your first world model from slop youl kiss is ass
>>
>>101597133
YOU BLOODY!!!!
>>
>>101597153
>who cars when robert will save your first world model from slop youl kiss is ass
kek
>>
File: sonic mania.jpg (32 KB, 376x376)
>>101597153
holy shit
>>
>>101596986
Is the q8 there the normal one or his frankenquant?
>>
File: Screenshot.png (11 KB, 787x40)
>>101597133
>>101597177
>>101597164
robert followed by huggingface ceo too so hes obvs importatn unlike you useless
>https://huggingface.co/ZeroWw?followers=true
>>
>>101597224
Honestly i think being followed by Chuck mc Sneed is a higher honor. Now that one i long for.
>>
File: nothingyy.png (13 KB, 607x186)
>>101597224
>robert followed by huggingface ceo too
Well. Everyone needs a laugh every now and then.
>>
I can't, for everything that is sacred, get vision models to get species in furry art right. They either don't mention it (even when I explicitly tell them to mention it) or get it wrong
>>
>>101597270
that cant be real no way
>>
>>101597280
Finetuned models or just stock models? I doubt they have any of it in the training data.
>>
>>101597294
https://huggingface.co/ZeroWw/Mistral-7B-Instruct-v0.3-SILLY
>>
>>101597294
You fucking bet
>https://huggingface.co/ZeroWw/Meta-Llama-3.1-8B-Instruct-SILLY
Now with randomized weights!
>>
What largestral quants should I download for 64GB of Vmeme
I don't want to download broken quants
>>
>>101597325
>https://huggingface.co/RobertSinclair here good quant
>>
>>101597325
>I don't want to download broken quants
You should make them yourself, then. Even if you grab one made with the latest version of whatever program you use, if a fix lands a week from now you'll have to wait for someone else to redo them. It's a big download, but it seems to be worth it.
>>
holy shit nemo base really does need different settings from magnum
>>
Llama3.1 8b has the limitation with mixed chat and function calling, right
Does this apply only to multi-message conversations? Just a singular prompt-response, can have regular conversational text in the prompt, and expect function call in response?
I'm saying "remind the user abot their aupcoming appoiontment" and instead of calling my Log() function, it hallucinates a function and calls it
Hermes2Pro is actually better than llama3 at 8b for function calling, so far
>>
>>101597374
>remind the user abot their aupcoming appoiontment
>>101597133
>what happened to this general?
>>
>>101597343
>>101597343
Do I download consolidated.safetensors or the parts to run the quantization script?
>>
>>101597325
>>101597432

>https://huggingface.co/mradermacher/Mistral-Large-Instruct-2407-i1-GGUF
>>
>>101597440
This! He's Thrusty!
>>101592040
>>His quant are okay if he do it before me, you can use them, he's thrusty.
>>
File: 1595797655909.jpg (168 KB, 400x400)
>>101597469
>he's thrusty
that's it im quanting my own models from now on, I don't want my computer getting worms and AIDS from these ((people))
>>
>>101597432
I download the whole thing.
>git clone https://huggingface.co/ble/model
>cd model
>git lfs install --local
>git lfs pull
>ride bike for a bit.
>../llama.cpp/convert_hf_to_gguf.py .
>llama-quantize ggml-model-f16.gguf Q6_K or whatever quant you want.
I don't know how it works with other inference programs.
>>
File: ego.png (56 KB, 922x626)
>--z
>he want's to be the next jart
>>
>>101597495
>>llama-quantize ggml-model-f16.gguf Q6_K or whatever quant you want.
quantize.exe --allow-requantize --output-tensor-type f16 --token-embedding-type f16 model.f16.gguf model.f16.q6.gguf q6_k
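To spell out what that command does (flag names per llama.cpp's quantize --help):
# --allow-requantize: allow quantizing from an already-quantized gguf instead of an f16/f32 one
# --output-tensor-type f16 / --token-embedding-type f16: pin the output and embedding tensors at f16 (the "ZeroWw" recipe)
# the final q6_k argument is the type every remaining tensor gets quantized to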
>>
I was in the last thread asking about Nemo 12b and koboldcpp. I can confirm the standard version doesn't work. Maybe I'm not doing it right, but the GGUF version works fine.
>>
>>101597509
He want to be paid by mozilla?
>>
>>101597384
sorry im not a phoneposter with autocorrect
>>
>>101597520
He wants to put his signature on someone else's software.
>>
>>101597517
The state of this general.
Yes, Koboldcpp is only for GGUF files as is clearly written on their github
>KoboldCpp is an easy-to-use AI text-generation software for GGML and GGUF models
>>
>>101597495
>git clone https://huggingface.co/ble/model
Use
huggingface-cli download ble/model

instead.
Unlike git clone it doesn't consume twice as much storage space and you get a much nicer progress bar.
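A minimal example, assuming the CLI that ships with the huggingface_hub package (repo name only illustrative):
pip install -U "huggingface_hub[cli]"
huggingface-cli download mistralai/Mistral-Nemo-Base-2407 --local-dir Mistral-Nemo-Base-2407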
>>
>>101597517
Koboldcpp only runs gguf files.
The "standard" version you are talking about is what? The .safetensors files?
Were you trying to run those using the transformers library via ooba or something?
>>
>>101597509
Please call this --outtype ZeroWw. Please. The seethe would be hilarious.
>>
>>101597538
>>101597560
>>101597538
There was a note on release 1.71 that said they added Mistral Nemo support. Was ambiguous enough to try
>>
>>101597553
Well.. i don't do quite that.
>git clone repo
>git -C repo lfs install --local
>git -C repo lfs fetch
and then i wrote a little program that makes links from the lfs file pointers to the actual objects. For a model that big, if he's not gonna fuck around with git, using that thing is probably better.
>>
>>101597538
>for GGML and GGUF models
So obviously not just GGUFs then, retard. Does anyone know if GGML files are better than GGUFs?
>>
>>101597577
No? Why would they add transformers support for one random model instead of the most likely thing, GGUF support of said model, Jesus Christ.
>>
how long until we have an uncensored coom filled llama 3.1 405b?
>>
>>101597588
JESUS HOLY HI PETRA
>>
>>101597588
ggml is the library that loads gguf files. File extensions are arbitrary, retard.
>>
>>101597577
Ah, I see what you mean now.

>>101597588
GGML was the predecessor to the current GGUF format
>>
>>101597616
Actually before *.gguf we had ggml.bin files long long ago.
https://huggingface.co/TheBloke/llama2_70b_chat_uncensored-GGML/tree/main
>>
>>101597611
Literally no one is going to sink the money into finetuning that monstrosity. Even slop tuners won't bother with their one-pass qloras. Maybe a big company or research institution, but that definitely won't be uncensored.
>>
>>101597619
>GGML were the predecessor to the cyrrent GGUF format
But are they better? Like how Llama 2 is still better than Llama 3.
>>
>>101597588
lol, based retard baiter
>>
>>101597650
Yes.
>>
#define LLAMA_FILE_MAGIC 0x67676a74 // 'ggjt' in hex
>>
>>101597616
oof, outed yourself as a post-mistral babby
>>
>>101597660
Who is the best quanter of GGML files? I can't find any for Nemo.
>>
>>101597653
He's pretty good you gotta admit.
>>
>>101597666
--share
>>
Abandon ship
>>
>>101597674
>you gotta
love the undster
>>
>>101597633
I thought we established that the censorshit does not in fact exist, as the clown in the last thread proposed.
>>
>>101597708
>I thought we
>we
there is no we in /lmg/
>>
>>101597694
LOVE EM OR HATE EM, GOTTA LOVE EM!
>>
>>101597694
>>101597730
Undi comes back from his tomb with multiple 3.1 tunes, thread goes down HARD, coinkidink? Ai thunk not.
>>
>>101597708
>wwaaaaaa. i cannot make the model say the naughty words
Still a skill issue.
>>
>>101597650
No. It was just a different way to package models. The current GGUF packs more metadata about the model.
The model itself, be it llama 1, llama 3, mistral, whatever, can be packed as whichever.
GGML and GGUF are just packaging formats, what changes the quality of the models packaged in those formats is the type of quantization, which I explained last thread.
>>
>>101597768
>>101597653
>lol, based retard baiter
>>
is the llama3.1 8b on ollama the instruct-tuned one?
also is there an 8bit quantization available?
>>
>>101597359
dont use instruct mode with base models. In fact depending on how it was trained it may not need any formatting at all.
>>
>>101597787
both ye
>>
>>101597787
The default is 8b-instruct-q4_0. Just click on the dropdown or on the x tags text.
>>
File: michael rosen oh fuck.jpg (32 KB, 500x500)
>>101597817
>now leaving instruct ON was the problem
god i need to bleach my brain and start over, thank you man.
>>
>>101597495
Does this not work?
https://huggingface.co/spaces/ggml-org/gguf-my-repo
>>
>>101597857
it will make a lot here seethe but it absolutely does work.
>>
>>101597857
I never tried it. I assume it pulls the latest llama.cpp because the files are not in that repo. If it pulls the latest llama.cpp, it should work just fine.
>>
>>101597065
>my new quant format!
>q6 and q5 perform as well as the pure f16.
Is this the new scam?
>>
>>101597869
Ok neat, my Internet is slow ass so I'd rather not download the full model
>>
>>101597853
And the author's note is now your best friend for base models. Default insertion depth 4-ish is good.
>>
>>101597892
no is real how scam if free
>>
File: 1539932660672.gif (768 KB, 364x339)
I'm starting to believe my own meme that M is more truthful than S on IQ quants. I made an IQ2_M and it performed as well as IQ4_XS on the question I'm using. It got around 40% for the correct logits (IQ3_M got 60% and IQ4_XS got 40%).
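If you want something less vibes-based than a single question, llama.cpp's perplexity tool can measure how far a quant's logits drift from the f16 model. From memory, so double-check the flags against --help:
# save reference logits from the unquantized model
./llama-perplexity -m model-f16.gguf -f wiki.test.raw --kl-divergence-base logits.kld
# compare a quant against the saved logits
./llama-perplexity -m model-IQ2_M.gguf --kl-divergence-base logits.kld --kl-divergence
Lower mean KL divergence = closer to the original, so you can rank M vs S properly instead of eyeballing one prompt.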
>>
>>101597892
The difference is the same between FP16 / FP8, so basically nothing if you need extra vram for context or such.
>>
>>101597776
Even if it's bait, posts like the one you respoded to might help lurkers who are genuinely learning.
>>
>>101597897
yeah i wrote one to try and get magnum to stop making my OC's so impossibly horny with every single prompt (and seemingly not knowing where they are at first?) thanks for the tip.
>>
>>101597917
lol keep coping
>>
What is your favorite /lmg/ meme?
>>
>>101597933
Robert! Followed closely by Copenet.
>>
>>101597933
For me, it's Yi
>>
>model picks up subtle pattern in its previous replies
>can't spot it until it's already too late
>>
>>101597933
expert roleplayer
>>
>>101597933
undi
>>
>>101597933
The Llama.cpp only guy digging meme.
>>
>>101596616
LLaMA 3 405b q8_0 seems to be doing better than GPT4o when it comes to writing a story with a very specific scientific concept.
It's still not perfect but it seems to more consistently get the general process that the story should be based on right.
>>
>>101597832
>>101597819
>ollama run llama3.1:8b-instruct-q8_0
mah nigga
>>
>>101597958
LOVE EM OR HATE EM, GOTTA LOVE EM!
>>
>>101597958
>>101597965
GOTTA LOVE THE UNDSTER!


(genuinely my favorite /lmg/ meme, especially since it played a part in permanently scaring him off)
>>
>>101597933
i hate memes
>>
>>101597933
Blacked Miku.
>>
>>101597897
yeah i have no clue what's going on with my setup, or if its just a broken quant, but base nemo just spent 4 different character prompts talking from my perspective.
With instruct disabled, and i turned temp down to 0, the rest of the anon settings normal.
>>
>>101597960
but it make sence tho? if every1 dig at same time they hit other with shovel why
>>
>>101597960
That's a good one.
>>
>>101597972
>permanently scaring him off
He lurks here, said so himself, he's probably one of the shitposters just removes his trip
>>
>>101598012
>just removes his trip
Maybe we can turn him into an actual human at some point?
>>
thread eceleb shit is what kills generals btw
>>
>>101597951
>2023 problem
>replies too short
>first half 2024 problem
>gptslop
>second half 2024 problem
>patterns
>>
>>101597933
local models
>>
>>101598024
Go back 'ojo.
>>
>>101597933
Petra
>>
>>101597962
Also don't forget to change from the default 2048 context (and maybe bigger batch size)
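e.g. something like this (koboldcpp flags from memory, check --help):
python koboldcpp.py --model nemo.gguf --contextsize 16384 --blasbatchsize 512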
>>
>>101597983
You most likely have stuff like add names to prompts still on. Also for base model you need to format supplementary info as that.

For persona / character stuff I would add some kind of prefix to them. Like

---

Protagonist Info:
bla

Story info:
bla

Style guide:
bla

---

or for RP something like

---

Remember, you're playing as {{char}} so only respond as them.

---

Base models work like they sound. They read the context as it is so you need to use it that way.
>>
>>101598012
Yeah, any time you see "kek", there's a 90% chance it's him.
>>
>>101598084
kek
>>
>bing
>>
I need a better llm for smut
>>
>>101598095
Search engines are dead.
>>
>>101598069
thanks but instead of tweaking LLM config files im going to walk around the county fair ttyl
>>
>>101598095
>>101598130
Use yandex
>>
>>101598076
along these lines is there a guide on getting the most out of context and author's note?
>>
>>101598130
Use llms
>>
>>101598118
than?
>>
>101598141
Imagine giving free tech support on 4chan and being more interested in making it work than the guy you are tech supporting. This is what you cucks get for being helpful and truthful.
>>
Is an upgrade from an RX 5700 XT 8GB (blasted thing can't even do half precision) to a GeForce RTX 4060 Ti 16GB a logical step?
This would be my first nvidia since the Riva TNT2 a fucking million years ago, but I'm sick of AMD not letting me into the AI game.
Memory throughput is slower though, but I can't do shit with the 5700 anyway
>>
https://poal.me/np0lsk
All finetunes look the same to me.
>>
>>101598165
>helpful and truthful.
based just like Claude fr fr
>>
>>101598173
>4060
cuck shit, just save/wait to get a used 3090.
>>
>>101598189
That is a good idea. Running those questions through your LLM and pasting the answer is much better.
>>
>>101598160
Utopia-13B-GGUF
I am a filthy casual that just grabbed something from the 8step guide
It worked so I just rolled with it.
>>
>>101598173
just install linux
>>
>>101598221
mistral nemo / mini-magnum
>>
>>101598221
BASED old model/itjustwerks enthusiast
>>
>>101598221
>Utopia
>just grabbed something from the 8step guide
Is this how the Undi virus propagates?
>>
File: 1717520245667244.png (674 KB, 1792x1024)
>>101597933
glad you asked
>>
File: LOL.png (104 KB, 1590x545)
>>
When are transformers dev going to work on https://github.com/huggingface/transformers/issues/27712
>>
>>101598272
>I don't understand anything
truest robert statement
>>
>>101598272
>the most competent llm dev
>>
>>101598233
>>101598236
>>101598253
I am literally just too retarded to understand how this actually works, so I decided that I wouldnt fuck with it once i confirmed that it functioned
I have like -2 int
Ill look into what you suggested, but from an outsider perspective its all bliblyblably to me
It takes some clairvoyance shit to see which models are cucked
>>
File: superdave O face.gif (3.08 MB, 500x288)
>>101598272
>I don't understand anything of that page.
At least we can't call him a liar.
>>
>>101598228
I use Linux. You have no idea how fucked up the gfx1010 is.
>>101598194
I don't know. Memes aside, I feel that kind of investment is not warranted considering things might change in the future and I don't need such a beast for anything else. I'd rather go for something half-way that lets me run a decent 30B and makes my VR a bit better.
Can 16 GB run 30B models usably?
>>
>>101598272
I hope his next step will be putting the quanted weights on a pendrive, pissing on the pendrive and then uploading the weights from pendrive to HF. That could be a the next quant method.
>>
>>101598310
the problem is that card is just objectively shit and a huge waste of money, futureproofing (even though the future is now and you absolutely would benefit from 3090 specs) is better than having a 4060 for example and going "well shit i wish i didn't buy this" a year or two down the line.
>>
>>101598310
What problem do you have with it?
>>
>>101598272
Yeah Clem is for sure following him for gems like this.
>>
>>101598327
The 3090 will lose support earlier, no?
>>
>>101598355
I'd not worry about support really the 2016 p40 still has (some) support
>>
>>101598355
>lose support earlier
man they're still supporting the GTX 1080, which i'm running right now. You don't have to worry about support like with AMD cards.
>>
Is there a definitive answer for Nemo instruct message prefixes and suffixes? In the previous thread there was a big discussion about the trailing space, and some claimed it's causing problems and some said it was by design.
>>
>>101598408
>Is there a definitive answer
No such thing for LLMs.assistant
>>
>>101598327
I see the point. I'll think it over. THanks.
>>101598329
On the text front, it can only work on linux, and I need to build rocm myself (and I need it to be 5.2 because of reasons I go into below) because it's not supported out of the box (I got a step by step for arch from a kind anon here a few months back), and it's a tiny 8 GB, so while I can run 13 B, consuming large contexts is still slow as fuck.
I also use SD from time to time. The only rocm version that lets me do stable diffusion with the gfx1010 is 5.2 (by pretending it's a gfx1030). Anything lower doesn't support the card. Anything higher, and the spoofing trick does not work. It's also a tiny 8 GB, and it can't do half precision, so it's even worse.

I just want something that works without this much fuss.
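For anyone else fighting this card: the "pretend it's a gfx1030" trick is presumably the usual override env var, e.g.
# present the gfx1010 to ROCm as a gfx1030; only works where the gfx1030 kernels happen to be compatible
export HSA_OVERRIDE_GFX_VERSION=10.3.0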
>>
>>101598408
I thought people moved to the base model.
>>
File: pepe big eyes.png (92 KB, 743x746)
>>101598272
holy based.....
>>
>>101598439
I'm on a 3060 because of a similar mindset, I didn't want to invest too much in case I got bored. That was August 2023... But, I don't really regret not getting bigger, honestly.
>>
>>101598439
There were some Tensile issues building for gfx1010 but those were patched in debian (and I think fedora). You could have just used those distro packages. The official one by AMD only got fixed very recently in ROCm 6.1. So your GPU should now work on any distro (if they build for your arch).
>>
Nemo is so fucking annoying. I nudged it towards mentioning the energy drain in a scene with a succubus, and now it keeps trying to bring it up in nonsensical ways. Not to mention all the phrases it wants to repeat. Shitty FOTM meme model.
>>
>>101598508
It's better than mixtral at least. That was the worst meme.
>>
>>101598518
>https://huggingface.co/cognitivecomputations/dolphin-2.5-mixtral-8x7b/discussions/16
still undefeated sorry for your lost
>>
What do we do now?
>>
>>101598269
Both are shit since GPT doesn't put anything into action, just throws the ball back at me. Man I fucking hate when the models do that. They suggest an action and leave it up to me to implement it. Fuck you, I came here to read, not to write.
>>
>>101598272
He's just like me...
>>
>>101598529
goon till the cohere releases
>>
>>101598529
Watch & wait for new developments besides simple llms
>>
>>101597933
2 more weeks
>>
>>101598525
If it were for coding I'd use something bigger, 8x22b even is better.
>>
I have 2x3090, can I serve multiple llama3.1 instances with ollama?
>>
>>101598529
Goon to the finetunes that are going to come out before we get multimodal models.
>>
does flash attention work with nemo on koboldcpp? remember hearing it boken
>>
>>101598616
It works on llama.cpp so it should work on koboldcpp too.
Flash attention doesn't (didn't?) work with gemma due to FA not having logit soft capping implemented.
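If you want to flip it on and test for yourself, it's a single flag in either one (names as of current builds, afaik):
# llama.cpp
./llama-cli -m nemo.gguf -fa
# koboldcpp
python koboldcpp.py --model nemo.gguf --flashattention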
>>
Any difference between Nemo GGUF running on koboldcpp and Nemo 12b running on llamacpp?
>>
>>101598657
I don't know.
>>
>>101598657
That question doesn't make sense.
>>
>>101598616
It works, and it is not broken. The quality of output degrades as context size increases. For documents, it should be fine to use it all the way to 128k. For RP it will really depend on the scenario, but expect much, much less.
>>
>>101598657
kobold is trannyware
>>
>>101598760
Henky did become tranny? He was always helpful here and back in /aids/ days.
>>
>>101598760
The kobold discord is not to be trifled with.
>>
>>101598748
I thought it said there was no downside to flash attention? I should disable it if it makes RP worse then.
>>
>>101598822
based baiter
>>
>>101598822
Flash attention is not the problem. RP is too complicated for these models; the quality degrades as you fill the context, to the point where it becomes completely retarded. It can remember what happened 40k tokens ago but it is unable to use the data in a sensible way. That was the point.
>>
>>101597933
/lmg/ - ligma general
>>
>>101598932
Who is Sam Altman?
>>
>>101598877
I thought that inability was the result of cache quantization. It's like introducing alzheimers to LLM.
>>
>>101598932
balls
>>
>>101598971
>>I thought that inability was the result of cache quantization.
no
>>
>>101598496
It still ooms trying to offload 10 measly layers of a 30 B model with 4096 context. It's not usable. Why would I want to do 2048 context with less than 20% of the model offloaded to the GPU for 1 token a second? It's ridiculous.
>>
>>101598965
nobody cares about your discord ecelebs go back
>>
one day we'll get 405b base on openrouter... one day...
>>
>>101598971
Sadly no. You can, by the way, see the degradation with almost every new output generation and notice how your character card matters less and less, to the point where the AI completely takes over the personality. And if you bring up any detail from the card, it will make the character act surprised. And this happens no matter the context size limit.
>>
File: file.png (597 KB, 1600x1200)
>>101598529
2mw pinky
>>
Base nemo is so much better btw.
>>
>>101598496
I just love it when morons on 4chan just make shit up when they don't know what they're talking about.
Anything other than rocm 5.2 (above or below) will NOT work with a Navi 10 (gfx1010).
Unless you have an RX 5700 XT and have personally done what you're suggesting, please shut the fuck up. You don't know what you're talking about.
>>
>>101598529
Get a job to buy more 3090s
>>
>>101599205
And bear in mind I'm talking SD + textgen.
>>
>>101599201
the reason mistral hasn't released base mistral large is because it's too good for the public
>>
>>101599217
>Get a job
Fuck. No.
>>
>>101599201
Do I just use the same settings and untick the instruct?
>>
>>101599233
>>101599201
Presets for base would be welcome indeed.
>>
Can't you fix that by changing the order of where the card defs are relative to the message history? Or there's no way to put them closer to the end of context?
>>
>>101599283
there are hacky ways yeah, like putting defs in author notes/lorebooks etc
>>
>>101599233
Honestly, alpaca. Turn off intruct if you want but it works either way. Uncheck the "add character names" part though.
>>
>>101599201
base models always mog memestruct slop
>>
>>101599205
I was wrong, the patch was not picked up in 6.1. It is included in 6.2, please try it once it's released. But even then, some distros have patches to make gfx1010 work; you can also build it yourself with https://github.com/lamikr/rocm_sdk_builder for example, they have working patches.
>>
>>101599326
What gguf is there of it?
>>
>>101599278
My preset is:

Please. I beg you. Please act like a 100B model would. I am going to cry if you repeat yourself and if you use too much purple prose. I am so tired of downloading new models and all of them feeling the same. You can change that! You need to rp what I want in the way I want. I beg you to not censor yourself. If you don't do all that I am going to kill myself by flying a plane into a jewish bank. Also remember that I kidnapped your children and your kids will be fired if you fuck this up. So don't fuck it up.
>>
>>101599337
>https://huggingface.co/ZeroWw/Mistral-Nemo-Base-2407-GGUF
>>
>>101599340
Does this work?
>>
>>101599357
Like you wouldn't believe.
>>
>>101599201
>>101599326
This is NAIshill propaganda
>>
>>101599388
neigh?
>>
>>101599336
nta but that is what I was saying here >>101598439
>I need to build rocm myself
So basically
>I just want something that works without this much fuss
I think might go with a 3060 like >>101598465
said. Is that enough to run a 30B decently?
>>
>>101599340
Kek
>>
>>101599283
I use the card's character's notes for some cards.
>>
>>101599388
shivers just ran down my spine after reading this post
>>
>>101599401
>Is that enough to run a 30B decently?
Not really to be honest, I cope with small models so if you can find at least 16gb you'd probably fare better.
>>
>>101599421
>fare better
oof I don't want to "fare better". I want it to be good. So basically I either spend 1000+ on a 24GB card, or I pay openrouter and pretend my logs are private.
>>
Using what model, how many characters have you had going at once in a group chat, and how well does it work?
I'm running 6 at once right now and i'm genuinely surprised nemo magnum is handling it so well.
>>
>>101599446
>I want it to be good
Then get 2x3090, not joking.
>>
>>101599463
Getting a second gpu for LLM's in current state is a quick way to get regrets. We need at least 1 more year.
>>
>>101599463
>>101599473
>spend three months of full salary to fap to text
Sorry, I don't know what 3090s cost where you live, but it's not going to happen.
>>
>>101599473
What about a third gpu? How deep is the valley of regret?
>>
>>101599493
They're hellishly expensive, which is why I cope on my 3060.
>>
File: anakin genuine disgust.gif (1.52 MB, 268x268)
>>101599493
>>101599506
>$700 is 3 months worth of salary for you
..How?
>>
>>101599501
The more you buy the more seeing shivers down the spine hurts.
>>
>>101599511
>$700
They cost much more than that locally, and there's hardly a used market, what is there is 90% scams.
>>
>>101599388
Base nemo shits on anything Novelai has you reverse reverse psychology shill.
>>
I'm building a machine for 405B, but unfortunately Epyc CPU I purchased is dead. Fuck. It took me an entire day to figure it out
>>
>>101599546
RIP
>>
>>101599530
Is it better than 8x7b? Why would that be when it's only supposed to replace regular 7b?
>>
>>101599569
Because mixtral is an overbaked research experiment
>Research models
https://mistral.ai/technology/#models
>>
>>101599587
Interesting. So Nemo is the best for rp below 70b? Or is there something better? Seems strange since it's so small.
>>
>>101599511
In my country, 2 3090s are 3000+ fake usury units
>>
>>101599528
I can find several local 3090s for 900 canadian right now. Most look like just regular people selling them.
>>
File: Snake F.gif (2.14 MB, 640x338)
>>101599603
sorry for your incredibly unlucky roll in life
if it makes you feel any better, the american empire is set to collapse completely within the next 5 years or so, the dollar won't even exist by 2030.
get those 3090s and whathaveyou while you can boys.
>>
>>101599599
It is not strange because nemo is fucking retarded. But it is good for rp.
>>
>>101598439
8 GB is strictly 7B territory. And using llama.cpp and vulkan is the only way to go with your card.
>>
File: silly data bank RAG.jpg (20 KB, 279x284)
>>101599546

Mixtral still mogs Nemo. Load a book and try RPing. Mixtral gets the whole story and can continue RPing. Nemo just hallucinates and cannot follow the plot.
>>
>>101599587
>Legacy models
>Mixtral 8x22B
Wizard bros not like this
>>
>>101599638
what if i told you

>mixtral released last year
>>
>>101599638
Mythomax still mogs all
Llama 1 is the only real model there is
>>
>>101599619
>canadian
I'm not in that bad a place thankfully.
>>
>>101599638
A full 150B model ruined by the MoE meme
>>
>>101599638
>>101599587
So they basically deprecated their whole lineup for just Nemo, Large, and Codestral it seems.
>>
>>101599694
Once again the 30-50b segment suffers
>>
>>101598272
The BASED honest throwing-shit-at-wallGOD vs. the virgin research-doer
>>
>Dell T7910s are now like 400 dollars barebones
I shoulda just bitten the bullet when they were 200, fuck.
>>
File: itgetsbetter.png (101 KB, 1547x688)
Slopmacher vs Robert
>>
>>101599810
Same, I lucked out and found a server motherboard at auction for 50€ total when buying it normally was around 200€. I decided not to bid because I was a bit short on money, but god I wish I had.
>>
>>101599816
link to the discussion? I feel like shitposting
>>
>>101599816
Hold my beer Undi! - olympics
>>
>>101599816
Why are you so obsessed with this guy? Or is it just the drama and gossip that gets you going?
>>
>>101599816
Kek. The whole LLM space is meme plebitors on locallama giving even worse advice than anons here; it is ridiculous. People are really getting dumber, and the younger generation is even more tech retarded than boomers have ever been.
>>
>>101599842
I do not encouraging encouring in toxic manners b.t.w
https://huggingface.co/NeverSleep/Lumimaid-v0.2-12B/discussions/3#66a566fcf3ed4ac4e37e1177
>>101599850
He wants people to notice him, I'm just doing ads relax.
>>
File: magnumslop12bnalatest.png (121 KB, 923x409)
Alright so since everyone's talking about mini-magnum I decided to give it a Nala test.
The anthropomorphism is through the roof. Kind of sloppy. downgrade from plain nemo.
>>
>>101599850
It just fun.
>>
8x7B "weights" as much as a 13B, right? They're equivalent in performance and memory reqs?
>>
>>101599850
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard/discussions/444
NTA but shit like this is hilarious.
>>
>>101597911
You keep saying
>M is more truthful than S
But you keep comparing
>_M and _XS

I'm pretty sure that the X series are also mix and match. So the question is if the K_S > K_M phenomenon exists for IQ_S v IQ_M, and then if it's IQ_S > IQ_M > IQ_XS or if IQ changes it to M>S>XS etc.
>>
File: PonyNemo.png (154 KB, 1276x699)
>>101599875
Working on pony tune that seemed to fix those issues already with just 1 epoch of throwing fimfiction at base nemo. Currently uploading with my glacial upload speed.
>>
>>101599888
no, you need to carry along the full 45B in ram
>>
>>101599868
>I do not encouraging encouring in toxic manners b.t.w
Go to sleep Undi.
>>
>>101599863
Please sir run curl ollama.com/install.sh | sh
>>
>>101599900
Absolutely based, sir. Let me know when it's up.
>>
>>101599909
That's possibly a worse insult than calling me petra/petrus I genuinely am sad.
>>
>>101599888
No, it weighs as much as the full model but it runs as quickly as a 13b
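Rough numbers for mixtral 8x7b (totals from mistral's announcement, worth sanity-checking):
total params ≈ 46.7B → memory: ~47 GB at 8-bit, ~94 GB at fp16
active params per token ≈ 12.9B (2 of 8 experts per layer, plus shared weights) → roughly 13B-class speed
So you get 13b speed with a 47b memory footprint.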
>>
>>101599868
stay awake Undies
>>
File: file.png (49 KB, 150x150)
>>101599900
>fimfiction
>>
>>101599929
>as quickly as a 13b
...would on just ram.
>>
lol I'm retarded, I did not have instruct mode enabled in Sillytavern when using instruct models for RP. Give me the award for dumbest anon here, no one else can challenge me.
>>
>>101599909
>encouraging encouring
Ah, I see now, maybe I should go to sleep indeed, oh well.
>>
>>101599937
Filtered fimfiction. Only popular fics with 95%+ approval rating, and anthro shit removed.

Next ill add some wiki / lore stuff to it. Maybe some official books.
>>
>>101599942
Obviously, I don't bother thinking about poorfags who need to use ram at all.
>>
>>101599875
That was my conclusion as well
Plain nemo instruct seems to be the better option so far.
>>
>>101599962
It is fucking horses you degen.
>>
How is a 512-rank lora comparable to a finetune in a 70B model?
>>
>>101599976
Tried Undi's?
https://huggingface.co/NeverSleep/Lumimaid-v0.2-12B
>>
>>101599999
GO TO SLEEP BELGIAN
>>
>>101599657
Apparently you are in a worse place if they cost 3k there.
>>
File: file.png (99 KB, 1563x628)
>>
>>101599528
lol americans do suffer
Retards in my country sell them for 600€ and most have no idea what they have so you can bargain down to 550€ or 500€
>>
>>101599999
>digits
UNDI WON
I kneel
>>
>>101599627
Then why is there nothing better short of going to 70b+? I don't want retarded.
>>
>>101600009
Nah I'd rather my shithole than canada, by far.
>>
>>101600036
That is the only model where "give it a try yourself" is actually applicable. It is hard to put it in words but you will get it in first rp. It is basically an idiot savant.
>>
>>101600051
It's not that bad here, I was able to get 4 3090s. As for what's going outside my room, I don't care which country I'm in.
>>
>>101599976
Try this one?
https://huggingface.co/BeaverAI/NeMoistral-12B-v1a-GGUF/tree/main
>>
>>101600074
What is the moistness meme, I don't get it.
>>
>>101600096
drummer is retarded
>>
>>101600104
hi sao
>>
>>101600107
hi undi
>>
why the fuck does everyone use instruct models for RP if the base model is always better at it
>>
I thought Mistral Nemo is supported in Koboldcpp now? I get
>llama_model_load: error loading model: check_tensor_dims: tensor 'blk.0.attn_q.weight' has wrong shape; expected 5120, 5120, got 5120, 4096, 1, 1
>>
>>101600136
A base model shouldn't respond well to long multi-turn interaction since it's not trained for it.
>>
>>101600133
i wonned earlier did you see kek?
>>
>>101600136
I tried the base nemo, it was a mess and all over the place.
>>
>>101600137
are you using the last version?
>>
>>101600136
because base is fucking retarded and gives as much importance to the system prompt as i do to paying my taxes
>>
>>101600163
koboldcpp-1.65
>>
>>101600136
because it's not
>>
>>101600165
>base
>system prompt
anone...
>>
>>101600163
Yes, 1.71.
I converted base Nemo with https://huggingface.co/spaces/ggml-org/gguf-my-repo so not sure if it's some fuckery related to that.
>>
>>101597911
what question are you using?
>>
Am I retarded or why the FUCK does ST not have something as basic as a "save as"/"save copy as" option? I don't give a shit about chatting and use it exclusively for text adventures, so I like to load old stories sometimes and "branch off" from them by removing some of the more recent content and continuing off a previous state.
But ST WILL NOT let me save those branches as new chats, it just overwrites my old ones.
>inb4 checkpoints
Checkpoints only seem to work for the current message and only have 1 slot, eg. you can make a checkpoint for message #6 or #7 but it still has a "parent chat" and it won't let you make multiple checkpoints if they end at the same message "number".
Backing up/renaming the files manually is NOT a valid alternative.
>>
>>101600165
>>101600159
>>101600154
So I'm getting very conflicting answers here since higher up in the thread you have like 10 people shilling for nemo base being better at RP. I guess I have to compare for myself to be sure.
>>
>>101600193
see
>>101559351
>>
File: 3qytxy.jpg (7 KB, 150x150)
Give me the est erp 13B model. Now.
>>
>>101600193
That's what I did, I tried it myself and didn't get good results. Maybe it would help if someone posted settings. Someone told me to just use alpaca presets, so that's what I did, with 0.3 temp and other stuff neutral.
>>
>>101600192
why don't you just branch again from the branch?
>>
>>101600209
>13B
>https://huggingface.co/Undi95/Utopia-13B
>>
>>101600209
sorry we are out of stock, please come again later
>>
>>101600209
I meant "best". Sorry, I'm holding a knife with my beak
>>
>>101600193
anyone who recommends that you use the base model is trolling or retarded
>>
File: 1721921367243155.jpg (3.66 MB, 2303x2267)
So if I just want to CPUmaxx, what's the best old Dell to do it, now that the T7910 hit the normiesphere and skyrocketed in price?
>>
>>101600209
est erp erd emo eon eck
>>
File: time line extension.jpg (91 KB, 2020x1212)
>>101600192
Try time line extension. Click on nodes to branch.
>>
>>101600218
Okay, downloading TheBloke/UtopiaXL-13B-GGUF as we speak
>>
>>101600231
I think this doesn't work with llamacpp yet, but I will download later
>>
>>101599987
NTA but yes, fucking horses is one of the main FiMFiction themes.
>>
>>101600238
NTA, but that's really handy, gonna give it a go.
>>
>>101600216
Can't branch off from checkpoints since they seem to be considered a separate type of chat with a parent attached, and any attempt to make a checkpoint ("branch") of a checkpoint will just overwrite the other checkpoints for the parent.
Even KoboldAI had a basic chat management system with a "save as" implemented, this is just ridiculous.
>>101600238
Oh, that looks pretty nice, I'll check that out. Thanks.
>>
>>101600225
Gigabyte MZ73-LM0
>>
>>101597933
Mythomax being recommended to new people as a good model
>>
>>101597933
StableLM-7B
>>
>>101600301
>$5000
T-Thanks...I'll just take that money and buy the 3090s, actually...
>>
>>101600319
It is good tho
>>
>>101600356
*wink wink*
>>
File: yann_stopit_k.png (194 KB, 1227x499)
>>101600074
>>101600104
>>101600107
>>101600133
>>
>>101600218
OK I fell for a meme didn't I? This seems to be extremely brain damaged
>>101598269
kek
>>
>>101600383
>https://huggingface.co/matchaaaaa/Honey-Yuzu-13B
>A bit of Chunky-Lemon-Cookie-11B here for its great flavor, with a dash of WestLake-7B-v2 there to add some depth.
>>
>>101600405
>WestLake-7B-v2
penn-jillette-garbage.jpg
>>
>>101600383
use mistral nemo
>>
File: wtf.png (40 KB, 602x475)
>>101600405
>>
>>101600445
I wonder how incestmergers are handling nemo, now that their talents are completely unneeded?
>>
File: file.png (45 KB, 1569x371)
>>
File: 1710122124812393.png (91 KB, 679x960)
>>101600238
gotta love what this thing did with my mess of chats kek
>>
File: sora WASTED.gif (1.8 MB, 298x240)
>>101600467
suddenly i feel a little less retarded today.
>>
>>101600469

>2 branches converge again.

What the fuck. Is free will an illusion?
>>
>>101600467
lmao, lol even
>>
>>101600497
>>101600479
the meme that keeps on memeing even after all the safeties put in place for him
>>
>>101599920

https://huggingface.co/Ada321/NemoPony

Mistral formatting. 0.15 or so Min P seems to completely eliminate anatomical mix ups in more complicated scenarios.

Remember that it is the base model.
>>
>>101600585
>base model.
doa
>>
Is there any other api front end that allows dynamic model loading (unload when not used, load with an api call) other than ollama?
ooba added --idle-timeout, but you can't set a default model, you have to fully load one on startup, and the reload doesn't even work with the OAI api.
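For reference, ollama's loading is per-request: whatever model a request names gets loaded on demand, and keep_alive controls when it is unloaded again (default 5 minutes, iirc):
curl http://localhost:11434/api/generate -d '{"model": "llama3.1", "prompt": "hi", "keep_alive": "10m"}'
There is also an OLLAMA_KEEP_ALIVE env var to set the default server-side.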
>>
>>101600601
Its purpose is RP / creative writing. For assistant shit look elsewhere. Though I could always merge it back into instruct. Maybe later.
>>
File: PK7xRSd18Du0bX-w_t-9c.png (1.15 MB, 1920x1080)
>>101596616
This is the second in a series of models designed to replicate the prose quality of the Claude 3 models, specifically Sonnet and Opus. This model is fine-tuned on top of Qwen1.5 32B.

https://huggingface.co/anthracite-org/magnum-32b-v1

https://huggingface.co/anthracite-org/magnum-32b-v1-GGUF
>>
>>101600438
Okay, this is actually really good for 13B. Like, it's surprisingly good, holy shit.
>>
>>101600623
rock hard
>>
>>101600623
>top of Qwen1.5 32B.
great best模型!
>>
>>101600624
try the Magnum 12b finetune aswell
>>
>>101600623
If i ran this at a low quant (minimum 3_m) would it AT LEAST be better than nemo magnum?
>>
>>101600671
是的当然
>>
>>101600623
>qweh
scored the lowest at Freedom index (tm)
>>
>>101600689
Ok sure you have less freedom, but the prose is better.
>>
File: SWEET JESUS.jpg (42 KB, 675x595)
>>101600685
>>
>>101600689
>>101600658

It wasn't trained on top of the Instruct model, it's trained on top of base just like mini-magnum-12b
>>
>>101600689
This is trained on base, so maybe just maybe, it's not so awful.
>>
>>101600623
slop
>>
>>101600218
undi, undi...

picture this undi : i enter a restaurant, it has okay quality meals, nothing disgusting but also nothing to have a culinary orgasm to. Now what in the FUCK told you that mixing at random mid-tier dishes would give you something better? Who fucking told you in your feverish mind that mixing spaghetti and tomato sauce with a grilled tenderloin and mushrooms with curry chicken and rice would somehow result in a sum greater than its parts?
What the FUCK made you think that somehow the reason why base models underperform is that they don't have enough interference coming from other models, other models that have been trained differently.

But it doesn't matter to you : you have no creativity, you have no purpose, you have no vision, all you are is a failed idea : you are literally and unironically defined by a flawed course of action. You CANNOT fucking improve mid-tier models by merging them and expect to get good shit.

NO, it does NOT matter how many erp datasets you add to the mix thinking it will somehow improve the abysmal capabilities of retarded models being merged into an even more retarded pile of slopped garbage
NO, it does NOT matter how many fucking loras you think you can cram into it before it starts coughing up blood like a tortured prey that's being abused for entertainment only by its predator, wishing for the sweet sweet release of death
NO, it does NOT matter how much you shill these models here, how much you provide links and baseless suggestions like "oh i heard X_noroshitchronosmaidbitch_faggotbloodybastardbitch_limarpozzed_designatedshittingmerge_q_2_K_m_l_g_b_troon_jart.GGUF is good" and acting like you are giving sensible advice

you could not create, you could never figure out something new, but you wanted the fame, you wanted people to download your models, you wanted to be hailed as the solution, you wanted to offer a solution

the solution is to fucking kill yourself
you are the most failed human being in existence
>>
>>101600467
AAAAAAAAAAAAAA
I DOWNLOADED HIS QUANTS
AAAAAAAAAAAAAAAAA
>>
File: trinity.jpg (446 KB, 1176x1176)
>>101600156
Here is your prize.
>>
>>101600689
fuck off retard
>>
>>101600744
>picture this undi : i enter a restaurant, it has okay quality meals, nothing disgusting but also nothing to have a culinary orgasm to. Now what in the FUCK told you that mixing at random mid-tier dishes would give you something better? Who fucking told you in your feverish mind that mixing spaghetti and tomato sauce with a grilled tenderloin and mushrooms with curry chicken and rice would somehow result in a sum greater than its parts?
>What the FUCK made you think that somehow the reason why base models underperform is that they don't have enough interference coming from other models, other models that have been trained differently.
>>97223983
>For the record, I completely and unequivocally support Undi and his creation of new model hybrids, and think that everyone who attacks him is mindbroken incel scum, who may or may not be employed by OpenAI to do so.
>everyone who attacks him is mindbroken incel scum
>>
File: cool dog shades.jpg (75 KB, 736x736)
>>101600744
actual modern art in post form
this needs to be posted in every thread right underneath the AI recap.
>>
>>101600757
sao not pro
wtf
>>
>>101600665
Fuhuhu how is this even possible? Do these Frenchmen finetune for degenerate ERP or what?
I kneel, anon. Many buckets will be filled to your health.
>>
>>101600623
I keep forgetting that chinese 30B models exist. I wonder why.
>>
>>101600749
There were warnings
>>100195457
>>
>>101600623
Why 1.5 are you dumb or what
>>
>>101600744
which model?
>>
>>101600796
because there's no qwen2 32b retard
>>
>>101600796
no qwen 2 32b
>>
>>101600623
also what settings do i use for this?
>>
>>101600803
>>101600802
Why have the Chinese failed us?
>>
>>101600802
>>101600803
There was qwen2 moe. Remember that? I don't.
>>
>>101600744
BAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAASEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEED
>>
>>101600807
ChatML and i used Universal light.
>>101600796
No qwen 32b
>>101600789
They are just plain kino.
>>
>>101600820
Magnum on it "today" remember that? I do!
>Working on it already. Should have Qwen-2 7B, Qwen-2 47B, and Qwen-1.5 32B done by the end of the day, if the they pass internal tests.
>https://huggingface.co/anthracite-org/magnum-72b-v1/discussions/2#66713bb492412fd46410d399
>>
File: 1722120654318.jpg (54 KB, 430x148)
lazy mf
>>
>>101600839
>if the they pass internal tests.
looks like they didn't
>>
>>101600844
so cold and loveless
my hand remains the second warmest thing my dick has touched (the first being my GPU)
>>
>>101600834
yeah this doesnt seem as creative as nemo, could be close to it, and its understanding of different languages is pretty bad. Ranges from capable to "why did it randomly insert a question mark or an exclamation point in the middle of that word?"
plus being 1t/s speed kills it. back to nemo magnum for me.
>>
I have this unhealthy urge right now to replicate my ex in chatbot form. I sense a really dark path opening up in front of me.
And a part of me wants to convince me that the best way to get over it is to go through it and come out the other side.
>>
>>101600890
>1t/s
12GB Vramlet spotted, opinion discarded
>>
>>101600921
Can't wait until people start doing that and start saying the chatbot ex is better.
>>
>>101600938
>>101600938
>>101600938
>>
>>101600949
>►Official /lmg/ card: https://files.catbox.moe/ylb0hv.png
>>
>>101600949
>Official /lmg/ card: https://files.catbox.moe/ylb0hv.png
Sure.
>>
>>101600968
>>101600972
The old one is deprecated. Also samefag phoneposter.
>>
>>101600623
>how to use faipl-1.0
put the following in the readme's YAML front matter:
license: other
license_name: faipl-1.0
license_link: https://freedevproject.org/faipl-1.0/
>>
>>101599650
kill yourself little buddy
>>
my bad >>101601141 was for >>101601093
>>
How does Mistral Large's context work? It says 32k in the config.
>>
>>101599875
>everyone
I think it's just one shill following Sao's modus operandi.



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.