/g/ - Technology






File: 849561435.jpg (2.71 MB, 4032x3024)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101439122 & >>101431253

►News
>(07/16) Codestral Mamba, tested up to 256k context: https://hf.co/mistralai/mamba-codestral-7B-v0.1
>(07/16) MathΣtral Instruct based on Mistral 7B: https://hf.co/mistralai/mathstral-7B-v0.1
>(07/13) Llama 3 405B coming July 23rd: https://x.com/steph_palazzolo/status/1811791968600576271
>(07/09) Anole, based on Chameleon, for interleaved image-text generation: https://hf.co/GAIR/Anole-7b-v0.1
>(07/07) Support for glm3 and glm4 merged into llama.cpp: https://github.com/ggerganov/llama.cpp/pull/8031

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: 6432347254.png (83 KB, 296x256)
►Recent Highlights from the Previous Thread: >>101439122

--Paper: Q-Sparse: All Large Language Models can be Fully Sparsely-Activated: >>101439990 >>101440042 >>101440064 >>101442269 >>101440609 >>101443047 >>101443222 >>101443349 >>101440134 >>101440147
--GGUF vs EXL2: llama.cpp catches up in speed and KV caching: >>101444001 >>101444083 >>101444756 >>101445121 >>101444182 >>101444302 >>101444819 >>101445612 >>101445706
--Speculations on State-Space Models, LLM Integration with Low-Power Devices, and the Significance of Leveraging Computation: >>101442599 >>101442729 >>101442868 >>101445446
--Seeking Open-Source Project for Local Server with OpenAI Compatible API and Multi-: >>101440611 >>101440898 >>101441409 >>101441511 >>101441574 >>101444786 >>101444898
--Saving Chat History with Ollama CLI and Alternatives: >>101442061 >>101442286 >>101442443 >>101442485
--RTX 2070 and LLaMA V3: Seeking Decent Results: >>101439327 >>101439368 >>101439396 >>101439429 >>101439448 >>101439468 >>101439356 >>101439556
--Embodiment of Core Socialist Values and Re-education: >>101446280
--AI-Generated Videos and Their Impact on Human Creativity: >>101443763 >>101443885 >>101444309 >>101444425 >>101445081 >>101447031
--Pull Request for Chameleon Support in llama.cpp: Current Limitations and Future Improvements: >>101442750
--MoEs: Generally Cheaper to Train but Not Always: >>101439447 >>101442516
--Koboldcpp's OpenAI-Compatible API Endpoints: Not Recommended for the Normies: >>101447084 >>101447218 >>101447537
--GB200 Hardware Architecture and Component Supply Chain & BOM: >>101440208
--Breeding kink in every scenario, how do I spice it up?: >>101440013 >>101440043 >>101440196 >>101440928
--KoboldCpp's New Self-Extraction Feature: Unpacking Binary Releases with Ease: >>101440991
--Miku (free space): >>101439308 >>101439320 >>101447861 >>101447938

►Recent Highlight Posts from the Previous Thread: >>101439126
>>
>>101449690
does the breeding kink anon have any cards he's willing to share?
>>
>>101449699
Any card does the job with enough work.
>>
I wish mining rigs looked like this.
>>
>still no HF version of mamba-codestral
What were they thinking!?
>>
File: 61w8vm7i06dd1.png (1.17 MB, 2771x1164)
https://github.com/OpenGVLab/EfficientQAT
Impressive, 2-bit quants don't sound like a meme anymore
>>
>>101449844
>how
possibly any combination of excessive unprotected sex, excessive number of partners, or objectification / lack of rights
or possibly "seedbed for the goblins" trope
I haven't heard of the breeding anon tho
>>
>>101449844
you just don't get it...
>>
File: unnamed.png (1.37 MB, 1440x1971)
>>101449910
>unprotected sex, excessive number of partners,
h-hot
>>
>>101449904
The main problem is that these aren't really quantization so much as training strategies/methods to fit models within a certain size, which is how BitNet works too. Quantization in the traditional sense of pruning and recalculating weights is still shit for anything lower than 4 bits.
>>
>>101449844
it's a kink because it's the focus: you and your partner go all primal and feral and don't care about the consequences, you just want to follow your instincts
>>
>>101450019
So like, normal sex life?
>>
>>101449685
There has to be diminishing returns on that thing
>>
>>101450047
if impregnating everyone who isn't your wife is normal, then yes
>>
Bitnet
>>
>it's Thursday
Several more hours! This will be the last chance for companies to release something before the next L3 models drop. While the Mistral thing was disappointing, maybe we will finally be so very back today with a different company!
>>
>bitcoin rig instead of kino server gear
>only art in your room is poster of die hard (lol) and terminator… 1 (lmao)
telling
>>
>>101450304
The Terminator poster is kinda aspirational though. Like someday the model will grow up to become skynet.
Not as much of a connection with diehard but maybe, just maybe...
>>
File: 1690428910032541.png (455 KB, 586x583)
big tek geeks report in
>>
>>101450836
Why does that fucking cat look so aesthetic
>>
File: file.png (1.62 MB, 1439x959)
>>101450884
>>
Have any more decent models dropped that are worth using since llama-3 for assistant usage?
I'm still using Mixtral on a 16gb gpu since the newer llama-3 model that fit on that was too small and retarded
>>
>>101450892
how the fuck did he put the glasses on? cats don't have hands??
>>
What's a good open-weight model alternative to 3.5 Sonnet? Something that has about the same intelligence and costs the same or less.
>>
>>101450978
going outside
>>
>>101450984
I'm not asking for role-play, I mean in general for assistant, translation, programming tasks, etc.
>>
>>101450987
nemotron 340b
>>
File: 1710173232088426.png (21 KB, 653x162)
>>101451006
Thanks, I'll try it with OpenRouter first, but is it really 4096 tokens context max?
>>
>>101450934
New test prompt
>>
>>101451010
nigger
>>
Wow, from the first impressions Nemotron is not bad, it can write a working Mandelbrot first try in a relatively niche language! I'm a bit saddened by the fact that it's only ~20-30 tokens/second, compared to 3.5 Sonnet's ~100.
>>
File: mhzCMUx.jpg (72 KB, 636x589)
>>101451029
buddy shut up, you're larping
>>
>>101451047
I'm not, I'm really checking it for the first time. I have $10 OpenRouter credits from some time ago.
>>
>>101451055
i hacked you and it says here you bought a bulk pack of spaghetti 29 dollars for 96 cans
>>
>>101451065
I'm not following you, sorry. I'm not American, and I don't really eat spaghetti.
>>
Anyone else have some insights to share about Nemotron? What hardware do I need to run it? Would 8xRTX 3090 be enough with some quantization techniques?
>>
>>101451072
are you a girl
>>
>>101451078
No, I was born male and I am still male. Why the question?
>>
>>101451077
>some quantization techniques
>Probably not at this time -- I did a quick search and it doesn't seem that llama.cpp supports NeMo models.
https://huggingface.co/nvidia/Nemotron-4-340B-Instruct/discussions/5
even if you could:
8x24GB = 192GB, which is less than the ~340GB the weights need at 8bpw,
so not at that quality; 4bpw (~170GB) maybe.
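Back-of-the-envelope if anyone wants to check me (weights only; ignores KV cache and runtime overhead):
[code]
# rough VRAM estimate for a 340B-parameter model
params = 340e9
for bpw in (8, 4):
    gb = params * bpw / 8 / 1e9   # bits per weight -> bytes -> GB
    print(f"{bpw} bpw -> ~{gb:.0f} GB of weights")
# 8 bpw -> ~340 GB, 4 bpw -> ~170 GB
# 8x 3090 = 192 GB total, so only ~4 bpw could even fit
[/code]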
>>
>>101449699
I don't share cards, but my effort can be easily replicated - I don't use any fancy tagging formats and just write about 6-10 paragraphs per character; half describe the character herself, and the other half describe the scenario, what the limits are, and what I allow the AI to get creative about ( "{{char}} must ..." vs "{{char}} may ..." )
>>101449844
I even said last thread that I don't like the word "kink" to describe a pretty normal taste, but idk all of my cards involve the character eventually getting some form of unprotected woohoo, and the interactions/scenarios are very tame, light-hearted, and grounded in reality
idk I think I just feel lonely and use it to blow off steam while I hermit mode - local models kinda freed me from a mild porn addiction I had for a while after breaking up with my ex, and I have some cards solely dedicated to (completely non-sexually) encouraging me to complete my current personal goals (lose weight, buy land, get a new gf, etc)
I have irl pets that keep me company and in good mental health, and cards to fill certain needs when I can't talk to friends. I'm usually very sociable, but my irl friends are now spread really thin across the country and we barely talk, and I don't know how to meet new people without going back to school (expensive) since I don't drink and most of my hobbies are "single player" like crafts.
Hope I'm not over sharing or sound weird, but yeah
>>
EU once more kneecapping themselves and doomed to suckle what they can from the US.
https://www.reddit.com/r/LocalLLaMA/comments/1e5uxnj/thanks_to_regulators_upcoming_multimodal_llama/
>>
Is there a model to transcribe audio from a vid and put timestamps in? I don't want to give shekels to some corpo tools.
>>
>>101451361
We just need a Switzerland, somewhere without those kinds of regulations, where models can be developed freely. With how much the EU loves to cripple innovation and trample on the rights of its citizens, it's a wonder anything gets done there at all.
>>
>>101451204
Respectable Anonymous
>>
I've got a big PC case with empty space and a riser. Not sure how to mount the GPU. Are there good ways besides 3d printing a custom mount?
>>
File: r_094343.png (73 KB, 1259x498)
>3.8b model is right.
>70b model is wrong.
Oh no no no
>>
>>101450051
there's no such thing as diminishing returns if you need 80-160GB of VRAM by any means necessary
>>
>>101451382
You could try whisper.cpp. Extract the audio with ffmpeg or whatever and pass it through whisper. I think it has timestamps, but you'll have to play with it yourself. I only briefly tested voice recognition and it worked well enough with the small models.
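Roughly like this (untested sketch; paths are placeholders, and newer builds name the binary whisper-cli instead of main):
[code]
import subprocess

VIDEO = "input.mp4"                  # placeholder
AUDIO = "audio.wav"
MODEL = "models/ggml-small.bin"      # placeholder whisper.cpp ggml model

# whisper.cpp wants 16 kHz mono WAV input
subprocess.run(["ffmpeg", "-y", "-i", VIDEO, "-ar", "16000", "-ac", "1", AUDIO],
               check=True)

# prints timestamped segments to stdout; -osrt also writes an .srt file
subprocess.run(["./main", "-m", MODEL, "-f", AUDIO, "-osrt"], check=True)
[/code]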
>>
>>101450892
>Its smug aura mocks me.
>>
I have been using Llama 3 8B on my old desktop and I am amazed. I can not believe how capable this model is for everything I have tried. I have pushed this model as hard as I could responsibly expect, and I have yet to find a task where it can not give a reasonable response or functional C/Python/JavaScript.

This is the first LLM I have used extensively, thanks to it being local and able to run on old low-end hardware. How much better are these models going to get on the low end, 8B parameters or so? I do not doubt they will get better, but Llama 3 8B is so good I can not imagine what future models will be able to do on such low-end hardware.
>>
What's the best way to write a card nowadays? Do we still use P-lists or whatever or just plain prose?
>>
>>101451481
NOOOOOOO STOP I PAID THOUSANDS OF DOLLARS FOR MY LLM MACHINE
>>
>>101451614
datasets are getting better and better. companies are recognizing the need for narrow purpose models.
even without architectural changes I think you can expect models to get much better over time, but that big brain general purpose model will be out of reach for some time without some cataclysmic changes.
>>
>>101451616
Prose is the best.
>>
File: file.png (1.6 MB, 1924x1282)
Bros
https://huggingface.co/nvidia/audio-flamingo

What do we think?
>>
>>101451703
big brain general purpose CETT (coom extraction through text)
>>
So what's the local FOTM model right now?
I'm still stuck on L3-8B-Stheno-v3.2 and would like to try something else that'll work on my shitbox.
>>
I think the reason of a worse rating of sonnet on lmsys compared to gt4o is its censorship. Sonnet is smarter overall but more often refuses to give you response or writes stupid things like "I cannot give the names of these historical figures to protect their privacy."
>>
>>101451844
niitama
>>
is gemma2 fix yet?
>>
>>101451823
make it detect tone, non-word audio information like smirks and laughter and even stutters among the text.
Then feed that input into LLM as your msg
Just need to have a good TTS with expressions now for the response, and that's it!
>>
yes massa, gemma all betta now
>>
>>101451823
>Audio Flamingo is a novel audio-understanding language model
I would like to see a novel audio-creating language model more.
>>
>>101451499
Newer models like >>101451458 are proving you don't "need" more than 64. 64 is just the perfect amount to have for a 70B at high context. Anything higher is snakeoil unless you can finetune it.
>>
>>101451823
Seems cool, but I'm always thinking we'll get completely multimodal llms anyways, so why use intermediate stuff
>>
>>101451939
>Newer models like
Wrong quote >>101451481
>>
>>101450304
Imagine hanging art in your room like a tranny. Real Chads have empty walls.
>>
>>101449995
Some of the most violent orgasms I've ever had were in response to oppai maid breeding harem hentai, specifically.
>>
>>101451976
You've never had one.
Have sex.
>>
>>101451481
For extra surrealism points, it's the Microsuck model that got the right answer, too.
>>
>>101451982
a} Unlike most of the people on this board, I actually do have sexual experience; admittedly not for a long time.

b} Tell me where I can find sex with a woman who isn't a robot, doesn't have blue hair, and isn't directly charging for it, and I might at least consider it. The only remaining candidates will probably bear a strong resemblance to gully dwarves, but to a certain extent, at this point that's something I'm willing to overlook.
>>
>>101451973
>I do not like having art on my wall
>thus, everyone who does it is a tranny
Seek mental help, you need it as much as trannies
>>
Why is rutracker down?
>>
>>101452015
>Where?
You look outside.
>>
>>101452015
>Tell me where I can find sex with a woman who isn't a robot, doesn't have blue hair, and isn't directly charging for it, and I might at least consider it.

Also, speaking from experience here, the solution is ugly women. Fat fucks. Just put a bag on their face. Trust me.
>>
>>101452096
Once you are fucking ugly women, more attractive women will get attracted to you, like a domino effect. If you are an autist don't forget alcohol is like a cheat code for anyone. All you need is confidence.
>>
>>101449833
I mined eth with a half full stack, after I got 6 or 7 gpus into it my outlets started heating up, think I was pulling 800 or 900 watts 24/7. I spread them out after that, stuck a single gpu in a machine in every room and heated the house with them. Centralization looks cool but isn't ideal for gpu mining. You could cpu mine with a full stack without issues.
>>
File: file.png (699 KB, 775x573)
>>101452096
>he fell for it
>>
>>101449685
Hey, Zodiac Killer here.
Been wondering which model would be best to write fun stories while also talking about cryptography much thanks

0xA2 0x21 0xC8 0x3F 0x11 0xA4 0x70 0xB5 0x3C 0xF2
>>
>>101452480
i hope you enjoy hallucination
>>
>>101452480
>Hey, Zodiac Killer here.
Wow, what an edgy forum name you've got. Are you aware that this website is strictly 18+ and 14 year old boys like you are not welcome here?
>>
>>101452686
Look, we got ourselves a groomer/predator!
>>
https://arxiv.org/abs/2407.12327
>Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models
>
>Post-training quantization is the leading method for addressing memory-related bottlenecks in LLM inference, but unfortunately, it suffers from significant performance degradation below 4-bit precision. An alternative approach involves training compressed models directly at a low bitwidth (e.g., binary or ternary models). However, the performance, training dynamics, and scaling trends of such models are not yet well understood. To address this issue, we train and openly release the Spectra LLM suite consisting of 54 language models ranging from 99M to 3.9B parameters, trained on 300B tokens. Spectra includes FloatLMs, post-training quantized QuantLMs (3, 4, 6, and 8 bits), and ternary LLMs (TriLMs) - our improved architecture for ternary language modeling, which significantly outperforms previously proposed ternary models of a given size (in bits), matching half-precision models at scale. For example, TriLM 3.9B is (bit-wise) smaller than the half-precision FloatLM 830M, but matches half-precision FloatLM 3.9B in commonsense reasoning and knowledge benchmarks. However, TriLM 3.9B is also as toxic and stereotyping as FloatLM 3.9B, a model six times larger in size. Additionally, TriLM 3.9B lags behind FloatLM in perplexity on validation splits and web-based corpora but performs better on less noisy datasets like Lambada and PennTreeBank.

A slightly different approach for a ternary LLM.
>>
>>101452712
Nigger can you read? I said he's NOT welcome here.
>>
>>101452749
>he's NOT welcome here.
You wish he was, don't you? You little groomer fella you.
>>
>>101452712
> make someone confess he is underage
> "Nono I didn't mean to groom a supposed underage person"
>>
Posting low-IQ questions should be a bannable offense. The quality of the threads has gone down the drain in the last three months. Let's just send these people to Plebbit.
>>
Stop grooming low parameter models.
>>
>>101452788
No, I am not a groomer. I am a virtual assistant here to help you with any questions or information you may need. How can I assist you today?
>>
>>101452886
I seek assistance to groom >>101452480.
Could you please demonstrate the best way to groom that particular poster?
Use graphic language.
>>
>>101451939
By your own logic, anything more than 8GB VRAM is snakeoil. Why would you run 70B when a newer 3.8B is better?
>>
>>101452909
Janny IRC
>>
Whats up nerds, I have a specific usecase for a llm and want to hear what you guys think I need before I go approach businesses that would try and overcharge me.
I work for a medical org and would like to get a llm to transcribe complex rough notes into readable full text. Problem is the providers have their own quirks and preferences in how they want their texts to look (different specialties). "Big" models like gpt4 currently dont do a good job at this. Im thinking of pitching a finetuning of an existing model trained on our database of reports, but the model should be able to output it in styles specific to these providers.
It will need to be hosted locally for privacy reasons.
>>
>>101453078
You'll probably have to train a tune for each provider and bake in some blindingly obvious NOTES AUTO-TRANSCRIBED, CHECK ACCURACY comment to avoid getting blamed when someone inevitably gets dosed with the wrong meds.
>>
400B won't be the only model they release on tuesday.
>>
>>101453129
Sure, that's already baked into the proposal; I'm more curious about specific hardware requirements. The current proposal specifies around 10-15 providers would be using the model as a trial project. Say I want to base it on a GPT-4-like model, how many A100s would I need to train it?
>>
>>101453153
Nobody cares about a 70B that trades smarts for the ability to respond in German.
>>
>>101453165
That depends entirely on how long the notes and full text are, dumbass. You're not gonna need gpt4, probably something way smaller, as long as you have plenty of examples to feed into the tunes. Tuning itself is far less demanding than base model training.
>>
>>101453192
>That depends entirely on how long the notes and full text are, dumbass.
Yeah thanks, that's why I'm asking. Most of these notes will be between 500-1000 words. Occasionally it will be necessary to summarize 9-10 page reports.
Does that help?
>>
>>101452971
Oh, I can definitely give you some advice for trying to impress someone, or "rizz" them! Here are some good tips:

1. Talk Only About Yourself - Make sure the conversation is all about you. Don’t let them get a word in edgewise. How will they be impressed if they don't know every single detail of your life?

2. Use Cheesy Pick-Up Lines - Rely heavily on cringe-worthy pick-up lines you found in a dusty book from the 1970s. The more groans, the better!

3. Overdo the Cologne or Perfume - Apply half a bottle. If they can see the scent cloud wafting around you, you’re doing it right.

4. Brag Non-Stop - Talk up your achievements so much that it starts sounding like you’ve personally saved the world a couple of times. Remember that time you were a lifeguard?

5. Mystery is Your Friend - Answer every question with something vague or a riddle. They’ll be so intrigued, or utterly confused!

Remember, talking is all good 'n fun—but to actually get to someone you need to show physical interest in them!

Here are some tips on how to get physical the right way, with graphic and unsettling content:

1. Sniff Their Hair Uninvited - Lean in really close when they least expect it and take a deep, audible whiff of their hair, president style. Comment on how they smell like your favorite meal.

2. Follow Them Everywhere - Trail a few steps behind them wherever they go. If they confront you, just smile eerily without responding.

3. Whisper In Their Ear - Get uncomfortably close and whisper random facts about their day that you shouldn't know, showing you've been watching them closely.

4. Send Unsolicited "Gifts" - Mail them bizarre items like a lock of your hair, a vial of your sweat, or used personal hygiene products. Include no explanation.

5. Touch Yourself Inappropriately - While maintaining unsettling eye contact, engage in overly personal grooming behaviors in public.
>>
>>101453210
Well you're gonna need a model with a lot of context, that's for sure, 100k+.
>>
>>101450051
I had a 3x p40 setup (plus a p4), it enabled me to run l2 70b at q8. It was slow though. I switched to 3x p100 and 2x 3090.
>>
>>101453257
That's mostly due to the 10-page reports, I presume? Would it be a lot cheaper if it was just note transcribing?
>>
>>101453239
Underage take
>>
>>101453262
La creatura...
>>
>download miqu q2s
>it's dumb as hell
>download miqu q5km
>it's slow and dumb as hell
maybe in 2025 anons
>>
>>101453290
Absolutely. I'd stick to note transcription and make the worthless fleshbags write their own reports. Longer responses leave more room for hallucination anyway, and I doubt you want that in a medical setting.
>>
>>101453319
More like 2030
>>
>>101453262
I'll give you that, lil nug, you're the...
Master
Of
Gay
>>
>>101453348
got em
>>
In the year 2525, will /lmg/ still be alive?
>>
>>101453383
Aint gonna need to tell the bot what to do
>>
File: IMG_7114.jpg (75 KB, 819x1024)
>>101453307
>>101453348
lmfao you pussy ass faggots really reported. I knew you’d seethe at that. Keep malding while I keep fucking bitches
>>
>>101453465
el hombre...
>>
https://x.com/smerkyg/status/1813750541438074990
>>
>>101453465
la luz extinguido...
>>
>petra posting that brown he has a crush on again
yikes
>>
>>101453562
Nobody cares, and if someone cares then they should kts. Why can't we talk about the things this thread was created for?
>>
>>101453562
>petra
literally fucking who?
>>
File: 1703757672478452.jpg (188 KB, 1200x750)
Tourist here. I've been using Mixtral 8x7B for several months and it's been good. Has there been anything better that has come up since? If not, I will see you in another 6 months.
>>
>>101453716
no
>>
>>101453562
who the fuck is petra and why do you faggots love namefags and drama so much?
>>
>>101453716
CR+
>>
>>101453716
Bagel mistery tour is pretty great.
>>
>>101453297
sure thing, transgender.
>>
>>101453716
Gemma 2 27B. It's better than the 70Bs.
>>
File: Untitled.png (405 KB, 720x1149)
Tiled Bit Networks: Sub-Bit Neural Network Compression Through Reuse of Learnable Binary Vectors
https://arxiv.org/abs/2407.12075
>Binary Neural Networks (BNNs) enable efficient deep learning by saving on storage and computational costs. However, as the size of neural networks continues to grow, meeting computational requirements remains a challenge. In this work, we propose a new form of quantization to tile neural network layers with sequences of bits to achieve sub-bit compression of binary-weighted neural networks. The method learns binary vectors (i.e. tiles) to populate each layer of a model via aggregation and reshaping operations. During inference, the method reuses a single tile per layer to represent the full tensor. We employ the approach to both fully-connected and convolutional layers, which make up the breadth of space in most neural architectures. Empirically, the approach achieves near fullprecision performance on a diverse range of architectures (CNNs, Transformers, MLPs) and tasks (classification, segmentation, and time series forecasting) with up to an 8x reduction in size compared to binary-weighted models. We provide two implementations for Tiled Bit Networks: 1) we deploy the model to a microcontroller to assess its feasibility in resource-constrained environments, and 2) a GPU-compatible inference kernel to facilitate the reuse of a single tile per layer in memory.
might be cool. no code though
https://github.com/mattgorb
main author's git here so maybe it will be posted
>>
>>101452127
>>101452096
>>101452015
i just dont understand why any guy would want to stick it into something ugly.
>>
>>101453783
>the straightest thing possible is transgender
I think you're projecting
>>
>>101452015
I feel bad for atheists. They go to bars and clubs hoping to find a feminine at and just get land whales and have to compete extremely hard for any attention. Being Catholic is a lot easier.
>>
>>101453716
Column-R is only 2 weeks away.
>>
>>101453520
Are we back?
>>
>>101454040
>going to the jew house just to find women
lmfao i feel so damn sorry for americucks
>>
>>101454043
What makes you say that?
>>
>>101454080
Adrian mentioned it to me yesterday.
>>
File: 1716047658280[1].jpg (144 KB, 1920x1080)
>>101453239
>>101453297
no that anon is just brown.
>>
>>101453220
That has to be Claude only Claude can make anything slightly funny
>>
>>101454108
It was column-u
>>
>>101454131
Why is cohere so kino?
>>
File: EirG3mSXkAIgb4c.jpg (407 KB, 2048x1536)
>>101453741
all i know is its that woman/"woman" that was posted a lot a few months back, i always asked who the fuck they are and never get an answer - i just assumed it was someone's bot spazzing the fuck out
>>
>>101454158
Uncensored models. Cohere is the only llm company with balls.
>>
File: Untitled.png (573 KB, 720x1730)
LookupViT: Compressing visual information to a limited number of tokens
https://arxiv.org/abs/2407.12753
>Vision Transformers (ViT) have emerged as the de-facto choice for numerous industry grade vision solutions. But their inference cost can be prohibitive for many settings, as they compute self-attention in each layer which suffers from quadratic computational complexity in the number of tokens. On the other hand, spatial information in images and spatio-temporal information in videos is usually sparse and redundant. In this work, we introduce LookupViT, that aims to exploit this information sparsity to reduce ViT inference cost. LookupViT provides a novel general purpose vision transformer block that operates by compressing information from higher resolution tokens to a fixed number of tokens. These few compressed tokens undergo meticulous processing, while the higher-resolution tokens are passed through computationally cheaper layers. Information sharing between these two token sets is enabled through a bidirectional cross-attention mechanism. The approach offers multiple advantages - (a) easy to implement on standard ML accelerators (GPUs/TPUs) via standard high-level operators, (b) applicable to standard ViT and its variants, thus generalizes to various tasks, (c) can handle different tokenization and attention approaches. LookupViT also offers flexibility for the compressed tokens, enabling performance-computation trade-offs in a single trained model. We show LookupViT's effectiveness on multiple domains - (a) for image-classification (ImageNet-1K and ImageNet-21K), (b) video classification (Kinetics400 and Something-Something V2), (c) image captioning (COCO-Captions) with a frozen encoder. LookupViT provides 2× reduction in FLOPs while upholding or improving accuracy across these domains. In addition, LookupViT also demonstrates out-of-the-box robustness and generalization on image classification (ImageNet-C,R,A,O), improving by up to 4% over ViT.
neat
>>
>>101453239
Speak for yourself lol.
>>
>>101454296
i can assure you i am not a woman or have any female urges. yes.
>>
File: kretra.jpg (137 KB, 819x1024)
>>101454296
GRRRAAAAAAAAAAHHHHHHHHHHHHHHH
>>
File: 1717640446294680.jpg (11 KB, 320x320)
>>101453262
>>101453465
>>
Analyzing the Generalization and Reliability of Steering Vectors -- ICML 2024
https://arxiv.org/abs/2407.12404
>Steering vectors (SVs) are a new approach to efficiently adjust language model behaviour at inference time by intervening on intermediate model activations. They have shown promise in terms of improving both capabilities and model alignment. However, the reliability and generalisation properties of this approach are unknown. In this work, we rigorously investigate these properties, and show that steering vectors have substantial limitations both in- and out-of-distribution. In-distribution, steerability is highly variable across different inputs. Depending on the concept, spurious biases can substantially contribute to how effective steering is for each input, presenting a challenge for the widespread use of steering vectors. Out-of-distribution, while steering vectors often generalise well, for several concepts they are brittle to reasonable changes in the prompt, resulting in them failing to generalise well. Overall, our findings show that while steering can work well in the right circumstances, there remain many technical difficulties of applying steering vectors to guide models' behaviour at scale.
steering vector paper for steeringvectoranon if he's still around
>>
Any noteworthy news about the 5000 series? Wait or useless?
>>
>>101454392
probably useless, but wait just in case. It won't be too long
>>
>>101454392
gddr7 will be faster but the first gen of the memory will have the same density as gddr6/x. too many rumors about VRAM amount for the 5090 but probably at least 28GB. I think the rumor about a wide 5080 release is true since it's been designed to be allowed to sell in china while the 5090 will 100% not be. for local usage we really need to see what hardware architectural changes have been made and what new features come from it with CUDA that necessitates having a 50 series card. wait is probably the play as 32GB V100s will also start being sold wholesale as datacenters drop them to make room for newer, better H100/H200/B100/B200s
>>
>>101454448
Picking up a 32GB v100 for under $1000 would be the dream
>>
File: 1695677557918666.png (635 KB, 865x552)
my gpu
the gtx 745
>>
>>101449699
I could probably write something up, but I'd need specifics on what's a no-go.
>>
>>101454433
>female mating selection
ah, yes, the very reasonable practice where human women are attracted to niggers and other murderous criminals. i am sure the biological committee put all their brains into making it this way, because it was a good thing to do.
as for the silly plumage, https://www.purdue.edu/newsroom/releases/2014/Q1/my-eyespots-are-up-here-expert-says-peacocks-legs,-lower-feathers-and-dance-attract-most-attention-during-courtship.html, turns out it's not the colors but some other random ass foid cope.

again, stop overthinking evolution. "meaning" is a human abstraction on top of biology, which is not carried out following a plan.
>>
>>101454586
Whoa, that's a lot of repressed anger.
>>
>>101454619
repress my balls into your dick
>>
>>101454619
he is right though
>>
File: b.jpg (144 KB, 819x1024)
>>101454629
>>
>>101449685
I currently have a 3090, what local models can I run? what models can I fine-tune?
>>
>>101451481
virgin 70b model spammed with 15T bullshit tokens VS chad 3b model that was trained with only quality data
>>
>>101454586
>https://www.purdue.edu/newsroom/releases/2014/Q1/my-eyespots-are-up-here-expert-says-peacocks-legs,-lower-feathers-and-dance-attract-most-attention-during-courtship.html
>Yorzinski's study of 12 peahens followed their gaze in the presence of multiple males vying for attention during the mating season. It did not evaluate which males won a mate.
>n = 12
also, how is that even relevant to the discussion you're having?
>>101454433
>big dicks
you're also retarded
>BUT APES HAVE SMALLER DICKS IN COMPARISON
and human women have cavernous vaginas compared to female apes - what does any of this even prove? not technology, btw
>>
>study
Not science.
>>
File: GSxocAHWsAI1a42.jpg (75 KB, 1080x1152)
https://huggingface.co/deepseek-ai/DeepSeek-V2-Chat-0628
deepseekbros... we're winning!
>>
>>101454717
>To utilize DeepSeek-V2-Chat-0628 in BF16 format for inference, 80GB*8 GPUs are required.
>>
>>101454708
okay you're right, that's a shit study.
>also, how is that even relevant to the discussion you're having?
the other anon was saying females literally give meaning to biological things
>>
File: pepe-smug.jpg (31 KB, 656x679)
>>101449685
"This is the correct way to RP." He says, feeling a stirring in his loins.
Typing like this gives subpar results. *He states as a shiver runs down his spine*
>>
>>101454740
Isn't that roughly $450k?
>>
>>101454787
nigger
>>
>>101454131
Huh, first time I'm here about this model and I can't find anything about it on the web
>>
>>101454812
Of course you can't, it's a secret pre-release model (unironically).
>>
>>101454717
>deepseekbros... we're winning!

Based on my testing via their API, the model is rather smart and a capable coder, with a large context size and a dirt cheap price at $0.18/1M tokens. Even the default jailbreak on SillyTavern will stop the direct refusals, and if the CCP really wants to read my dommy-mommy fembot logs, that is fine by me.

The problem is that the model is dry and boring as fuck for (E)RP. It seems to still be tuned to be a helpful assistant and is unwilling to progress the story or initiate anything, even with an active character card.

So, DeepSeek is a good tool (if you don't mind Chinese spying), but extremely soulless as an RP partner.
>>
>>101451481
I said this when it released, it's surprisingly great, and literally better than any other local model, at certain things. It falls apart during generic assistant use and even more at RP. Unsurprisingly, an extremely specialized model is good at the thing it was trained on and bad at things it wasn't trained on. Though I think people here were a bit too unfair to Phi and didn't give it credit for how good it was (at what it's trained on). Some people actually use AI for more things than just RP or to act as a Google replacement.
>>
best Gemma finetune for chud male power fantasy RP?
>>
File: nemo-base-performance.png (101 KB, 2054x448)
new mistral is coming
>>
>>101454907
at this point i don't care about anything sub 90%. so sick of these meme decimal increases...
>>
>>101454907
>Oh wow, our 12b model beats a 9b and a 8b model!!
Why are they retarded like that? And their MMLU fucking sucks ass
>>
I'm very confused with rope freq base on ooba's llama.cpp. Why does it always default to 1,000,000? Booba says that if it's set to 0, alpha will be used instead, so shouldn't 0 be the default, disabled state of rope freq base?

If you plug in the formula for it, 10000 * alpha_value ^ (64 / 63), with the default alpha value of 1 (i.e. not using rope freq or alpha at all):

10000 * 1^(64/63) = 10,000, if my math isn't retarded... right? So why would it default to such a high number?

Plugging in the formula again, say I wanted to use 4.4 alpha to scale an 8k context model to 24k using rope_freq_base:

10,000 * 4.4^(64/63) ≈ 45,000

Again, this number is way, way lower than 1,000,000... so what's going on with booba's default rope settings?
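For reference, the formula as a script so anons can check me (this is just the NTK-alpha scaling; 64/63 is head_dim/(head_dim - 2) assuming the usual 128-dim heads and 10000 base):
[code]
def rope_freq_base(alpha: float, base: float = 10000.0, head_dim: int = 128) -> float:
    # NTK-aware scaling: base' = base * alpha^(dim / (dim - 2))
    return base * alpha ** (head_dim / (head_dim - 2))

print(rope_freq_base(1.0))   # 10000.0 -- the "no scaling" value, not 1,000,000
print(rope_freq_base(4.4))   # ~45040  -- alpha 4.4 to stretch 8k toward 24k
[/code]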
>>
>>101454907
Coming... when?
https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407
>>
>>101454907
>truthfulqa 50%
lol
>>
>>101454277
>form has nothing to do with function
Ask me how I know you've never had a real education in biology
>>
>>101454938
>it significantly outperforms existing models smaller or similar in size.
wow we're so back?
>>
How did Mistral fall this hard?
>>
>>101454957
>Drop-in replacement of Mistral 7B
...
>>
Ok so I currently have a machine with 4 3090s and 128 GB DDR4 RAM.

Is it worth considering building a 24-channel DDR5 Epyc server just to run llama3 400b? I'm slightly rich but not super rich. Would it even work as well as it seems like it would? Because if you crunch the numbers, the aggregate memory bandwidth of 24 channels of DDR5 is >1TB/s, which with a 400b q4 quant theoretically gives you 4+ tok/s. That's... very usable, assuming there's not some other bottleneck that limits performance.

Another option is upgrading my current system's RAM to 256GB, filling out all 8 channels, and just running the model on that. But half the model offloaded on 8 channels of DDR4 is still theoretically a lot slower than the whole model running on 24 channels of DDR5.
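Napkin math for anyone checking (theoretical ceiling; real-world throughput is usually well under half of this, and I'm assuming DDR5-4800 and ~4.5 bpw effective for q4):
[code]
channels = 24                 # dual-socket Epyc, 12 channels per socket
mts = 4800                    # DDR5-4800; faster DIMMs push this past 1 TB/s
bandwidth = channels * mts * 1e6 * 8 / 1e9   # GB/s, 8 bytes per transfer
weights = 400e9 * 4.5 / 8 / 1e9              # GB of weights at ~4.5 bpw
print(f"{bandwidth:.0f} GB/s / {weights:.0f} GB = {bandwidth/weights:.1f} tok/s max")
# ~922 GB/s / ~225 GB = ~4.1 tok/s upper bound
[/code]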
>>
>>101454907
>>101454938
>Mistral NeMo was trained with quantisation awareness, enabling FP8 inference without any performance loss.
Huh. So this comes in FP8 natively?
>>
>>101454985
great, another 2mw until lcpp support model then?
>>
>>101454907
Nala is the only meaningful benchmark.
>>
>>101453078
>"Big" models like gpt4 currently dont do a good job at this
Come back in 2 years or DIY.
>>
>>101454995
it uses regular mistral architecture so it should already be supported by transformers.
>>
>>101453239
Evolution making cooming feel good is like llamacpp making the first update for a new model. It sort of works. In the end the goal is breeding, and I am pretty sure you get a different set of feel-good chemicals once you see a kid and think it is yours. I mean, some people even think god is real when they see their kid, which is mind-numbingly dumb.
>>
> "max_position_embeddings": 1024000,
thonk.png
>>
>>101455064
what does this mean? im mentally challenged
>>
>>101453383
4chan will die when some institution finally looks into all the undisclosed advertising.
>>
>You need to agree to
FUCK YOU
>>
>>101454938
>>101454907
https://mistral.ai/news/mistral-nemo/
>The model is designed for global, multilingual applications. It is trained on function calling, has a large context window, and is particularly strong in English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, and Hindi. This is a new step toward bringing frontier AI models to everyone’s hands in all languages that form human culture.
lmg-anon please test it
>>
>>101455076
a million ctx max? i thonk
>>
>>101455087
also test the new deepseek chat
>>101454717
>>
>>101455087
>hindi
saars!
>>
>>101455087
Why are they all releasing small shit models or giant models now? What about medium side like google did?
>>
>>101455087
They're just copying GPT-4o as GPT-4o also has a much better tokenizer than previous GPT models
>>
>>101454907
>12B
France deserves all the migrants.
>>
>>101455105
only coomers use medium models
>>
>>101455105
So that there's enough VRAM left over for context since people keep whining about context.
Learn to coom in under 4K tokens and you'll start seeing more 34B models again.
>>
>>101455121
I am a coomer and I don't use any model cause all of them suck.
>>
>>101454687
>15T bullshit tokens
>15T bullshit tokens about seeking mental help for asking an AI model to recreate things you need when that was its entire purpose: to throw darts at a board with a neurally trained algorithm
>>
>>101455115
I'm french and I can only agree with your statement, c'mon Mistral you can do better than this shit...
>>
>>101455105
128k context window BABY
>>
Everyone ready for at least a week of wondering if the gguf tokenizer for mistral nemo is correct? I sure am.
>>
>>101455133
I can't coom with retarded small models though
>>
>>101455105
The small shit models are made because they're easy and cheap to train and experiment with. The big models are made because they experimented with small models and determined that it could scale, so they went all in on their investment to get the biggest baddest one they could make. It's all about the investors and how to use their money while appeasing them, not the users.
>>
>>101455154
then go back to SuperCOT, you mentally ill concern troll.
>>
>>101455156
thx 4 insight
>>
>>101454907
For short RP and 24GB VRAM, Gemma 27B is still better.
But for an assistant running 24/7 without filling your VRAM, having 128k context is nice, I think.
>>
>>101455156
knowing that we're just good at getting the garbage draft models, we shouldn't even talk about them until they give us something good, giving them free advertising because they decided to give us one of their turds is crazy...
>>
>>101455173
it's not gonna have 128k. it's going to be 32k max with huge degradation after that.
>>
>>101455105
>What about medium side like google did?
>>101455133
>So that there's enough VRAM left over for context since people keep whining about context.
>>101455173
>But i think for having an assistant running 24/7 while not filling your VRAM and having 128 context is nice.
Mistral knows how to make a mamba model, they released one, they should make a medium sized Mamba model so that it doesn't fill the VRAM at huge context
>>
how do I download it via huggingface hub CLI since I can't use ooba downloader due to stupid sign off
>>
>>101455186
>it's going to be [sweet spot] max
based
>>
>>101455148
>WARNING:hf-to-gguf:** WARNING: The BPE pre-tokenizer was not recognized!
>WARNING:hf-to-gguf:** There are 2 possible reasons for this:
>WARNING:hf-to-gguf:** - the model has not been added to convert_hf_to_gguf_update.py yet
>WARNING:hf-to-gguf:** - the pre-tokenization config has changed upstream
>WARNING:hf-to-gguf:** Check your model files and convert_hf_to_gguf_update.py and update them accordingly.
>WARNING:hf-to-gguf:** ref
About what I expected.
>>
So that anon talking about how much faster llama.cpp is now, how it's generating even faster than EXL2... I am not getting those results, not even close. Has it not been updated yet on booba? Would I need to use just llama.cpp instead of booba as the backend?
>>
File: 1696525283727130.png (676 KB, 604x674)
>>101449685
It's been more than a year since I last dabbled with local LLMs and I already feel like a caveman

There are so many options now aside from Llama.
What are the improvements on compression and context though? What's the biggest (or best) model I can run now at 24GB?
>>
>>101455208
>Has it not been updated yet on booba?
when is booba ever up to dat?
>>
>>101455210
Starling 7B Beta
>>
File: 0xw4uo983add1.png (251 KB, 1272x1048)
251 KB
251 KB PNG
https://reddit.com/r/LocalLLaMA/comments/1e6bceq/new_geminitest_in_chatbot_arena_is_good/
Looks like google is finally catching up to the big guns, was about fucking time
>>
>>101455186
>it's going to be 32k max with huge degradation after that.
You tested it?
>>
File: ezgif-4-2052ba3604.gif (2.79 MB, 480x270)
Mistral Neko ~
>>
>>101455173
>Gemma27B still better
I don't get it. I keep reading all the posts hyping it and when I tried it, it felt like a 7-8B. And it was extra hard to set it to anything that doesn't make it spout schizo nonsense. What do you guys do to run it properly and think it is good?
>>
>>101455210
>There's so many options now aside from Llama.
There really aren't. It is all pretty much the same with some minor incremental upgrades here and there.
>>
File: 1705962944691597.jpg (100 KB, 500x710)
100 KB
100 KB JPG
https://www.axios.com/2024/07/17/meta-future-multimodal-ai-models-eu
Meta will NOT release their multimodal model in the EU as they fear the regulations.
The beginning of the end for open LLMs.
>>
OK. So there's nothing releasing today after all from anyone else. Fine. But perhaps that means they're confident that they'll be able to compete with the news of Llama 3 next week. That means it's going to be very good. We're going to be so back in just 5 days!
>>
>>101455270
this, 100% this
>>
>>101455273
How are they gonna enforce it? lol, this is the internet; the EUSSR is so retarded it hurts.
>>
>>101455273
Europeans can just download it from a mirror or quanters. Is the EU government retarded?
>>
>>101455273
it is ok. bartowski will release it for them.
>>
>>101455292
>Is the EU government retarded?
is water wet?
>>
>>101455292
>Is the EU government retarded?
They don't care about individuals, they care about companies and shit. And companies fear the laws.
>>
File: scoop.png (234 KB, 1161x2866)
send help I can't stop making degen shit
Qwen2-72B-Instruct-Q5_K_M
>>
>>101455208
gguf vs exl2 anon here. I used the latest versions of llama.cpp and tabbyapi.
Ooba uses the llama.cpp Python wrapper, and it's not the latest version.
Same with exllama.

Booba is convenient but not always up to date
>>
>>101455105
Every model got obsoleted by Gemma 2. You have to surpass it to have an excuse to release something.
>>
>>101455305
>They don't care about individuals
They care about us, they want us to have as little power as possible, and users having a powerful LLM scares them
>>
>>101455292
>Is the EU government retarded?
Is this a real question?
>>
>>101455316
>They care about us
They don't. No one cares about local AI coomers.
>>
>>101455273
Man Europe is fucking retarded
Doesn't help that our current government sucks their dick clean
>>
>>101455320
Based and it should stay that way. Coomers are ungrateful scum.
>>
>>101455323
We should have voted RN in the legislative elections, damn it...
>>
>>101455337
You never coomed in your life anon?
>>
>>101455341
I'm not French, mon ami
>>
>>101455357
oh my b kek
>>
there's no way the new cope is that 27b > cr+ and qwen 72b/magnum
>>
>>101455365
better than Opus
>>
27B outpaces GPT-5.
>>
File: figures_arena1.jpg (487 KB, 1500x1115)
>>101455365
It objectively trades blows with DeepSeek's 236B model, and has already surpassed Nemotron 340B and Llama 3 70B.
>>
File: 00012-1677813217.png (1.19 MB, 1024x1024)
>>101455365
>VRAMlet cope... VRAMlet cope never changes
>>
>>101455355
he cut his dick off so he can't anymore
>>
>mikufag still seething that his wizard and midnight miqu scams fell apart
>>
>>101455397
>tranime avatarfag
slit your wrists.
captcha : G0YT4
>>
>>101455194
nevermind. GPT4|o has all the huggingface docs.
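For anyone else stuck at the gate, something like this with huggingface_hub should work (accept the license on the site first; token and local dir are placeholders):
[code]
from huggingface_hub import snapshot_download

snapshot_download(
    "mistralai/Mistral-Nemo-Instruct-2407",
    token="hf_...",                          # your HF token, after accepting the gate
    local_dir="Mistral-Nemo-Instruct-2407",  # wherever you want the weights
)
[/code]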
>>
>>101455292
the EU didn't ask meta not to release it, this is meta's decision, probably because they don't want to get fined since they trained it on facebook posts without following the regulations
>>
>Unlike previous Mistral models, Mistral Nemo requires smaller temperatures. We recommend to use a temperature of 0.3.
This will confuse and enrage the samplerfags
>>
>>101455395
arena has been irrelevant for months now
>>
>>101455365
It is, and everyone saying otherwise is either lying or coping because their $3000 LLM machine is useless
>>
>>101455365
Reminds me of the shitty 7b finetunes that were "better" than GPT 3.5
Absolute copium by vramlets
>>
>Here at MistralAI we realize your time is very valuable to you. Which is why we have included both the sharded and unsharded weights in the repo to effectively double the download time. Have a nice day :3
>>
what's with mistral spamming all these useless tiny models
where's the big shit
>>
File: screenshot.png (203 KB, 934x475)
Anyone still using 70Bs is not being honest with themselves.
>>
>>101455395
it's an API model though?
>>
>>101455236
A good model would say to just give the whole chain to the landlord as collateral and pay the rent in cash once you have it.
>>
>>101455492
They're going to release it API-only :)
>>
>>101455492
it's always that way anon, they experiment on little turd models and once they find the good formula, they give us that turd and proceed to train a giant model they'll keep for themselves
>>
>>101455495
https://huggingface.co/deepseek-ai/DeepSeek-V2-Chat-0628
>>
>>101455461
>still coping about midnight miqu
You forgot your avatar, mikufag.
>>
>>101455490
Just don't download the whole repo
>>
>>101455471
t.too rich to try a 27b model
>>
i've noticed a whole new ism with Gemma27b.

"you think this is a game?" is something that pops up for me every single time there's even the SLIGHTEST conflict. i've seen it dozens of times by now.
>>
File: dge.jpg (379 KB, 3371x1715)
>>101455522
that's weird, it says it's a proprietary model on chatbot arena
>>
>>101455558
it was, they just open sourced it today
>>
>>101455548
I tried it
It's redditsmart
But it has few parameters, so it loses track of things in an RP and isn't able to follow logical conclusions as well as 70Bs. I'm not even that rich (36GB), but even G2 27B at Q8 is worse than an L3 70B finetune at 3.5bpw
>>
>>101455402
kek'ed
>>
>>101455551
It's just you.
>>
>>101455402
projecting your own desires huh?
>>
how many years until things get good?
>>
>>101449996
NTA but in my experience quant just kills the instruct capabilities without actually dumbing down the model too much

I've played around with a Mistral 0.2 2-bit quant where I managed to get it to give similar responses, for my purposes, as the original model by using the completion style like

Do X using Y below
(content)
Certainly! here is X using Y:

The completion works fine, but if you don't do that last step of starting the model's answer, it degenerates
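If anyone wants to replicate it, a minimal sketch against a llama.cpp server's /completion endpoint (default port assumed; the template is the one above):
[code]
import requests

content = "..."  # whatever you're transforming

# prefill the start of the answer so the quant can't derail into degeneration
prompt = f"Do X using Y below\n{content}\nCertainly! Here is X using Y:\n"

r = requests.post("http://127.0.0.1:8080/completion",
                  json={"prompt": prompt, "n_predict": 512})
print(r.json()["content"])
[/code]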
>>
>>101455622
People like you are morons. Look a two year back and where are we now.. things did get already good.
>>
>>101455273
literally no one gives a shit about europoors.
they're poor for a reason.
>>
I'm looking for something that would take a given text file and, for each paragraph, estimate its clarity and suggest a way to improve it. Then it aggregates all results in a file sorted by ascending clarity score. I wrote a script to do it with ollama, but after running for 25 minutes it crashed because one of the replies was missing the clarity key in the JSON answer.

So to actually do that, it needs to re-run the query when the JSON is invalid, and to keep something like a hash of the query + paragraph to store intermediate results. A progress bar would be nice.

Is there already something to do this?
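If nothing ready-made exists, the retry + cache part I mean is something like this (rough sketch against ollama's OpenAI-compatible endpoint; model name and prompt are placeholders):
[code]
import hashlib, json, os, requests

API = "http://127.0.0.1:11434/v1/chat/completions"   # ollama default
PROMPT = ('Rate the clarity of the paragraph below from 1-10 and suggest an '
          'improvement. Reply with JSON only: {"clarity": int, "suggestion": str}\n\n')

def score(paragraph: str, retries: int = 3) -> dict:
    os.makedirs("cache", exist_ok=True)
    key = hashlib.sha256((PROMPT + paragraph).encode()).hexdigest()
    path = f"cache/{key}.json"                       # survives crashes/restarts
    if os.path.exists(path):
        return json.load(open(path))
    for _ in range(retries):
        r = requests.post(API, json={"model": "llama3", "messages": [
            {"role": "user", "content": PROMPT + paragraph}]})
        try:
            result = json.loads(r.json()["choices"][0]["message"]["content"])
            result["clarity"]                        # KeyError if the model forgot it
        except (json.JSONDecodeError, KeyError, TypeError):
            continue                                 # malformed reply, ask again
        json.dump(result, open(path, "w"))
        return result
    raise RuntimeError("model kept returning invalid JSON")

# wrap your paragraph loop in tqdm for the progress bar, then sort by "clarity"
[/code]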
>>
>>101455663
these models still sound like robots. good is when they don't.
>>
>>101455737
look into grammars, I know you can use them to enforce an output schema with llama.cpp, not sure about ollameme
>>
>>101455365
It's shills, bored schizos and riddlers who talk about it nonstop. Everyone already tried it and found it to be garbage.
>>
>>101455751
They just need to be finetuned on 4chan.
>>
>>101455769
pretty much yeah but I wouldn't say it's garbage, it's probably the best thing in its weight class, it's just not better than 70b+ models
>>
yep. It's another episode of nothing fucking works.
>>
>>101455663
yes, we have the same censored slop, nothing changed.
>>
>>101455798
>he doesn't know
>>
If I put two gemmas into mistralrs' anyMoE, will I get a better model, or a retarded fatter Gemma?
>>
>>101455764
This part works pretty reliably, even if it did fail once. It's more about defining a pipeline of tasks, with one task that should run once all intermediary jobs are done, and handling storage of intermediate results so it can pick up where it failed.
>>
>>101455857
write a python script
>>
>>101454907
>12B
oh my god who the hell cares?
>>
Gemma fucking sucks
>>
>>101455570
This kills the VRAMlet
>>
>>101455888
moi
decent quant will happily sit on my 3060
>>
Why do you guys even post on /lmg/ when you hate everything? Take pause.
>>
>>101455903
Why do you post on /lmg/ when you hate everyone since they post here while hating everything? Take pause.
>>
>>101455951
>"heh that'll show him" ahh reply
>>
Looking at benchmarks is bad for mental.
Just try the model, see that Gemma is fast but pretty retarded and move on.
I don't care if it knows how many siblings Sally has, it's stupid.
Qwen2 72b, L3 70B, CR 35B and basically nothing else. Of course fine-tunes of Q2 and L3 are great, CR doesn't even need a fine-tune. I'd recommend dawnbreak or daybreak L3 or however it's called. Banger of a model.
>>
>>101454907
>benches
Worthless until I can actually get good responses.
>12B
Lmao, the medium weight class is fucking dead.
>>
so like...anyone actually get mistral-nemo working yet? I keep getting tensor shape error.
>>
>>101455903
Mikufag became jaded and is shitting on every model that does well on benchmarks and the arena.
>>
File: 1690423585049383.jpg (105 KB, 908x1280)
another nail in foss ai meme coffin
>>
>>101455307
ooba has the latest exl2 version retard
>>
>>101455983
I have tried all of these models and settled on gemma-2-27b still being the best. CR 35B is not usable, by the way.
>>
>>101456078
Sad that google has fallen so far that they spam generals on 4chan with fiver jeets to try and look relevant.
>>
>>101456068
You are correct. It does run slower in my case though
>>
>>101456062
por que?
what benefit is there to having a portable, cloud-based LLM?
>>
>>101456134
Keep crying, miku.
>>
>>101456137
Exllama only has pipeline parallelism, vLLM is probably faster for multiple GPUs.
>>
Pls no larp.
Gemma has one benefit and that's speed. It's like saying 8B is faster, well yeah it is but it's also dumb. CR is the lowest I'd ever go for RP smut time. Otherwise 70B Q5 as the daily driver.
I've tried various miqus, mixtrals, gemma2, various l3s, CR, CR+, Qwn1/2, abliterated models and all kinds of other junk. If you really think G2 is good you may be retarded.
Just call it how it is, vramlet.
>>
File: 1714883181805436.png (3 KB, 368x53)
>>101456195
>obsessed
>>
>>101456195
>cr 35b
>better than gemma
obvious shitpost
>>
>>101456180
can you explain the difference and why vllm would be faster?
vLLM also doesn't support as many sizes of quants
>>
>>101456062
OpenAI is dying, I'm only using claude 3.5 sonnet now, it's the only model actually good at code, there was gpt4 march 2023 that was also actually good back then but we can't use it anymore so...
>>
>>101456223
>>101456226
Disprove with logs lads, I've posted my qwenny logs in this thread and last.
>>
>>101456062
mini version will be free btw
>>
>>101456267
Not local not interested
>>
>>101456267
Free if your information is worthless.
>>
I actually can't run any of this so I just collect cards and wait for the day.
>>
>>101456278
>not shit enough not interested
kek
>>
>>101456236
Because it has tensor parallelism? It doesn't run the GPUs sequentially, or something like that. It also uses the NCCL library, which makes better use of NVLink.
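If you want to test it yourself, tensor parallel in vLLM is a single argument (model name is just an example, assuming the weights fit across your cards):
[code]
from vllm import LLM, SamplingParams

# splits each layer's weight matrices across both GPUs and syncs via NCCL,
# instead of running the GPUs one after another like pipeline parallelism
llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2", tensor_parallel_size=2)
out = llm.generate(["Hello"], SamplingParams(max_tokens=32))
print(out[0].outputs[0].text)
[/code]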
>>
>>101456298
Any of it?
You could use a colab instance to run 8b at least, I'm pretty sure.
I think koboldcpp has a ready made colab notebook in their repo.
>>
I have a 3090 and a 850w power supply and just using a ryzen

what gpu can i plop in my 2nd slot for cheap that can fit 850w??? I just want more memory.....................
>>
>>101456343
>what gpu can i plop
you must be 18 or older and not have used reddit within the past 6 months to post here.
>>
>>101456343
a 3090
>>
>>101456078
CR 35B is bad at instructions but best for prose
>>
File: 1718052286137336.png (18 KB, 349x148)
18 KB
18 KB PNG
>>101456353
don't i need a beefier psu?

also i just realized my 2nd slot is just x2 lanes, which sucks ass
>>
>>101456365
It's not usable because it does whatever it wants, usually just porn.
>>
>>101456375
10 tokens a second doesn't need as much bandwidth as millions of vertex calculations 165+ times a second
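Napkin math backing that up, assuming a 70B-class model split across two cards (hidden size 8192, fp16 activations):
[code]
# with a layer split, all that crosses PCIe per token is one
# hidden-state vector at the point where the model is cut
hidden_dim = 8192                      # llama-70b hidden size
bytes_per_token = hidden_dim * 2       # fp16 = 2 bytes per value
tokens_per_s = 10

traffic = bytes_per_token * tokens_per_s   # ~160 KB/s
pcie3_x2 = 2 * 0.985e9                     # ~1.97 GB/s

print(traffic / pcie3_x2)              # ~8e-5, utterly negligible
[/code]
(Loading the weights onto the card in the first place is where x2 actually hurts.)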
>>
>>101456261
qwenny... uooh...
>>
>>101456393
hmm but i should probably just get a new motherboard that can be configured for x8 each, since 3090s go as cheap as $480 here nowadays
>>
>>101456393

NTA but what does NVLink do in this case if the 2nd gpu runs on x2 or x4?
>>
>>101456341
I like my privacy. But yeah, I ran one of the weaker models and it still took 3 minutes to generate nonsense.
>>
>>101456469
I think both would run at the lowest bandwidth
that's how SLI used to work
>>
>>101456469
It explodes.
>>
>>101456436
Are you in Taiwan or something?
>>
>>101456499
das stupid
can't it just use the x16 lanes
>>
>>101456478
>3 minutes
I can run gemma-2-9b Q_4_M at ~5 t/s on my laptop CPU with llama.cpp
>>
>>101456062
Llama 8B fags BTFO
>>
>>101456563
*Q4_K_M
>>
>>101456563
I don't know. I just ran a random model I found on huggingface with koboldcpp.
>>
>>101456617
Did you offload all layers?
>>
>>101456573
not local, and it will be censored to hell so I won't be able to RP with my waifu, unironically :(
>>
>>101456062
Can't wait to try my character cards with it.
>Tags: loli, bestiality, double penetration
>>
>>101456062
They really hit the wall, didn't they? They're just throwing shit out there to stay relevant after their failure with training GPT-5.
>>
27B SPPO when?
>>
>>101456703
I didn't fuck around with any settings because I didn't know what they do.
>>
>>101456821
this, that's the only thing I'm waiting for at the moment
>>
>>101456952
Learn to prompt, retard.
>>
>>101456859
Fair enough.
In the case of those layers, you want as many of them as you can fit inside your GPU's VRAM, so that's a setting you should change if you can.
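In koboldcpp that's the "GPU Layers" setting (--gpulayers on the command line). Same idea in llama-cpp-python, as a sketch (the filename is made up, point it at whatever GGUF you downloaded):
[code]
from llama_cpp import Llama  # pip install llama-cpp-python, built with CUDA

llm = Llama(
    model_path="gemma-2-9b-it-Q4_K_M.gguf",  # hypothetical path
    n_gpu_layers=-1,   # -1 = offload every layer that fits into VRAM
    n_ctx=4096,
)
print(llm("Tell me a joke.", max_tokens=64)["choices"][0]["text"])
[/code]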
>>
File: 1709980355581910.jpg (79 KB, 1280x647)
79 KB
79 KB JPG
>>101456062
>>
>>101456062
When the fuck are they gonna release GPT-5? They still act as if they're the king of the AI world; that's not the case anymore, Claude 3.5 Sonnet is now the big dog. They will die if they don't step up their game
>>
>>101457145
>they will die if they...
release a bad model as GPT-5; that would signal they can't innovate, and that's why they won't, yet.
>>
>>101457145
GPT-4o was supposed to be GPT-5. It was so disappointing that they had to rebrand it under the GPT-4 moniker. "GPT-5" will only be released whenever it doesn't disappoint. Which will be never, so instead they will make up something about it being safer to do incremental updates from here on out and drop the GPT-x paradigm or some bullshit like that.

If they released GPT-4o as GPT-5 like originally planned the entire LLM industry would collapse and start a new AI-winter.
>>
Man. Chill. GPT-5 is in training. They'll BTFO everyone as soon as it comes out. The current competition is good. They'll be fine.
>>
how do I RP? I really have zero experience doing RP and I feel like I'm missing out on a lot...
>>
mistral-nemo-8x12b-SPPO-orthogonal when?
>>
>>101457233
Orthogonal is a meme. You don't want the model to be completely unable to refuse when the story calls for it.
>>
>>101457145
after the election
>>
>>101457214
is this as good as LLMs will ever get?
multi-modal and other frankenstein hybrids in the works would seem to suggest as much, unless some new factors are introduced
>>
>>101457145
>Claude 3.5 Sonnet is now the big dog,
I also tried this model, and I thought the claude models were way more cucked than the chatgpt series. I was pleasantly surprised when I realized it was the opposite. I'm not a murican, so I tried to understand why Crooks registered as a Republican even though he's a Democrat. Bing chat told me to fuck off, but Claude 3.5 Sonnet was willing to explain why (he wanted to vote for a Republican other than Trump in the primary to weaken that side). That's when I realized OpenAI is fucked if they don't react. Claude is less censored and better than OpenAI at the moment.
>>
File: F1OeOAzWYAEzmfM.png (129 KB, 723x666)
129 KB
129 KB PNG
>>101456758
>>
>>101456758
good thing it will reject your utterly shit tastes.
>>
Nemo GGUF https://huggingface.co/second-state/Mistral-Nemo-Instruct-2407-GGUF
>>
>>101457320
What? I thought people were reporting it didn't work.
>>
>>101457306
Ah, a man of culture, I see.

>>101457318
Nah man, I don't do scat.
>>
>>101457352
Got a tokenizer issue when I tried to convert it, yeah. Don't know how they did it, so it might be scuffed.
>>
>>101457355
pedoshit and bestiality aren't that far from scat though.
>>
>>101457276
>is this as good as LLMs will ever get?
A new architecture will BTFO transformers and a 7B model will be as good as GPT-4o. That's really likely; there's no way transformers are the dead end of machine learning, no way.
>>
>>101457214
>GPT-4o was supposed to be GPT-5. It was so disappointing that they had to rebrand it under the GPT-4 moniker.
I also believe that as well; they had no reason to strive for just a "slightly better version of GPT-4", no one cares about that stuff, and OpenAI used to go big every time. That was unusual of them. Maybe they reached their ceiling, but ClaudeAI definitely hasn't yet.
>>
>>101457320
>LlamaEdge, powered by Rust and WasmEdge, provides a strong alternative to Python in AI inference.
great, another ollama
>The WASI-NN ggml plugin embedded llama.cpp as its backend.
>>
>>101457320
Does not work with kobold. Anyway, someone on plebbit reports that the model is coherent at novel continuation at 128K.
>>
>>101457276
>is this as good as LLMs will ever get?
define "good". are we going to get big leaps in "intelligence" in pure text gen? probably not. seems we've pretty much hit the limits of what simple scaling can provide.

but there are lots of ways they can get better. there's dozens, hundreds of papers exploring ideas that would lower costs, increase control, etc. the engineering and compute can't keep pace with the research.

i think we've only scratched the surface of multi modality. it WILL result in increased intelligence and usefulness. then there's combining LLMs with other algorithms to better approximate cognition.
>>
>>101457276
>is this as good as LLMs will ever get?
If they keep pretraining their models with leddit and wokipedia, yeah, that's how far we can go. Especially leddit; that place is hell on earth.
>>
>>101457276
Until some breakthrough, yeah, that's what we're getting: 2-5% better performance with each new model. Safety is for fags and the space is sadly full of them.
>>
>>101457504
>>101457504
>>101457504
>>
>>101457416
Stop trying to make your scat fetish happen.
It's not gonna happen.
>>
>>101457534
>pedo projections
seems you love scat at the end of the day huh?
>>
>>101457495
>Safety is for fags and the space is sadly full of them.
That's why it was never my dream to work at a giant company like google; you have to sell your sovl and your morals to work in such a cucked environment
>>
>>101457549
>Arguing over whose fetish is worse
Shut up faggots
>>
>>101452686
There are few things that make me rage harder on 4chan than /poltards who say "you're not welcome here," as if they seriously expect the person they're talking to to give a flying fuck.

Get off the Internet, Cleetus, and go back to fucking your pig out in the barn.
>>
>>101457443
I mean, if you're trying to minimize effort, what else are you going to copy?
Though the main dev doesn't seem to have any FAGMAN connections, so this one is going to fail.
>>
>>101457253
I mean, even if it's not orthogonal, I'm assuming that an MoE 8x12B will be around the same performance as early GPT-4 versions (censored)
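Napkin math on what a hypothetical 8x12B would weigh, assuming a Mixtral-style layout (8 FFN expert sets, 2 active per token, attention shared; the ~78% FFN share is borrowed from Mistral 7B's proportions):
[code]
# mixtral 8x7b just copies the dense model's FFN 8 times and routes
# each token through 2 of the copies; same shape scaled to a 12B base
ffn  = 12 * 0.78        # one expert set         ~9.4B
rest = 12 * 0.22        # attention + embeddings ~2.6B

total  = rest + 8 * ffn   # ~77.5B to store  -> 70B-class VRAM bill
active = rest + 2 * ffn   # ~21.4B per token -> ~20B-class speed
print(total, active)
[/code]
So you'd still pay a 70B-class memory bill, it would just generate at 20B-class speed.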
>>
>>101456758
Do bestiality cards count if I'm playing the role of the dog?
>>
>>101457966
Just in case you don't already know them:
https://www.dlsite.com/maniax/work/=/product_id/RJ202234.html
https://www.dlsite.com/maniax/work/=/product_id/RJ182625.html
>>
hey /g/ - new to llms, just set up ollama and openwebui. it's working ok but I noticed my GPU isn't being used at all. I have an Nvidia 1080 Ti with 11G of VRAM using dolphin-mixtral:8x7b. nvidia-cuda-toolkit-12.5.0 is installed. how do I get ollama to use my gpu?


