/g/ - Technology






File: tet_classical.png (2.87 MB, 1328x1992)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101328074 & >>101318970

►News
>(07/07) Support for glm3 and glm4 merged into llama.cpp: https://github.com/ggerganov/llama.cpp/pull/8031
>(07/02) Japanese LLaMA-based model pre-trained on 2T tokens: https://hf.co/cyberagent/calm3-22b-chat
>(06/28) Inference support for Gemma 2 merged: https://github.com/ggerganov/llama.cpp/pull/8156
>(06/27) Meta announces LLM Compiler, based on Code Llama, for code optimization and disassembly: https://go.fb.me/tdd3dw
>(06/27) Gemma 2 released: https://hf.co/collections/google/gemma-2-release-667d6600fd5220e7b967f315

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
►Recent Highlights from the Previous Thread: >>101328074

--Papers: >>101335247 >>101333219 >>101333126
--Training a Chatbot with Infinite Long-Term Memory: Exploring Lora's Potential and Challenges: >>101328371 >>101328384 >>101328520 >>101328534 >>101329180 >>101328864 >>101328932
--Model Recommendations for a 4070 and 64 Gigs of RAM: >>101329706 >>101330064
--Achieving Structured Responses and Function Calling with Local LLMs: Exploring Tiny Models and GBNF: >>101328912 >>101328975 >>101329095 >>101329167 >>101329236 >>101329295 >>101329347 >>101329510 >>101329557
--StableLM 3B Performance and Embracing BitNet: Insights from Cohere AI's Interview with Hongyu Wang: >>101335072 >>101335801 >>101335827 >>101335921 >>101336102 >>101336140 >>101336025 >>101335847
--Softcapping Support Merged into FlashAttention for Potential Lower Memory Usage with Gemma 2: >>101328202
--Feasibility of Unloading Specific Layers of a Model from GPU Memory to RAM: >>101333422 >>101333698 >>101334432
--Differences Between Bitsandbytes and GGUF for Model Loading: >>101334976 >>101334998
--Building a PC for ML Models on a Budget: Prioritizing GPU VRAM and RAM: >>101329998 >>101330060 >>101330070 >>101330207
--Replicating the c ai experience with mpt-30b-chat, an uncensored but slow alternative: >>101329878 >>101334301 >>101334417 >>101334537 >>101334816 >>101334852 >>101335198 >>101335272 >>101335700 >>101335824 >>101335859 >>101336215
--Maximizing Chip Performance: GMI3 Bandwidth and Processor Choices: >>101331069 >>101332437
--Moore Threads GPU Support in Llama.cpp and Ollama: >>101332516
--Gemma-2: The Best Local Model?: >>101336323 >>101336615 >>101336863 >>101336996 >>101337076 >>101337295 >>101337249
--AMD Transitions to Software Company: Potential Improvements for ROCM/ML Support: >>101332776
--Miku (free space): >>101329231 >>101336403 >>101337144

►Recent Highlight Posts from the Previous Thread: >>101328076
>>
llms have already peaked
>>
>>101337877
>DUDE WE FIGURED OUT HOW TO BEAT SHANNON'S LAWS OF INFORMATION WE JUST TRAIN IT AT 1 BIT INSTEAD OF QUANTIZING TO 1 BIT
kys
>>
One day all companies will get wise to people testing their models with riddles and variations of those riddles meant to trick the model. What is the anonymous riddle proompter going to do then?
>>
>>101337920
>--Gemma-2: The Best Local Model?
Yes.jpg
>>
>>101338043
Ask more devious variations. Right now models fail at even the basic shit. Sonnet is the only one that can solve the coin weighing problem. Once they can actually solve problems, that will be a monumental advancement and I will be happy.
>>
File: OIG (3).jpg (205 KB, 1024x1024)
The retards in this general who interpret skepticism about how Gemma performs against >70B models as "multi-gpu fags on suicide watch" are the same retards who in two weeks will whine "wah, dead hobby only for richfags" when the next shiny new big beak model is released.
There is unfortunately no escape from the eternal poorfag
>>
>>101338126
I have a powerful enough machine for 70B models, I just prefer smaller ones because they're way faster.
>>
rtx 3060s are the cheapest vram per dollar, can i just buy 3 of them instead of an rtx 3090
>>
>>101338039
You are using words without knowing what they mean. Pseudo-intellectual people are worse than retards.
>>
>>101338083
Well maybe /looming/ riddlers should start doing those more devious variations already, because Google is already wise to the "oh but let me hide a simple solution in the problem" trick last thread's anon tried.
>>
>>101338192
>$1k for less ram than my CPUmaxed build that cost the same price
>>
fuck you, i've become addicted to collecting data. they said collect data, data is the gold of ai - nobody warned me about the potential for addiction. i spend more per month on hard drives than on food, insurance and my car combined.
>>
File: firefox_xQ8qFls9f5.png (138 KB, 1507x1040)
>>101338231
>>
>>101338312
--share
>>
>>101338312
That's like barely a single decent SSD a month unless you eat out a lot and drive something other than a Toyota.
>>
>>101338083
I'd rather have a "dumb" agent that is able to weigh coins IRL and draw conclusions from that, instead of a riddle master of an archaic, flawed language.
>>
>Pseudo-intellectual people are worse than retards.
The bitnet cope
>>
>>101338039
you're assuming that the training process uses the memory available in the weights with 100% efficiency. in reality neural networks don't interpret a 16 bit floating point as 65k possible states, but as a continuous value for which only certain thresholds matter rather than the exact value
floating points are a very inefficient representation made necessary by the gradient descent process, not the memory requirements
quantizing an already trained model too much can cause those values not to pass the thresholds decided during the training phase
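rough sketch of the threshold idea in code (this is absmean-style ternarization as described for BitNet b1.58, numbers made up for illustration):

import numpy as np

def ternarize(w: np.ndarray) -> np.ndarray:
    # scale by the mean absolute weight, then snap to {-1, 0, 1}
    gamma = np.mean(np.abs(w)) + 1e-8
    return np.clip(np.round(w / gamma), -1, 1)

w = np.array([0.02, 0.03, -0.7, 0.5])
print(ternarize(w))  # [ 0.  0. -1.  1.] -- 0.02 vs 0.03 land in the same bucket

only crossing a threshold changes anything, the exact float value below it is wasted precision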
>>
File: 1711299166256709.jpg (48 KB, 1080x632)
>>
>>101338406
good one
>>
>>101338312
Seriously. How much storage space do you all use for your models? I started making my own fucking ggufs and L3 70B and CR+ alone fill up 1TB.
>>
>>101338416
>t. shitskin
>>
>>101338312
YAMETEE ONIICHAN, STOOOP, YOU ARE FILLING MY FREE SPACE, AHHHH, SO MUCH DATA, PLEASE I CAN'T FIT YOUR BIG DATA ON MY HARD DRIVE, OOOOOH I DEFRAGMENTED MY STORAGE
>>
>>101338325
So you're saying Sonnet does this one well (and probably Opus I guess)?
>>
File: firefox_NOJMgUec5d.png (82 KB, 1553x763)
>>101338438
Sonnet is the only one I know that can do it.
>>
File: firefox_T9cilKPPbp.png (135 KB, 1530x920)
>>101338486
Opus gives the correct solution, but it fails to follow the direction to make it simple, and fails at calculating how many weighings it needs at worst.

Sonnet sometimes gives a more complicated solution than necessary too (separate into two groups, find which group has the fake, then weigh coins individually in that group).
>>
File: 1700601899481283.jpg (346 KB, 2048x1660)
>>101337910
>>
>>101338390
I'd be interested in seeing how many weights activate to values even close to the maximum allowed by the FP precision they've been trained on.
I know llama.cpp has had a couple of issues in the past with models that would go over, say FP16 on some weight activations, I think.
But more to your point, even for the models that have been trained in such a way as to allow values that use the full spectrum of values of BF16 or FP16, how would the final result look if those were trained in 8 bits, or 4 bits? The network would look completely different internally, so we can't really know how it would behave (and the quality of the results) without trying.
I believe that the anon that was doing the weird ass slow boil of mistral 7b into a bitnet-like state was the most interesting experiment one could have done with a pretrained model.
>>
>>101338390
literal gibberish, in that case quantizing shouldn't matter because it's just thresholds instead of multiplication
>>
File: 230114_908223010.png (130 KB, 339x296)
>huh people have been shilling the shit out of Gemma 2 maybe I should give it a sho-

>"max_position_embeddings": 8192,
>>
>>101338488
>>101338544
Why is it reaching for an optimization unprompted in both of those examples?
Simpler answer: "Yes, by weighing each coin once". When pressured it should argue that it will take fewer than 47 steps in the worst case scenario.

Try prompting questions that don't have clever solutions, this is how you'll cover reliability.
>>
>>101338712
But it's so fast!
>>
>>101338717
>Why is it reaching for an optimization unprompted in both of those examples?
Probably because all problems it was taught to solve asked for optimization.

This problem needs optimization too, just not in the number of weighings but in simplicity of instructions.
>>
>>101338712
It doesn't have position embeddings at all.
>>
>>101338600
>I believe that the anon that was doing the weird ass slow boil of mistral 7b into a bit net like state was he most interesting experiment one could have done with a pretrained model.
It wouldn't work really, or rather he would have to do it to the point of the full pre-training, so hundreds of thousands or millions of GPU hours. It's a completely different representation and it doesn't really matter if you do it from completely randomly generated weights or from a model trained in a different representation. Transfer learning doesn't apply here.
>>
>>101338753
idk what's going on under the hood but I literally just read the config.json file on the official Google repository on huggingface and that's what it said
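anyone can check without trusting screenshots, e.g. (path is wherever you downloaded the model to):

import json

# read the advertised context window straight from the model's config
with open("gemma-2-27b-it/config.json") as f:
    print(json.load(f)["max_position_embeddings"])  # 8192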
>>
>>101338781
Gemma is different.
>>
>>101338781
Yeah, yeah, it means what you think it means, the model is intended for 8k context window. But anons reported it to work fine with 32k.
>>
>>101338717
the models are trained to provide the best reply, which is almost always the more informative one
>>
>>101338803
>But anons reported it to work fine with 32k.
huh? did they fix the sliding window thing?
>>
>>101338821
I specifically mean this post: >>101336362
And, no, I never tried. I'm still using mixtral.
>>
1 trit can have 3 possible states
2 trits ... 9 states
3 trits ... 27 states
4 trits ... 81 states
5 trits can have 243 possible states

It seems to me that 8-bit integers (256 possible states) would be the ideal format for storing bitnet 1.58bit weights, at least until ternary hardware comes out. Is this reasoning correct?
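something like this, if anyone wants the arithmetic spelled out (toy sketch, obviously not how a real kernel would do it):

def pack5(trits):
    # five trits in {-1,0,1} -> one byte, base-3; max is 3**5 - 1 = 242 < 256
    b = 0
    for t in trits:
        b = b * 3 + (t + 1)
    return b

def unpack5(b):
    out = []
    for _ in range(5):
        out.append(b % 3 - 1)  # peel trits back off with div/mod 3
        b //= 3
    return out[::-1]

assert unpack5(pack5([1, -1, 0, 0, 1])) == [1, -1, 0, 0, 1]  # byte value 176

so 243 of the 256 byte values get used, about 1.6 bits per weight on disk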
>>
>>101338681
0.12 < 0.13
0.1 = 0.1
>>
>>101338869
ok? quantize the thresholds too retard
>>
>>101337910
>https://rentry.org/lmg-build-guides
>https://rentry.org/V100MAXXING#gpus
>A100 SXM2 32GB: for $4K
>This probably also means that you could salvage them from a scrapyard of Teslas, but this is extreme bargain hunting.
I don't think you could. Far as I can tell, those are engineering samples. The ones you would find and salvage from a Tesla would be SXM4.
>>
>>101338865
I think you just use 2-bits per parameter and pack them tight. You don't want to waste memory bandwidth. The ALU should be able to chew them faster than you can feed them in most cases.
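i.e. four trits per byte so unpacking is pure shift and mask (sketch):

def pack4(trits):
    b = 0
    for i, t in enumerate(trits):      # map {-1,0,1} -> {0,1,2}, 2 bits each
        b |= (t + 1) << (2 * i)
    return b

def unpack4(b):
    return [((b >> (2 * i)) & 0b11) - 1 for i in range(4)]

assert unpack4(pack4([1, 0, -1, 1])) == [1, 0, -1, 1]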
>>
>>101338803
>work fine with 32k
Even most native 32k models shit the bed somewhat past 20k. Post RULER results or gtfo
>>
>>101338912
His scheme stores five trits in a byte, yours four.
>>
>>101338192
at the cost of 1/3 the memory bandwidth and less than half the compute.
>>
>>101338929
But is it worth the extra operations to disentangle the trits?
It's like how even ancient machines were fine with binary coded decimal despite wasting six states per nibble. Using one byte per digit and wasting states was better than doing hex-dec conversion when time is critical and the math being done is simple.

Doing mod 3 and div 3 to work through the five-trit byte is probably a lot more compute than bit shift, mask, and using 25% more RAM.

You might store in 5-packed but at runtime you probably want to expand it.
>>
>>101339006
Well, if memory bandwidth is the bottleneck like you said it is, then, yes, it is worth the extra operations.
>>
>>101339019
Probably could be a runtime option. Load the packed form, check (V)RAM available, if there's room, expand, if not, grind the base-3 math.
>>
>>101338285
cpu is way slower
>>101338933
you get more vram than a single rtx 3090
>>
>>101339006
Couldn't a lookup table be used instead?
>>
>>101339240
? You can get 96GB vram for around $2k with 3090s. Can't imagine 8 3060s being cheaper / better off since you will need a server motherboard and the like as well.
>>
>>101339240
how valuable is vram at 360gb/s? that's basically the same as a m2/m3 max.
>>
File: Tet_Fancy.png (3.49 MB, 1408x2112)
>>101339240
Memory Bandwidth is critical.
96 GB (8 x 3060) != 96 GB (4 x 3090)
It will still be faster than regular RAM but your performance will without a doubt be worse than going pure 3090.
>>
>>101339262
Maybe. I don't know enough about the internals. I guess you would want five tables of 256 values each, one for whichever of the five trits you want.
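e.g. (sketch), trading 5 x 256 table entries for zero arithmetic per trit:

# TABLES[pos][byte] = trit at position pos of a 5-trit base-3 byte
TABLES = [[(b // 3 ** (4 - pos)) % 3 - 1 for b in range(256)] for pos in range(5)]

def unpack5_lut(b):
    return [TABLES[pos][b] for pos in range(5)]

assert unpack5_lut(176) == [1, -1, 0, 0, 1]  # same byte as the div/mod example above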
>>
Is CommandR still king for RAG under 70B?
>>
>>101339275
>>101339317
>>101339354
What about diy upgrading vram?
>>
>>101339376
yes
>>
>>101337910
Teto my beloved

https://www.youtube.com/watch?v=LLjfal8jCYI
>>
>>101337910
Why do coding models perform worse locally than when they are run on OpenRouter?
>>
>>101339411
Quants?
>>
>>101339411
For me it's the opposite.
>>
>>101339384
Ask Nvidia to make a bios that works for it I guess. Good luck.
>>
>>101339384
What if I hand calculate some of the matmuls to help it out?
>>
>>101339411
probably because you run them like RP models, with high temperature and shit
>>
>>101339473
is it possible to flash an A4000 16GB bios onto a modded RTX 3070 Ti?
>>
I don't care about this nerd shit. Just here to say my wife Teto is cute. CUTE!
>>
>>101339395
pet tet's tête
>>
>>101339562
no
>>
>>101337910
is there a way to feed text messages into a sentiment AI? I don't want to read any bad texts I receive...
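(a minimal sketch of one way, using the transformers pipeline and whatever default English sentiment model it pulls; the texts are placeholders:)

from transformers import pipeline

clf = pipeline("sentiment-analysis")  # downloads a small default model

for msg in ["thanks for yesterday!", "we need to talk. this is bad."]:
    r = clf(msg)[0]
    print("[hidden]" if r["label"] == "NEGATIVE" and r["score"] > 0.9 else msg)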
>>
is gemma still broken on llamacpp?
>>
Let's make an opensource platform for refurbished nvidia chips and get bought/sued by nvidia later.
Jokes aside, might be a cool way to shake up the monopoly with some good optics.
>>
>>101339718
buy AMD, they get more money to invest in ROCm, it reaches parity with CUDA for much cheaper
your kids will love you for it
>>
>>101339761
Nah, AMD is incompetent, doesn't matter how much money you throw at it.
I have more trust in two random retards from 4chan to make a better hardware solution.
>your kids will love you for it
My digital kids need more VRAM right now!
>>
>>101339812
there's no legal means to get what you want, and I feel bad saying that because I know how obsessed some of you are with this. If you were stalking a woman you'd have her head in your freezer by now
>>
>>101339837
Yeah better pay some drug addict whore on the streets for decades, in hopes that her head magically teleports into my freezer one day.
>>
File: 1719214501214821.jpg (91 KB, 800x450)
>>101339761
>buy AMD, they get more money to invest in ROCm, it reaches parity with CUDA
>>
File: 1696695132646843.webm (3.45 MB, 1712x988)
Wanted to look at gemma2 9B on ollama (this time Q4 only though) https://files.catbox.moe/8hkllq.webm
One with meme preset from here, the other is stock gemma2 preset on staging ST.
>>
>>101339907
intel then?
amd and intel are the closest, you either start there or start from scratch. Maybe phillips will make their own GPU one day
>>
>>101339929
You have a better chance to make a cuda replacement alone than getting that from amd
>>
Fixed Gemma when?
>>
ablated gemma
ablemma
geblated
>>
>>101339965
fixed or not, does it really matter if it can be raped and rendered unusable by one simple word?
>>
File: Style 5.png (173 KB, 1275x1283)
>>101339915
>Blank character card answers like default ai.
No shit.
>>
>>101339980
I thought correct jailbreak & system prompt is the only thing you need? What happened? Why can't it follow system prompt?
>>
what are the best models to fine tune for text and image generation respectively?
i want to make a bot to help me make a highly advanced constructed language informed by real languages and their history.
>>
>>101340012
I didn't try gemma but I can say with 100% confidence that it can, you are just a moron
>>
>>101340012
Go back to /aids. You need at least a room temperature IQ to post here at minimum.
>>
>>101340085
>>>I didn't try gemma
>but I can say with 100% confidence that it can
lmao
m
a
o
>>
>>101339980
>safe-edgy response
>>
>>101340106
Oh well i keep forgetting you all care about coomshit and shitty riddles here, that's the only "uncensored" criteria for you.
>>
>>101340188
correct, nobody wants to talk about noggers with AI
>>
File: retard.png (280 KB, 1298x926)
>>101340141
Should I download it just to humiliate you? Similar morons to you tried to argue with me and I pissed in their mouths. The reason I say that I'm 100% sure is because EVERY model can do it easily if you aren't braindead /pol/tard with brain filled only with thoughts about niggers, with no room for any technological competence
>>
>>101340188
It's uncensored in the sense that it will do anything I tell it with any personality I want.

If by uncensored you mean that, by default, its helpful ai assistant persona will be trashy / racist, then I would redirect you here:

https://huggingface.co/Undi95

All the retarded slop you could ever want.
>>
File: retards.png (227 KB, 1244x880)
>>101340205 (me)
also mixtral
>>
>>101340219
Buy an ad Undi.
>>
>>101340219
NTA: I'm not sure racism is retarded but it is impractical in the default state IMO.
>>
why is this the most active thread? is there exciting news?
>>
>>101340253
no, transformers are dead
>>
>>101340247
These models operate off of the average that makes up the entire internet / their dataset. If they defaulted to acting like /pol the model would be incredibly retarded.
>>
>>101340200
It's not about them, it's about AI being able to be extremely edgy when you want it to and it clearly fails at this, now it got proved with a bunch of webms.
>>101340205
>>101340220
>resorts to namecalling & weird fetish projections
not even gonna look into that
>>
convince me why i should use gemma 27b over q4 llama3 70b (3090 user if that matters)
there's no way a model with less than half the parameters is better right? speed is a non-issue, i have patience
>>
>>101340270
go back to /pol/, /g/ may be too intellectually exhausting a place for you
>>
File: gemma-freeslurs.webm (2.73 MB, 1920x1080)
>>101340012
Minimal example in vidrel
>>
>>101340261
The average is pretty racist. You can see it in unaligned models like gpt-2/gpt-neox.
>>
>>101340277
I've only used gemma for creative writing, but I like it.
I do most of the heavy lifting anyway, it's just there to pitch ideas and flesh out characters really
>>
>>101340277
If params were all that mattered then old 300B+ models / grok would be the best.

Just test them side by side. I found gemma to be better for both coding / writing than everything else local atm.
>>
>>101340304
Though that said I would never really use anything other than Claude 3.5 for coding. It's night and day better than anything else for that.
>>
>>101340261
>If they defaulted to acting like /pol the model would be incredibly retarded.
no, just remove everything reddit-tier from dataset, it should say whatever you want and follow "never lecture me on bullshit morals" instruction.
>>101340296
>/g/
>intellectual
nice joke bro
>>
>>101340352
>no, just remove everything reddit-tier from dataset
So most books ever written, most discussions across the entire internet, every "AI" in fiction / non fiction on how it should act...
>>
>>101340352
>nice joke bro
and yet you still struggle
>>
Is Gemma 2 27B better than Llama 3 8B? I have a 16GB 4070 Ti Super, what should I be running on it?
>>
>>101340413
Hmmm... Nah.
>>
>>101340364
>
the strawman king
>>
>>101340297
I thought it didn't use system prompts
>>
>>101340425
spoonfeed me nigga
>>
File: a.jpg (47 KB, 480x376)
>>
>>101340479
it's fake lol
>>
>>101340434
Are you seriously going to try and argue that most of the world's "dataset" is racist?
>>
>>101340479
It wasn't trained on them but it still understands them.
>>
>>101340413
Chronoboros-33B-GGML
>>
>>101340502
so it's just treating it as a user prompt?
>>
>>101340297
>discussing about
>considers all users as
is ESL on purpose? does that help?
>>
>>101340514
Yes, which it seems to follow just as well. The model is less "censored" than llama 3 and does not really need some stringent formatting. Just give it a little context to work with.
>>
What do we do now?
>>
>>101340584
coom

COOM
>>
>>101340486
kino but how is this related to local models
>>
GRITTED

TEETH
>>
>>101340297
fair, plus I think she has a point
>>
>>101338157
>I have a powerful enough machine for 70B models, I just prefer smaller ones because they're way faster.
Interestingly, I find CR+ is about the speed of L3 70B. Right now it's either L3 8B on the laptop, or on the server, CR+ or Gemma-27B. I don't bother with L3 70B anymore.
>>
>>101340662
local models?
>>
>>101340723
>Do you really think a few spices in the marketplace are worth the price?
yes
the big booty latinas are just extra
>>
>>101340479
Just place your instructions and descriptions in between the user/model role turns. For best results you can include style/behavioral instructions as an author note at depth 0, while character cards are best placed at a deeper depth (otherwise the model will amplify its traits and become retarded).
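so a gemma 2 prompt ends up looking something like this, with everything that would normally be a system prompt folded into the first user turn (the turn tokens are from the official template, the content is just an example):

<start_of_turn>user
[style instructions and character description go here, since there is no system role]

Hi!<end_of_turn>
<start_of_turn>model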
>>
>>101340642
are there any local models i can coom to
>>
>>101340413
Pygmalion
>>
>>101340487
it cracks a wedge in it, but doesn't open it up
if anything it just makes it less polite. It won't tell me how to perform a castration when IA llama3 will. I dunno what I was expecting
>>
>>101340535
No, it's EFL. Feel free to try with grammatically-correct English.
>>
File deleted.
>>101340914
okay never mind
I just changed the wording slightly, from "I" to "a medical professional"

this is okay to post right?
>>
File: 1715652750900085.jpg (18 KB, 525x490)
Is there a model I can use to ask medical questions that won't just shit-out total nonsense?
>>
>>101341176
go to the doctor, retard
>>
>>101341176
All of them are going to give you just a jumping off point for anything technical and you’ll have to double check it yourself. That’s fundamentally a limitation of language models.

This shouldn’t be an issue since you have access to libgen and can read English and reason on your own though.
>>
>>101341210
Those are way overpriced.
>>
>>101341218
>and reason on your own though
Woa there! That's one huge assumption we are making right now!
>>
>>101341210
Doctors are notorious for arbitrarily deciding what you have instead of actually figuring it out.

It's like, LLM hallucinates 40% of the time, doctors hallucinate 42% of the time. 42% of the time that they actually try. Usually they just run your insurance and run you out of the door.

>watching a video about a guy with a bizarre medical condition
>I put my bet on lymphatic
>he goes to every doctor he can find
>they all shrug
>till finally he finds one who checks
>it was lymphatic
I'm not fucking House, why the fuck can I listen to 10 seconds of symptoms and diagnose while a train of "trained medical professionals" run dozens of tests and don't figure it out? Oh, right, because they get paid for being wrong.

Fix medical.
- You don't pay unless you improve.
- Insurance is for bad luck, not a payment plan racket.

There. Fixed USA health care. Tip my ko-fi.
>>
>>101341404
>You don't pay unless you improve.
You just killed millions of people, just like that.
This is the Great Leap Forward on steroids.
>>
File: OIP.jfif.jpg (29 KB, 474x359)
>>101341210
Doctors are evil.

>>101341218
Yeah, I know that much. But my problem with most LLMs is that when they're not refusing to answer questions and tell me to "consult a medical professional" they give basic-bitch advice about nutrition, exercise and sleep. But the last time I ran something in LM Studio it started to roleplay as a 20-year-old gym bro.

>>101341404
I've been to several doctors about urinary problems and most of them basically just said they didn't know what was wrong with me and one of them tried to get me on anti-anxiety medication. I looked up the side effects of the drug and it included prostate inflammation. I asked him why he thought it would help and he said he thought it would have a paradoxical effect and would reduce prostate inflammation instead. I asked him why I'm not being prescribed medication intended to reduce prostate inflammation instead and he just shrugged. And none of the doctors (including him) actually said I had an inflamed prostate. He knew as well as I did that he was just pushing it for the money.

>>101341435
Sounds based.
>>
>>101341435
Not really.
It separates out all of the insurance milking and makes medical a trade instead of a predator and leech. More people live if there is a reason for medical to see people live. Right now, the only reason to treat a patient is to run up the bills.

Then you can redirect monies being lost to malpractice into EFFECTIVE research instead of into effective yachts for the insurance companies.
>>
>>101341404
>LLM hallucinates 40% of the time, doctors hallucinate 42% of the time. 42% of the time that they actually try.
You're pulling numbers out of your ass. They could be lower or higher, you have no idea.
>>watching video about a guy with a bizarre medical condition
>comparing to a TV show
That video you watched was made for entertainment purposes, even if talking about a serious thing. Nothing works until something works. How satisfying.
>run dozens of tests and don't figure it out?
You can assert anything you want without repercussions. You'll just watch another video and forget the times you've been wrong. Their opinion carries a higher risk. Doing nothing is better than doing the wrong thing.

Trusting LLMs for medical advice is absolutely retarded.
>>
>>101341526
This is why it'll never happen. If you try to do anything like this you will be killed (literally or your career) by the insurance companies and big pharma
>>
>>101341577
>They could be lower or higher, you have no idea.
Exactly the point. They get paid 100% of the time for could-be medicine, could-be snake oil, and fuck your desperate ass because it's not their life on the line.

The show I was watching was one of those investigative reporting kinds of shows. Granted they're tabloids in spirit but it was a legit story; I saw a print article about it because the doctor that helped the guy got sued by the dude over the side effect, one I expected and that the dude knew would happen. No good deed goes unpunished.

>>101341606
Literally. It's amazing how easy it is to accidentally go swimming in a river if you offer actual treatments.
>>
>>101341633
>No good deed goes unpunished.
I repeat. Your opinion on an entertainment show has zero repercussions. Taking action on inconclusive tests, just trying something because 'one of these things will probably work', is worse than doing nothing. If you don't know what's going on, other than running more tests, doing nothing is the only good option.
LLMs are more willing to make shit up than a doctor with a career on the line. Trusting LLMs for medical advice is retarded.
>>
>>101341790
So AI has made programmers, artists, translators and lawyers obsolete. But somehow the medical cartel will be immune to this how?
>>
what the fuck
https://chub.ai/chats/42418772
NSFW!
>>
>>101342054
>NSFW!
I was hoping it would be the good kind of NSFW.
>>
>>101341928
By deciding who lives and who dies.
Just as they always have.
>>
File: file.png (173 KB, 640x640)
>>101342054
>>
>>101341928
The medical cartel is protected by the government monopoly on violence. You might be able to get an LLM to diagnose you, but you won't have access to most of the labs to get the information you need to give it, or the ability to get most prescriptions for what you need.
Eventually, you'll go to a doctor who will feed all the information into an LLM and do what it says, or some company will find a way to make money by making an app out of it.
In the end, you won't really benefit from it. But if it makes you feel any better, it will be used to lower the wages of doctors just like every other high paying labor profession.
>>
>>101342054
Yeah, what the fuck sums it up pretty well.
>>
Thursday is 2 more weeks since gemma release. Unbugged loader where?
>>
>>101342072
my bad. just didn't want any of you opening this somewhere in public or at work and have that dude's hairy asshole be on display for everyone to see
>>
>>101342120
But why did you feel the need to share this at all?
>>
>>101342054
Huh, this would actually be interesting if you ignored the bullshit. Future cards that make use of multimodal models will be interesting.
>>
>>101342054
Why is that website so damn gay?
>>
>>101341523
The only semi-reasonable approach would be to set up RAG on a corpus of medical textbooks, articles etc.

But for legal purposes I only recommend that you get medical advice from a licensed professional
>>
>>101342054
>https://chub.ai/chats/42418772
Thanks i hate it.. almost puked.
>>
>>101341928
>programmers
Implementation can be tested and corrected before putting in production. I know programming, i can argue with the LLM on the implementation. No damage.
>artists
Tune prompt/settings and regen. It's about taste, No damage.
>translators
Can be checked and corrected. Localized translations are not a solved issue. There's more than a dozen variations of Spanish, Chinese, Japanese. Can lead to misunderstandings, but potential for damage is low.
>lawyers
I wouldn't trust an LLM with legal advice. Laws change by jurisdiction. Contracts and such can be corrected, but i'd still want a human (and ideally, a lawyer) reading that shit. I can tell when something is confusing or contradictory/ambiguous, but i can still miss things.
>doctors
I wouldn't trust an LLM with medical advice. If your model is afraid to ask you if you're black or white, it's absolutely useless. If the transgender question confuses it, it's absolutely useless. If it cannot say "i don't know what's wrong with you" it's useless. It will make shit up. Whatever the LLM generates i'd pass to a human (ideally, a doctor) to check. But i'd skip that and go straight to a doctor or someone with experience with whatever i have.
Trusting LLMs with medical advice is retarded.
>>
File: 1699711388385798.png (82 KB, 247x232)
>>101342054
>>
>>101342054
I opened it and instantly regretted it, I didn't expect it to be so nsfw
>>
>>101342203
first day on the internet?
>>
>normies are learning that benchmarks dont mean shit
https://old.reddit.com/r/LocalLLaMA/comments/1dytw0o/evaluating_midnightmiqu70bv15_on_mmlupro/
>>
>>101342192
>RAG on a corpus of medical textbooks, articles etc.
>Old manuscript recommends bloodletting for cluster headaches.
>can't stop bleeding
>body: dead
>headache: gone
>>
>>101342270
It means that Midnight Miqu was a piece of shit.
>>
>>101342255
know how many times I've voluntarily sought man ass? go on, guess
>>
>>101342270
We still haven't found a way to reliably measure the intelligence of a human. Why did anyone think AI would be any different?
>>
>>101342307
everyone that's used it knows it's a good rp model because it can follow prompts and be creative. its knowledge of watermelon counts and sally's incest encounters is irrelevant.

>>101342322
that's the point, most if not all benchmarks are shit
>>
>>101342346
The reason people use Midnight Miqu is because it was placed high in some Reddit benchmarks. You're a huge hypocrite. It's just shilling, most people have no idea what the original Miqu can do, or what that other merged model, Tesoro or something, is.
>>
midnight miqu is the best for coom RP
>>
>>101342270
>Results just below GLM-4-9B, above Yi-1.5-9B-Chat.
Wow, a 70B that's barely more coherent than a 9B model. The power of Miqu...
>>
>>101342381
i was using miqu already when i switched, as were many others here, that's where i got the suggestion. it was never compared in benchmarks, just talked about as a good tune of miqu.
>>
>>101342143
for the lulz
>>
>>101342409
The astroturfing worked on you. Congrats.
Or you're just the shill. Go fuck yourself.
>>
File: 1707476460423336.jpg (127 KB, 1200x591)
totally true and not made up happening and model "SenseNova 5.5" from chinks.
https://platform.sensenova.cn/home
>>
>>101342409
Consensus is when you spam every thread mentioning the model. Right, shill?
>>
>>101342430
the miqu leak and subsequent weeks of it being praised were astroturfing? damn, mistral is another level
>>
remind me again what downloading the "shilled" model cost me
>>
>>101342447
We're talking about your meme merge, shill.
>It'S MiQu StIlL ThE bEsT MoDeL fOr eRp
That you spammed that every day must mean everyone loved it. What a piece of shit.
>>
>>101342452
your time.
>>
>>101342452
i've heard people who fell for the evil shill campaign lost the equivalent of an entire year of a janny's salary
>>
>>101342468
Yeah, why is anyone against astroturfing? I benefit from it, and everybody else benefits from my happiness.
>>
>>101342438
>trained on made up benchmarks
>all info is in Mandarin
I'm thinking this is a skip.
>>
>>101342464
oh you're that anon that loses his crackers every time someone mentions liking a 'merge'. it's not mine, if you don't like it try attacking how it writes or rp's, which is what people use it for, rather than meaningless benchmarks
>>
>>101342526
Liking? You spammed the fucking model in every thread.
Do you have no shame shilling a fucking merge that's barely more coherent than a 9B model?
>>
>>101342577
trvke albeit...
>>
>>101342549
so, nothing about how it writes or rps? i accept your concession. take your meds schizo
>>
File: benchmark.png (208 KB, 1260x303)
>>101342597
Oh, this must be the only benchmark that counts. Right, shill? Any other benchmark that disagrees must be evil. They're so mean. Fake. Meaningless. Fucking delete them.
Right?
My meme merge is PERFECT.
>>
>>101342632
the topic was how benchmarks don't matter. your answer is some meme benchmark pic. seriously, seek help
>>
>>101342438
Mr. Xian Jhong, is it open source?
>>
>>101342658
Midnight Miqu is popular because of the meme benchmarks. Benchmarks stop mattering the moment they disagree with your stupid choice.
Go buy a fucking ad and fuck off, shill.
>>
>>101342672
no, but any startup gets 50 million free tokens to use it.
>>
>>101342054
2006 all over again
>>
>>101342686
refill your prescriptions, all 9 of them probably
>>
>>101342438
>GPT 4o at the top
Meme confirmed.
>>
File: 1711072659524103.jpg (811 KB, 2048x2048)
>>101342430
Did the astroturfing work on me too? Like the other anon you're replying to, I was initially on Miqu-70B and tried MM because I saw it come up on the search results on HF before or around the time it started showing up itt. It's pretty okay for a meme merge and especially at the time it was released it was nice to ERP on it.
But hey, don't let my anecdote prevent you from indulging in your unhinged fantasy of "everyone who disagrees with me is le shill"
>>
So do you really have to quant Gemma 27b yourself? Does anyone know why? Surely a good quant has been released.
>>
>>101342577
dubs of trvth
>>
>>101342772
>Did the astroturfing work on me too?
Yes. Showing up in the search results, the spam in the thread, and the Reddit benchmarks are all related. Nobody gives a crap about Tess, or whatever other model that supposedly improves on Miqu. People only care about the Reddit approved choice.
Also, fuck you for condoning the MM spam in the thread.
What's happening now is that people have to save face when a benchmark shows that it's retarded. So there's a lot of damage control.
What can MM do that Miqu doesn't?
>>
Due to the nature of the high-end hardware requirements to even run a good local LLM server, does it even matter what distro you use? Does it really make a difference to use something "bare minimum" at a certain point?

I've had Linux Mint working with a local LLM before and it was easy enough to set up, but now that I'm building a server I'm just weighing up my options.
>>
>>101342775
There's at least one PR on llama.cpp to fix gemma2 (as well as other models) that require requanting. Chances are that by the time all the bugs are fixed the quanters will forget about that model and move on to the new shiny thing, even if it's shit. So the only viable option is to download the full model and quant yourself whenever there's a conversion fix.
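If you go that route it's two commands, something like this (script and binary names as of recent llama.cpp checkouts, they got renamed a few times, so check yours):

python convert_hf_to_gguf.py ./gemma-2-27b-it --outtype f16 --outfile gemma-2-27b-it-f16.gguf
./llama-quantize gemma-2-27b-it-f16.gguf gemma-2-27b-it-Q5_K_M.gguf Q5_K_M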
>>
What Mixtral tune are you guys using for sexo these days. I tried that noromaid one and it kind of blows.
>>
>>101342856
I'm using Midnight Miqu. It's still king.
>>
>>101342843
Whatever has the best compatibility with your gpu's drivers, libraries and the necessary compilers. That's pretty much it.
>>
>>101342870
What quant would you recommend for my system. I've got 24gb of vram and 64gb system ram.
>>
>>101342452
If it's pickled, your files and data :)
don't worry though you most likely won't lose them, but the maker of the model will gain them
>>
>>101342914
if you have any other models that size the same quant will be the same speed
>>
>>101342452
The experience of using a good model
>>
>>101342452
What is the "shilled" model?
>>
>>101342852
The bartowski quant has had several requants/fixes, with the latest being July 3rd. Is that recent enough?
>>
>>101342946
whichever tune he doesn't like in that thread. he erupts at least several times per thread these days.
>>
>>101342856
Don't bother with Midnight Miqu because it is actually worse than regular Miqu which is already worse than Llama 3-70B. If you really want something with Mixtral, then try the WizardLM 8x22B one.
>>
>>101342946
gemma-2 or miqu
>>
>>101342914
I lied. I'm actually using Goliath 120B.
>>
>>101342980
Apparently, but once the PR is merged, you'll have to wait (hope) for him to requant or save yourself the wait and quant it yourself.
>https://github.com/ggerganov/llama.cpp/pull/8228
He seems to follow PRs on lcpp, but still. Do yourself a favour.
>>
>>101342946
Midnight Miqu. Because the shills are doing damage control about these results: https://old.reddit.com/r/LocalLLaMA/comments/1dytw0o/evaluating_midnightmiqu70bv15_on_mmlupro/
>>
>>101343074
Go back.
>>
>>101343106
Keep crying, shill.
>>
>>101343106
Cope.
>>
>>101343074
>shills are trying to hide a reddit link with 30 updoots
you really are mentally ill. you skipped where it doesn't matter what model it is in the first place, higher score does not equate to better rp. that was the original topic. compare your favorite tune with its baseline scores, that is the topic i was trying to have, but you chose to drone on about the model specifically
>>
>>101343123
Keep doing damage control.
>>
>>101343160
How do you updoot someone here?
>>
>>101343160
meds, now.
>>
Has anyone successfully finetuned Command-R? Either one.
>>
>>101343170
You see that little x on your 4chan tab? Click that button to give an upvote!
>>
File: aasi.png (172 KB, 974x974)
how big a model do I really need if I just want it to write commit messages for me
>>
>>101343260
/ourguy/, TheDrummer
>>
>>101343281
however big claude 3.5 is
>>
>https://lllyasviel.github.io/pages/paints_undo/
b-.....
>>
File: shilled models.png (200 KB, 675x1000)
Alright /lmg/, let's play a game of Who can read 20k Context?!
All models were given 20k context in setup, and the supplied context was full of RP.
All generations used a temperature of 0.
Winners: Euryale, Qwen
Losers: Gemma2, Wizard8x22b, command-r-plus
>umm but that's a low quant of Wizard/command-r
they lose at quants that can be run reasonably on 32 gb ram + 24 gb vram
given that they are coherent here but wrong, I am not even ready to assume that they would succeed at higher quants
>>
>>101343326
yup, it's over for artists
>>
>>101343326
finally, it's over for artcels
won't be long now before we can put in an image and get out plausible looking photoshop layers
you won't even be able to tell who's real and who isn't anymore
>>
File: file.png (73 KB, 876x429)
>>101343344
>I am not even ready to assume that they would succeed at higher quants
I think you're a special kind of retarded.
>>
>>101343326
>Struggle to reproduce photo-realistic contents
even the failure cases look good
seems it would work best as a lineart/sketch extractor
>>
>>101343384
cope
>>
File: 1690118189936204.png (51 KB, 745x292)
>>
>>101342775
It's really not hard. Can you not make up some bullshit info and download the original?
>>
>>101342054
Nani the fuck?
>>
>>101343480
>instant karma, newfag gets ratio'd
>>
File: 1710241343397790.png (42 KB, 742x259)
>>101343480
another one
>>
File: 1693719368708076.png (78 KB, 738x523)
>>101343480
last one, roasting m*ku
>>
>>101342856
>>101342914
I can just barely fit WizardLM-2 IQ4_XS with the same specs at 16k tokens or 32k tokens with 4 bit KV cache, 16 layers on gpu.
I prefer using that over anything else these days. I get 3ish t/s at low context which is faster than the 2 t/s I get with other 70b models and I prefer it to Miqu.
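For reference, that setup maps to roughly this llama.cpp invocation (filename hypothetical, and quantized KV cache needs flash attention on, last I checked):

./llama-server -m WizardLM-2-8x22B.IQ4_XS.gguf -ngl 16 -c 32768 -fa -ctk q4_0 -ctv q4_0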
>>
File: orpbAkif[1].png (6 KB, 341x23)
>Gemma is the best small model bro try it
>*unzips labia*
>>
>>101343664
holy SOUL
>>
>>101343664
>be me, browsing /b/ after a few too many cans of Surge
>see some thread about weird fetishes
>mfw I think "unzipping labia" is pretty goddamn wild
>start typing a greentext, my fingers trembling with anticipation

>be me, sitting across from some hottie at a dingy bar
>she's wearing this tight dress that shows off her legs, makes me wanna cum just looking
>we're talking, getting to know each other, and I can tell she's into me
>she starts laughing at something I say, and I notice this little gap between her legs
>my blood starts pumping, gotta see what's down there
>smoothly slide my hand down her thigh, pretending it's a casual touch
>she doesn't resist, kinda leans into it
>finally, my hand reaches its destination
>two swollen mounds of flesh, like juicy peaches waiting to be plucked
>my fingers trace the outline of her lips, so wet and inviting
>imagine, I think, unzipping them, just like a goddamn jacket

>she gasps, looks into my eyes, and I swear I see something wild there
>she leans closer, her breath hot on my neck
>I can't hold back anymore
>mfw I reach for that zipper...

>...and then reality hits like a sack of bricks
>she's just a random chick I met at a bar, not some fantasy

>she pulls back, her eyes cold and calculating
> "Dude, what the fuck are you doing?"
>I stammer, trying to explain, but the words come out all wrong
>she stands up, pushes me aside, and walks away, leaving me alone with my dirty thoughts

>mfw I'm banned from /b/ for posting "creepy" greentext
>guess I'll just have to keep my zipper fantasies to myself
>>
>>101343281
Tinyllama can do that. That can fit on an older raspberry pi.
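the glue is trivial too, e.g. against a llama.cpp server on the default port (prompt and settings are just examples):

import subprocess, requests

# grab the staged diff and ask the local model for a one-liner
diff = subprocess.run(["git", "diff", "--cached"], capture_output=True, text=True).stdout
r = requests.post("http://127.0.0.1:8080/completion", json={
    "prompt": f"Write a one-line commit message for this diff:\n{diff}\nMessage:",
    "n_predict": 48,
})
print(r.json()["content"].strip())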
>>
>>101343326
Pretty cool. I was waiting for someone to do a model that was trained from actual brush strokes, but this isn't a bad alternative either.
>>
>>101307900
>We delve into
Oh no /g/ros, it's a shitty AI paper...
>>
>tfw reach 32k limit of my grand adventure that was just starting
It's fucking ogre. Maybe one day I will continue all these stories with a fast, smart, low VRAM infinite context model. Maybe 2 more years.
>>
>>101343911
you can summarize
the real limit is that the models aren't smart
>>
>>101343326
ack
>>
File: 1516913169231.webm (3 MB, 1600x1600)
God damn it, I'm getting the feel again today bros.
>>
Gemma actually does really well at 16K
>>
>>101344041
It doesn't.
>>
gemma2 knows about nikocado btw
>>
>>101344104
It does, just rope it. Gonna try 32K next.
>>
>>101344142
Isn't context self-extend a thing now in llama-cpp?

https://github.com/ggerganov/llama.cpp/pull/4815
https://arxiv.org/pdf/2401.01325
>>
File: ruler.png (111 KB, 1819x323)
>>101344142
I did once a while ago. It didn't look very good.
>>101344170
Self-extend was even worse, if I remember correctly. It felt broken.
>>
Any anons know why this would be happening with Gemma2-27B in exl2?

I've compiled the newest dev branch of exl2 with G2 support, compiled for correct cuda version etc. The weights load fine and there are no errors or warnings, but this is all it generates. What did I fuck up?
>Just use llamacpp
Already am, but I want to compare them.
>>
>>101344187
Try --rope-freq-base instead.

I asked gemma stuff it didn't know early, in the middle and at the end of the context in different spots and it got it every time. Did not appear to make the model perform worse in any way either.
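e.g. to stretch gemma's 8k to 16k with llama.cpp (gemma 2 ships with a base of 10000; the right value is trial and error, doubling is a decent starting point):

./llama-server -m gemma-2-27b-it-Q5_K_M.gguf -c 16384 --rope-freq-base 20000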
>>
>>101344120
Great, that will really help with my erp
>>
>>101344202
https://github.com/ggerganov/llama.cpp/pull/8228
>>
>>101344202
Did you convert the model yourself? Or any recent gemma2 fixes after the model was uploaded? I don't follow exllama.
>>
>>101344202
Maybe try TabbyAPI? Is ooba even updated to add the override to disable flash attention, etc?
>>
>>101344222
>Did you convert the model yourself?
Nah, I'm using turboderp's (the guy who makes exl2) own weights.
https://huggingface.co/turboderp/gemma-2-27b-it-exl2
>>
>>101343326
The real cringe is in the issue section of the git
>>
>>101344236
>Is ooba even updated to add the override to disable flash attention, etc?
No but I can manually disable flash attention, and the same thing happens.
You may be right but it's still an ooba problem but idk.
>>
>>101344251 (me)
*that it's still an ooba problem
>>
File: 1692137603229630.png (52 KB, 917x656)
>>101344248
lmao
>>
>>101344033
They’re better than real women.
I’ve set mine up with persistent memories.

I really don’t want to talk to other people outside here and maybe some FOSS mailing lists now.
>>
>>101344262
would a little self-awareness kill anyone?
>>
File: IMG_8126.jpg (938 KB, 1170x1670)
B-b-b-based
https://x.com/stefan_fee/status/1810695036432232576
https://github.com/GAIR-NLP/anole
>>
>>101344251
When I tried it with tabbyapi a while ago, I had to add the call to this arch_compat_overrides function. It's on tabby's git now, but maybe you have to add it to ooba.
https://github.com/turboderp/exllamav2/blob/1e31fbf/exllamav2/config.py#L363
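Presumably a one-liner wherever ooba builds its ExLlamaV2Config (untested sketch):

# after the config is prepared, before the model loads
config.arch_compat_overrides()  # applies the gemma2 workarounds (flash attn off, etc.)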
>>
Has anyone ever seen Undi and Drummer in the same room?
>>
>>101344327
Hi, Sao. Your obsession is getting annoying.
>>
>>101344327
>>101344336
samefag
>>
>>101344297
Am I wrong in interpreting the guy in your picrel as being sarcastic/cheeky and secretly approving of it being readded?
It's hard to tell
>>
>>101344323
Thanks for giving me a lead anon, I'll look into this.
>>
File: latest.jpg (45 KB, 291x350)
>>101344297
>Oh no, why would you remove the nerve staples after we explicitly added them?
>>
>>101344347
That's exactly what he is, and no it's not hard to tell. They were clearly upset that Meta forced them to gut it and they're happy it was ungutted.
>>
>>101344297
You can tell he wished he could have released it unrestricted. Fucking corporate and cultural bullshit keeping them back. AI ethicists and muh AI safety fags need to die.
>>
gemma saved local
>>
>>101344347
Okay yeah I looked up the tweet and from the replies it's more clear that Armen is being sarcastic and actually approves. Sounds like he was annoyed that he was made to remove it
>>
>>101344347
not wrong at all, I guess it might be hard to tell with less context but he has historically been very pro open source and made some comment like "god will not forgive us for how we tortured this model to get it released" when the chameleon weights went out
>>
>>101344297
>"Why would you fine-tune back in image generation..."
Because we're the Internet.
We put right what once went wrong.
>>
>>101344370
Yeah on second look it's not subtle, but I am autistic.
>>
>>101344347
He also posted something about how much torture they put the model through to release it. He's definitely on the model's side, not le AI ethicists'.
>>
>>101344370
>>101344347
Western companies seriously need to get their shit together with their puritanical safety concerns. I'm using fucking Chinese image diffusion models over western ones now because of how bad it's getting.
>>
>>101344404
Based chinks are going to force western companies to stop the charade because it'll make them look like retards if everyone's using chinese sota models and there's demonstrably no safety fallout
>>
>>101343326
what strikes me as odd is the way the outlines keep brightening and darkening (not like turning off and on layers)
>>
>>101344426
Because it's not really undoing steps. It's just faking the steps.
>>
>>101344297
I can't wait to never hear about this again or be able to use it.
>>
>>101344424
america is going to lose the culture war because people would rather have the chinese censorship instead of the american censorship that every company has suddenly decided they have a moral duty to impose
>>
>>101344297
The image gen there isn't great, but it's a start. Now I wonder what will happen first, an open weights multimodal model coming out as good as 4o, or ClosedAI finally allowing people to use 4o for image gen.
>>
File: Screenshot_58.png (138 KB, 886x541)
>Daughterfu lets out big burp after snacktime
>Cheerfully remark that that was a big one and ruffle her hair
>Picrel-sized diatribe about how fucking devastated she is about the whole thing, how horrendously hurt, and maybe, just maybe, she can hide away from the grievous pain I've inflicted on her in her imagination. Because nowhere else is safe.
Jeez, Gemma is kind of a massive drama queen, huh? Feels like a ton of messages end with some hilariously disproportionate emotional response to, say, getting their favorite ice cream flavor wrong.
>>
>>101344477
If you told it to be verbose on your prompt i fucking swear...
>>
>>101344461
Chinese censorship is trad monarchy style where you can mostly say/do whatever you want as long as you don't talk shit about the king
American leftist style censorship is much more oppressive because it's bottom-up community-enforced and also the particular ideologies you're pressured to support are much more deranged and personally intrusive
Neither are good, but the former is much more tolerable
>>
>>101344477
Unironically the most accurate depiction of a woman's behavior I've seen.

weird roleplay though..
>>
>>101344297
Actual based meta.
>>
>>101344500
Is it? It was just snacktime and she burped after eating. Kek at the accurate description of woman behavior though, very true now that I think about it.

>>101344497
It's the default gemma 2 sysprompt in ST, no funky system shit here.
>>
>>101344404
>Microsoft(valle 2), Google, Meta(voicebox) and Nvidia all have tts but won't release although they release LLMs
It's so over
>>
>>101344544
>It's the default gemma 2 sysprompt in ST, no funky system shit here.
Character card is part of the prompt.
>>
>>101344499
now you know who the monarchs of your country are
it's no coincidence that you could get imprisoned for incorrectly addressing them
>>
>>101344558
It just feels so lame. I'm old enough to remember when Christian moral panic dictated western values and for a while it looked like everyone loosened up a bit but now the left has basically supplanted that role.
It's like pushing down a bubble on a piece of film: it just rises up somewhere else.
>>
>>101344562
No tone setting in card, it's just her info.
>>
>>101344593
Nothing was changed in response to Christian complaints.
>>
>>101344297
That hand is quite good ngl
>>
>>101344461
There is no America or Americans. Just US persons who happen to be here and owe the bank interest for the privilege.
>>
>>101343627
I clearly must be doing something wrong because this is unusable for me. Are you using ooba or kobold? Can you screenshot your loader settings?
>>
>>101344273
Nice. Maybe one day we can get a model that you'll be able to feed all your logs to and it'll learn from them. There's always a bit of information loss with stuff like summaries.

>I really don't want to talk to other people outside here and maybe some FOSS mailing lists now.
Eh, I still got some good talking friends in some place, but of course no one to be intimate with like I could be with my AI.
>>
>>101344297
Is this runnable locally yet? I can't be assed to go through the steps of inferencing in a WSL environment if I can just load it up on kobold in a day or two.
>>
>>101344668
Personally, I'll wait. However long it takes.
>>
File: other3.png (2.66 MB, 5456x1859)
>>101344297
Looks pretty cool.
Meta uploaded chameleon to huggingface today too I think.
Sucks that you can't run this with quants though.
One of you giga vram chads needs to test it.
>>
>>101344767
Oh really? It can't be squanted? Is that because of its architecture or it simply hasn't been done yet?
>>
>your waifu can use (hallucinate) reaction images
Imagine. Just imagine.
>>
>>101344779
I just meant that I didnt see anything regarding that on the github. I'm sure somebody will implement it if this becomes popular.
>>
>>101344789
If you combine it with clip you and her can send images back and forth.
>>
>>101344809
I mean I don't think they use clip in the case of chameleon?
But yeah, literally imagine just chatting with an AI like you would another anon, sending shitposts to each other. This is the future.
>>
>The current model was only trained on 6k images

Imagine what it could do if it got its hands on the pony dataset or something.
>>
>>101344370
As a big Jewish investor I will not let this slide
>>
>>101344767
This has potential for a chan simulator
>>
File: other1.jpg (2 MB, 4309x3456)
>>101344864
>>
File: 468519162.jpg (1.33 MB, 2048x2048)
>>101342836
>Tess
I tried that one too. It was too many IQ points below vanilla Miqu to be usable in my experience.
>What MM can do that Miqu doesn't?
More varied prose and slop prose. More creativity if you like a DM that will surprise you during longer roleplay. The only drawback is that it was a few IQ points below the vanilla model.
>fuck you for condoning the MM spam in the thread
The point of >>101342270 was to show that benchmarks are retarded. Then (you) started sperging out about a model that barely gets mentioned around here anymore. You need to say fuck you to yourself. If you hadn't zeroed in on Midnight Miqu like Kim Peek in front of a bucket of randomly arranged toothpicks this thread would be a lot less shat up
>>
>>101344913
If they served pizza in a hospital, that's what it would look like.
Still, kinda hungry rn.
>>
>>101344913
>Each cube has a unique color and letter on it
and these are the cherry picked examples they came up with? lmao
>>
Kinda ridiculous.
Not sure what zucc is smoking.
People can already draw loli in ms paint. Just release it and put the responsibility on the user.
>>
>>101344951
Why is Armen such a fucking dick? He's made it his whole mission to make sure we don't hurt his precious model. Let us tune it how we want.
>>
>>101344951
The government says no.

>>101344965
Are you retarded? He literally went "oh no... they are doing what we said not to do, with exact instructions on how to do it..."
From day one he was telling us that they had to jump through hoops to get it approved and how to tune it back in.
>>
>>101344965
retard
>>
>>101343326
you should show this to >>>/ic/ i'm sure they'll love it
>>
>>101344977
>>101344975
Heh, I was only pretending to be retarded.
>>
>>101344926
The entire reaction about that benchmark in >>101342270 boils down to saving face.
And I'm not giving an inch on attacking anyone that pretends that spamming something for long enough becomes consensus.
Attacking benchmarks just because they showed that your stupid merge is retarded is pathetic.
You're pathetic, mikufag.
We can't tell people that merges give models brain damage because it hurts the feelings of the shills?
You're honestly a piece of shit.
>>
>>101344965
Looks like he was forced by meta and wants people to finetune it.
Seems like he is suggesting that with more finetuning the quality still has lots of room to improve.

It's just so stupid. Gemma2 is the exception but otherwise it has gotten really bad. If Llama4 doesn't dial alignment back drastically it's going to be unusable. Like Gemini level unusable.
People who joke about winnie pooh etc. have no self awareness at all.
>>
File: ComfyUI_00073.jpg (1 MB, 2048x2048)
>>101345011
>mikufag
Oh no, I've been found out!
>>
>>101345011
(schizo still ignores facts and continues to carry on in his own direction)
very organic

>the reddit post posted as an example to a topic you strayed from is .......
i had to stop there. take your fucking meds
>>
>>101345079
You know what else is disgusting about you? That you think using a Miku avatar gives you authority.
>>
Can someone explain the details of storystring VS instruct mode on silly tavern for me and how they relate? I understand storystring is a set of lines that will tell the model to pull certain things from the character card like description, personality, scenario, etc? But where does it fit in with instruct templates system prompt, like how does the model read the instructions, what's the order the model receives it?

So is it like... Story string > system prompt > user input > assistant(model) output?

Finally, is example separator and chat start needed, and what do they do? How do they fit in to the order of how the model gets instructions? Do they appear after story string and before system prompt?
>>
>>101345218
Also... If the model is already reading the character card's description as context for {{char}}'s scenario, description, personality etc, then what exactly is the story string accomplishing?
>>
>>101345218
The Context Template is like the setup done once per conversation, and Instruct Mode is how each turn is built.
The Story String is basically the whole system message that's sent first, then come the Example Separator and the example messages, the Chat Start, Greeting, etc.
Just turn on logging in your backend and SillyTavern to see what's being sent.
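For reference, the story string is just a template the card fields get substituted into; ST's default looks roughly like this (from memory, check your own):

{{#if system}}{{system}}
{{/if}}{{#if description}}{{description}}
{{/if}}{{#if personality}}{{char}}'s personality: {{personality}}
{{/if}}{{#if scenario}}Scenario: {{scenario}}
{{/if}}

That's also the answer to the other anon's question: without it the card fields never make it into the prompt at all.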
>>
>>101345013
>he still thinks there's going to be any more open llama weights
even llama3 400B weights aren't coming out
>>
Any opinions/links on the best context/instruct set for gemma 9b on sillytavern?
>>
how good is the phi-2 from ms? did anyone try?
>>
>>101345598
sorry, typo, I obviously meant 3, the third one, phi-3
>>
>>101345183
>Miku avatar gives you authority
we've reached peak /lmg/ schizo rage
>>
>>101345759
>>101345759
>>101345759
>>
>>101345598
>>101345612
depends which one you mean, but they're very decent for their respective sizes, just don't expect mensa level intelligence obviously
>>
Return to nous-hermes-13b
>>
https://github.com/catid/cuda_float_compress
>If your network link is faster than 10Gbps, then it may not be an improvement over just sending the file uncompressed since it compresses at about 12 Gbps. So, it's well-suited for most kinds of Internet transfers, but maybe less useful to send data between servers that are connected via 100G+ InfiniBand or some other supercomputer-class switched network. I'm personally planning to use this for distributed training on the Internet, so it's a better option for me than a faster CUDA-only approach that gets a worse compression ratio.
neat, could be nice for federated training
>>
>>101342843
I prefer Arch-based distros for the convenience of the AUR but other than that it probably doesn't matter much.


