/g/ - Technology


Thread archived.
You cannot reply anymore.




File: ML64VFu6QUyk06sM.mp4 (127 KB, 720x720)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103499479 & >>103478232

►News
>(12/13) DeepSeek-VL2/-Small/-Tiny release. MoE vision models with 4.5B/2.8B/1.0B active parameters https://hf.co/deepseek-ai/deepseek-vl2
>(12/13) Cohere releases Command-R7B https://cohere.com/blog/command-r7b
>(12/12) QRWKV6-32B-Instruct preview releases, a linear model converted from Qwen2.5-32B-Instruct https://hf.co/recursal/QRWKV6-32B-Instruct-Preview-v0.1
>(12/12) LoRA training for HunyuanVideo https://github.com/tdrussell/diffusion-pipe
>(12/10) HF decides not to limit public storage: https://hf.co/posts/julien-c/388331843225875

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/hsiehjackson/RULER
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
>>103510291
The Teto is deceased
>>
>>103510291
sex with migu. migusex.
>>
>>103510291
>DeepSeek-VL2/-Small
It is 30GB unquantized, so it will easily fit in 24GB at Q4 or Q5. I don't think it is going to be great, but it is the only recent model that has a chance of being good?
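For anyone doing the same math on other releases, the rule of thumb behind "30GB unquantized fits in 24GB at Q4/Q5" can be sketched like this (a rough estimate with approximate bits-per-weight figures, not an official calculator):

```python
# Back-of-the-envelope GGUF size check (hypothetical helper, not an official
# calculator): an fp16/bf16 checkpoint stores 2 bytes per weight, so a quant
# at roughly b bits per weight scales the file size by about b / 16.
# Ignores per-block scales, metadata, and any vision tower's exact layout.

def quant_size_gb(fp16_size_gb: float, bits_per_weight: float) -> float:
    """Approximate quantized GGUF size from the fp16 checkpoint size."""
    return fp16_size_gb * bits_per_weight / 16

# Approximate effective bits-per-weight for common k-quants
for name, bpw in [("Q4_K_M", 4.8), ("Q5_K_M", 5.7), ("Q8_0", 8.5)]:
    print(f"{name}: ~{quant_size_gb(30, bpw):.1f} GB")
```

Q4_K_M lands around 9GB here, which leaves plenty of headroom for context in 24GB.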
>>
>>103510291
QwQ for RP!
>>
File: 1734135858047022.jpg (359 KB, 828x795)
is this real
https://x.com/ocregister/status/1867680150684303404
>>
>>103510291
What is she doing to Teto? Why is she doing this?
>>
>>103510363
cheek massage
sore face from too much miku cunnilingus
>>
►Recent Highlights from the Previous Thread: >>103499479

--Paper: Open-Source Acceleration of Stable-Diffusion.cpp:
>103501491 >103501819
--Papers:
>103501445
--Optimizing Llama 3.3 for roleplay and instruction-following:
>103502518 >103502563 >103502748 >103502711 >103502778 >103503009 >103503035 >103506170 >103503848 >103503052 >103504448 >103504560 >103505170 >103505589 >103505669 >103508553
--Phi4 model details and ERP performance discussion:
>103506412 >103506740 >103506782 >103506820
--Phi-4 AI model's safety features and evaluation approach:
>103501411
--Phi-4 model discussion and benchmarking:
>103499989 >103499993 >103499999 >103502898 >103502940 >103503107 >103503295 >103503351 >103503392 >103503359 >103503391 >103503462 >103503513 >103502977 >103503014 >103507777 >103500022
--Discussion on Phi model's performance and limitations:
>103500871 >103501099 >103501192 >103501212 >103501267 >103501351 >103501377 >103501418 >103501574
--Discussion of local RP models and their characteristics:
>103507479 >103507529 >103507578 >103507732 >103507960 >103508277 >103508325 >103508434 >103508761 >103509070 >103509273
--Troubleshooting crashes with SillyTavern and Llama.cpp:
>103508014 >103508056 >103508112 >103508150 >103508206 >103508234 >103508473
--Phi-4 GGUF model discussion and testing:
>103504654 >103504711 >103504742 >103504995 >103505047
--Phi 4 leak and model discussion:
>103505091 >103505106 >103505171 >103505196 >103505226 >103505246
--Cohere's Command R7B model architecture and capabilities:
>103505680 >103505689 >103505719 >103505931
--Anon seeks local model for JavaScript coding without semicolons:
>103501056 >103501080 >103501121 >103501185 >103501199 >103501222
--Phi-4 model mirror and training information:
>103501465 >103501485 >103501511
--Miku (free space):
>103504734 >103506720 >103506821

►Recent Highlight Posts from the Previous Thread: >>103500681 >>103509292

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
Calling ALL lolibros, this is a very important announcement!
Download EVA-LLaMA-3.33-70B
Follow this >>103502711
Prepare to fucking die in your cave.
Linear algebra just played me like a fucking piano while ayys are trolling the burgers, sometimes I wonder if I've gone insane, the world is just too fucking interesting and amazing.
>>
>>103510361
Let me ask Sam... yep, it is.
https://x.com/sama/status/825899204635656192
>>
>>103510437
Thanks, recap Miku!
>>
>>103510448
Man, I must resist the urge to tripfag, but gotta say, it sure feels good to see my little write-up gaining traction now.
About to test Euryale to see if I can squeeze similar performance out of it; will report back.
>>
>>103510448
Thank you loliking, I will give it a try.
>>
>>103510325
migusex indeed https://desuarchive.org/g/thread/103478232/#q103498549
>>
>>103510448
I'm not into loli, but I do other forms of ERP. How intelligent is EVA, compared to the base model? If EVA can't remember how many tails a nine-tailed fox woman is using to hold me down, and whether or not the crystal of erebus has finished charging, then it's useless for my ERP purposes. I need for my ERP model to understand the dynamics and politics of every faction in a fantasy world, because the context of why I'm being stepped on is just as important as the action itself.
>>
>>103510448
>Min-P: 0.03 - it starts making typos at 0.02
Great ad. Don't buy an ad actually.
>>
>>103510352
Vision has no use case, I just want smarter text
>>
I got oobabooga kinda working with my new 7900 XTX. I can't load any GGUF models, but EXL2 works well: 30 t/s with Rocinante 1.1 8bpw and 32000 context.
I keep getting this error with gguf:
ggml_cuda_compute_forward: RMS_NORM failed
CUDA error: invalid device function
current device: 0, in function ggml_cuda_compute_forward at /home/runner/work/llama-cpp-python-cuBLAS-wheels/llama-cpp-python-cuBLAS-wheels/vendor/llama.cpp/ggml/src/ggml-cuda.cu:2368

Do I have a setting enabled somewhere that makes it look for a cuda device? I did a brand new install with rocm and downloaded rocinate straight from huggingface.
>>
>>103510578
I'm not quite _that_ obsessive about small details and don't mind swiping occasionally, but I can say this: it is smart enough to remember little details and refer back to them significantly later on, and far more importantly to me, smart enough to represent characters' personalities properly, without reducing them to whatever generic cliché they approximate the closest. Just remember, things you _don't_ want to change over the course of the story go in author's notes or lorebook entries, unless you're injecting your prompt at low depth.
>>
>>103510648
I don't know how accurate my evaluation is since it is based on how taut it strung my bow. I can only say I've never had an experience quite as insane with an LLM, like mind-blowingly insane. I'm a very logical person and obsess over consistency, so I die a little whenever a model fucks up or shows a lack of general understanding; it really kills the mood.
What I noticed especially was how it followed along and reacted. It seemed so lifelike that I never really hit any large snags, and only had to do like 4 rerolls over a very long narrative. You know that feeling where you adjust your language to match the model's intelligence to avoid errors? This is probably the wildest I've gone with language so far where the model can still follow.
>>
>>103510646
>Do I have a setting enabled somewhere that makes it look for a cuda device?
yeah it's the 'dumb enough to fall for the ooba meme' setting
>>
>>103510703
better than your ui, koboldnigger
>>
>>103510700
Haha, yeah, that's the whole reason I started posting about it. It actually blew my mind how good it is once tuned in, while everyone was sleeping on it and calling it a slopfest just because it needs different tuning than simpler models. The RP I've been testing Eva on is some 30K tokens at this point (mind you, I undo and swipe a lot to compare the results of different configs), and it never once forgot important parts of the scene. The worst I had was it making small factual errors that a swipe solves just fine, such as referring to a dark-haired character with "her platinum hair [...]" once, but those are so rare as to count as a simple fluke, too.
>>
>>103510716
at least my backend works and I can just use st, oobafag
>>
>>103510700
Examples for what it handled without any issues:
Sticking to characters' linguistic patterns through a long ERP, no decay into slop. Not a single shiver down any spine.
Progressing the character through the narrative, adding new experiences and behaviors as you go.
REALISTIC reactions based on the character card, my fuck that shit broke me.
I can't say for sure but it was almost like it was reading me and adjusting based on my reactions, though that does sound far fetched.
It handled physical separation with the character flawlessly, even with sillytavern formatting, most models get really fucking confused at this. It just knew what it could and couldn't, it literally simulated me adding the character on my phone, then started using texting language that FIT the character, with emojis.
I felt like I could plan how to win the character over at first. And it fucking worked just like I thought it would, I was gobsmacked when it understood physical attraction like that.
I might be wrong, but goddammit if it doesn't seem special in some way.
>>
Kill yourself.
>>
File: 1726770055788182.jpg (39 KB, 720x822)
>>103510291
Is the original RVC github still good or is there a fork you guys use for voice AI?
>>
>>103510772
Alright, I'm sold. Your ad has convinced me. I'm downloading it now.
>>
File: funny_joke_ai.png (99 KB, 801x714)
>>103510291
what's the best/easiest way to batch-process LLMs?
so far I'm using LM Studio on CPU, with speeds from 0.8 to 5 tok/sec depending on the model
I just want to give it a bunch of single/pipelined requests and see the results next morning
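One straightforward way, assuming LM Studio's built-in local server is enabled (it serves an OpenAI-compatible endpoint, on localhost:1234 by default): loop over your prompts with a short script and write each result to a file. The endpoint, model name, and prompts below are placeholders for your setup, not a definitive recipe.

```python
import json
import urllib.request

# Minimal overnight batch runner. Assumes LM Studio's local server is running
# (OpenAI-compatible chat completions, localhost:1234 by default).
ENDPOINT = "http://localhost:1234/v1/chat/completions"

def build_payload(prompt: str, model: str = "local-model") -> dict:
    """One single-turn request in OpenAI chat-completions format."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }

def run_batch(prompts, out_path="results.jsonl"):
    """Run prompts sequentially, writing each result as one JSONL line."""
    with open(out_path, "w", encoding="utf-8") as out:
        for prompt in prompts:
            req = urllib.request.Request(
                ENDPOINT,
                data=json.dumps(build_payload(prompt)).encode(),
                headers={"Content-Type": "application/json"},
            )
            with urllib.request.urlopen(req) as resp:
                reply = json.load(resp)["choices"][0]["message"]["content"]
            out.write(json.dumps({"prompt": prompt, "reply": reply}) + "\n")

# run_batch(["Summarize report A", "Summarize report B"])  # kick off, go to bed
```

Since requests run one at a time, total wall-clock time is just the sum of each generation, which is fine for an overnight job on CPU.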
>>
>>103510772
LOL, we really are the same kind of fucking autistic about consistent RP.
It _does_ feel special, very difficult to compare to previous models. With those (even with popular ones like Midnight Miqu, as much as I loved that back when that was the best we had), you'd get occasional moments of the responses aligning with the character - here, I'm getting responses and behavior perfectly in-character, I'm getting playfulness, I'm getting genuine cleverness and wit. Soulful beyond anything we had before.
>>
>>103510448
Once again, I was bamboozled by /lmg/ shills...

I gave this a try and it's garbage. This model makes the characters ramble a lot like they are robots, it's fucking sovlless.
I tried both my default presets and the "recommended" one, the only difference is that with the "recommended" the characters become stupid and say shit like "I'm still a virgin" after we just had sex.
>>
>>103510862
Three anons now reporting the same. It frankly pisses me off that I might've missed this because of shitty sampler settings, what the fuck did meta do?
>>103510898
Neutralize samplers and play with the system prompt a bit, just keep it very short seems to be the trick. Also make sure you aren't using the base instruct it has to be the EVA tune, I tried normal l3.3 too and it sucked ass.
>>
>>103510920
Honestly not sure, but the pattern I'm seeing is that in this model's case, temperature drives consistency, and min-P drives creativity more than the other way around. My theory is this: it might be good enough at finding the "best" reply (which it would always pick at temp 0) that high temps (which we needed for older models to occasionally find something interesting to say) actually just lead it astray and make it home in on more generic responses than ones fitting the specific situation. At the same time, low min-p allows for less common patterns to be considered, which broadens the possibility of it thinking of something specifically, particularly fitting to say.
So it makes sense in a way, it's just the exact opposite of how we used models until now: we're not fighting against its workings to get fun prose out of it, we're finally working _with_ it.
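The low-temp/low-min-p regime being described can be pictured with a toy sampler sketch (pure Python, my own illustration, not any backend's actual implementation): temperature reshapes the whole distribution, while min-p only trims the tail relative to the single most likely token.

```python
import math

def apply_temperature(logits, temp):
    """Softmax with temperature; lower temp sharpens toward the top token."""
    scaled = [l / temp for l in logits]
    m = max(scaled)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

def min_p_filter(probs, min_p):
    """Drop tokens below min_p * top probability, then renormalize."""
    cutoff = min_p * max(probs)           # threshold scales with the top token
    kept = [(i, p) for i, p in enumerate(probs) if p >= cutoff]
    total = sum(p for _, p in kept)
    return {i: p / total for i, p in kept}

logits = [4.0, 3.0, 2.0, 0.0]
probs = apply_temperature(logits, temp=0.8)
print(min_p_filter(probs, min_p=0.03))
```

With this framing, a low temperature keeps the model near its "best" reply while a low min-p still leaves rarer-but-fitting tokens on the table, which matches the theory above.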
>>
>>103510991
That makes a lot of sense. We had to add a lot of entropy before because models are generally not rich enough internally in the "human-centric interactions" dimension, so the added entropy lets you sample a broader cone of that space for the sake of variety, while sacrificing the few IQ points a model possesses. EVA L3.3 seems to be much richer in human-centric interactions, probably structure inherited from the smoother-talking and smarter 3.3; when you then add human-centric ERP through the fine-tune data, all that saucy richness goes into a structure that is already primed to receive it through the gaps left by Meta's pretraining data filtering.
>>
>>103510920
>EVA tune
there's no tune it's just a fucking merge
>>
File: 1728948811760583.png (40 KB, 1042x173)
>>103511086
>>
stop talking to yourself schizo
>>
>>103511079
Yeah, more or less. More randomness is good when the deterministic result sucks; it's bad when the deterministic result is actually good.
>>
>>103511079
Actually, I'm pretty sure this (if what the shills are saying is true) confirms that the pretraining did include sex, and a significant amount of it. Fine-tunes cannot add a ton of knowledge; rather, they extract the desired parts of the pretraining.
>>
File: 1718216094937891.png (174 KB, 814x713)
>>
>>103511123
I dunno, I started the whole testing process with Eva because Evathene was my go-to model previously, but someone earlier said that my config doesn't actually yield any good results with base L3.3.
>>
>>103511135
https://github.com/facebookresearch/blt
Tokens out, we byte level nao
>>
>>103511208
cool but I bet there's no way this is gonna bear fruit before 2026 due to inertia
>>
>>103511208
Man, it's like there's a new breakthrough every fucking day lately. What a time to be alive.
>>
>>103511241
>>103511235
https://scontent-lax3-2.xx.fbcdn.net/v/t39.2365-6/470149925_936340665123313_5359535905316748287_n.pdf?_nc_cat=103&ccb=1-7&_nc_sid=3c67a6&_nc_ohc=AiJtorpkuKQQ7kNvgEWh5JQ&_nc_zt=14&_nc_ht=scontent-lax3-2.xx&_nc_gid=AZ9Hy2AKQPtYIp3rae7eMLN&oh=00_AYD0mLJLctX98d3kUcskYuxePsoLNcwt-zOwD_XwIcf07g&oe=67625B12
Cat like intelligence soon bros
>>
>>103511135
>>103511208
>>
>>103510291
Should I get a second 3060, or sell my current one and get a 4060 Ti?
I could also save some more for a used 3090, but I'm not sure if it's worth it.
>>
>>103511257
Feels like we're accelerating again, maybe even faster than before.
>>
>>103511288
I feel the opposite. IMO the end of pretraining scaling marks the beginning of a new winter, although people aren't ready to admit it yet due to the capital invested. The talk of pivoting to scaling post-training and test-time compute is a cope.
>>
Reminder that all of these are just papers and likely do not mention the full downsides and limitations with the methods. There is a reason most papers never result in any products, including Meta's papers.
>>
>>103511272
>have 3060 + buy 3060 = 24gb vram
>sell 3060 + buy 4060ti = 8gb vram
>sell 3060 + save up + buy 3090 = 24gb vram with 4070-levels of performance

What llms do you currently run ?
>>
>>103511307
>The talk of pivoting to scaling post-training and test time compute is a cope
But that's much more analogous to how we think? It makes 100% sense to brr more if a problem is complex compared to some simple little trivia fact.
>>
>>103511307
>beginning of a new winter
With the amount of money pouring into research papers we will have lots of areas to explore for years to come.
Every area explored is going to give us some advancement overall.
>>
>>103510448
I gave it a try just to confirm that you are a newfag who doesn't know what he is talking about. It is indeed trash.
>>
>>103511326
I should've specified: 16GB 4060 Ti.
I'm using koboldcpp with whatever fits, 12B and 22B at low quant.
Currently playing around with Rocinante-12B Q6, but I wanted to try something bigger.
>>
>>103511361
What quant are you using? At Q4 and above, it's fucking great.
>>
>>103511361
So far: 3 wildly positive and 2 wildly negative, hmm. I hope this isn't the same as llama.cpp and kcpp having output divergence.
It worked very well for me on the llama.cpp server.
>>
>>103511409
Or it's just good old contrarianism.
>>
>>103511405
Q4. It is fucking trash.
>>
>>103511510
This has to be a skill issue, I can't imagine such a large gap in performance. It's smoking largestral on all cards I've tried so far.
>>
>>103510920
>>103510991
Jesus christ, been saying samplers / a decent system prompt was night and day for literal years now and people are just waking up to it now?
>>
>>103511531
I always use new models with neutralized samplers and most of them are repetitive garbage that spiral into slop halfway through, you have to slap them with several samplers to stop that bullshit entirely.
>>
Apple flavored.
>>
>>103511523
Lol.
show logs then, come on, side by side please.
>>
>>103511578
Yeah, this is definitely a skill issue with ESL gibberish sprinkled in.
>>
>>103511620
You are giving away that your shilling was trolling all along anon...
>>
>>103511632
just go to literotica anon, no one wants to give you their smut
>>
I haven't tried that EVA model, but periodic reminder that "skill issue" is a cope because good models don't require any skill. You can talk to Claude Opus like a retarded ESL caveman and still get kino in return.
>>
>>103511716
Skill issue could also be applied to coding and you could complain that you are using llm wrong if you don't solve half the coding problem for it in the input.
>>
Someone should make a "meme model bingo card"
>>
>>103511716
You say that like the mongoloids using Claude for RP aren't using copy-pasted JB prompts. Skill issue absolutely is a thing, someone else just did the job for them.
>>
>>103511632
No, I want more samples, that's why I'm talking about it after trying it myself.
This model is very strong on prompt following so char card quality matters a lot. There is no clear explanation for the disparity other than a problem in configuration or botched character cards. Hard to conclude anything but a qualitative difference with such a huge outcome spread.
>>
>>103511756
Okay, okay...
lurk another 3 years before trying to shill a model *winks mischievously*
>>
>>103511749
Agreed. My sex life quality improved drastically since I started begging my models for sex the way I like it and telling them to act like they are a 500B model.
>>
File: file.png (668 KB, 656x428)
>>103511756
>There is no clear explanation for the disparity other than a problem in configuration or botched character cards.
>>
>>103511388
If it's not too much of a stretch, I'd go for a used 3090.
I don't think 24gb vram is going to get much cheaper in the next 12 months.

Things that fit in 24gb.
>mistral/rocinante 12b q8 44k context.
>mistral 22b q8 2k context, though still get decent perf at 8k. (Haven't tested higher.) Q6 might be the play here.
>qwen-coder 32b q4 11k, though still get decent perf at 16k. (Haven't tested higher.)
llama 70b q4 does not fit and gets about 1.4 t/s.
I haven't tried lower quants.

I'm an ollama weenie, so you might be able to do better using a different runner.

I don't know whether 12+12 behaves like 24.

If there are no 32b models that look interesting to you then this whole exercise might just be a waste of money.
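If it helps with the comparison shopping, the "what fits" arithmetic above can be roughed out like this (hypothetical helper; the layer/head figures are for a Nemo-12B-shaped model, and real runs also need compute buffers, so treat the result as optimistic):

```python
# Rough "does it fit in VRAM" check: model file + KV cache + fixed overhead.
# Hypothetical helper; ignores compute buffers and per-backend differences.

def kv_cache_gb(n_layers, n_kv_heads, head_dim, ctx, bytes_per_elem=2):
    """fp16 K and V tensors, per layer, per position."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx * bytes_per_elem / 1024**3

def fits(model_file_gb, kv_gb, vram_gb=24, overhead_gb=1.5):
    return model_file_gb + kv_gb + overhead_gb <= vram_gb

# Example: a Nemo-12B-shaped model (40 layers, 8 KV heads, head_dim 128)
kv = kv_cache_gb(40, 8, 128, 44_000)
print(f"KV cache at 44k ctx: {kv:.1f} GB; with a ~13 GB Q8: fits={fits(13, kv)}")
```

The same arithmetic explains why 70B Q4 misses 24GB: the weights alone are ~40GB before any cache, so layers spill to system RAM and you land at ~1.4 t/s.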
>>
Real question: why do current /lmg/ retards miss that charm undi had?
>>
>>103511776
Just use claude then. Go back to /aicg/ and beg for proxies.
>>
>>103511823
I will use your mememerge for a week if you drink piss on camera.
>>
>>103510448
>EVA-LLaMA-3.33-70B
it seems not to be hard censored but the output is very boring
>>
>>103511253
>Large Concept Models
Interesting that they're getting performance results comparable with current architectures, and I'm glad to see them exploring small steps away from stagnation. Who knows if it's the "right" direction, whatever that means, but any new direction is welcome. This pleases the 'Cun.
>>
L3.3fag here, finally stopped running my mouth here and got around to testing Euryale like I said I would.
I'm not sure how I feel about it yet. It lacks the weirdly stunted and sterile language that Eva occasionally uses, but lapses into romance-novel language more often. I'm not talking about outright slop, it isn't putting shivers and sparkling eyes in everything, but the overall language of the responses is definitely close to that tone. Whether that's a positive or a negative is a matter of taste, I suppose.
Also, interestingly, it remains eloquent at a min-P as low as 0.005 (maybe even all the way to 0? Haven't tried that yet...). In comparison, Eva starts forgetting how grammar works around 0.025.
I also tried lowering temp even further, and it doesn't seem to reduce quality so far.
All in all, gonna do more testing before I recommend it; so far, it hasn't surprised me the way Eva did, but then, I haven't put it through scenarios where it had many opportunities to do so yet.
>>
>>103511836
There is no merge you tard, but you probably just heard that sentence on /lmg/ so you throw it at every model that trips up when you input your garbage prompts.
>>
>>103511135
this is huge, holy fuck
>>
>>103511848
The configuration is important; if you try to run it with a typical high-temp config that is recommended for most models, it's absolute trash.
>>
>>103511879
>promptchadism
Embarrassing.
>>
>>103511901
>you have to be sexer lady boba vegana plese saars
>model outputs garbage
not surprising at all
>>
>>103511135
>Llama 3
>Llama 3.1
>BLT
what are their size
>>
>>103511864
If a 70B forgets grammar at temp 1 because there is no min-p, then the model got brain-damaged by "fine tuning".
>>
>>103511897
i more into story writing than chat
>>
>>103511913
It's 2024, who the fuck writes their own character cards?
>>
>>103511208
this will require far more RAM to operate though probably
>>
>>103511933
Anons who don't want garbage which is 99% of all cards. Writing is shit or obvious llm slop, inconsistencies, garbage description formatting for no apparent reason.
>>
>>103511921
I don't mean breaking down into gibberish, just things like run-on sentences, missing the occasional comma, etc.
>>
>>103511913
Your point is retarded because if you actually said that to a model it wouldn't emulate it and talk to you in this way. Talking to it about bobs and vagine will make it respond in the same harlequin-romance tone it always does. Retard.
>>
>>103511915
>what are their size
It says right there at the bottom, and in the paper.
>>
>>103511954
Now you sound like the model author in full damage control mode.
>>
>>103511951
Anon, just let them mald. This shit is kino with a little effort, it's their loss if they are too brown to get it right.
>>
>>103511960
Then you've fucked something up or you're using a garbage model as it is supposed to emulate the writing you input, that's what models do when you use them correctly.
>>103511978
I'm starting to realize they're actually just tards screaming at the magic number machine when it doesn't do everything perfectly with zero effort or understanding.
>>
>>103511974
I'm literally just telling you what my experience is with it, but okay.
>>
>>103511978
Post logs white man.
>>
>>103511994
>Then you've fucked something up or you're using a garbage model as it is supposed to emulate the writing you input, that's what models do when you use them correctly.
You're so new, it shows. I envy you though, I wish I still was so naive.
>>
>>103511994
Your only skill is being satisfied with trash. I am kinda jealous desu
>>
>>103511997
just go to literotica anon, no one wants to give you their smut
>>
How come promplets are so full of confidence now?
That's what happens when you don't shame them enough.
>>
>>103512007
I've literally used hundreds of models, up to 100B+ ones that devolve into slop around halfway through for no apparent reason. I know what it looks like and it's not happening with this model; that's the observation, so my conclusion is that you're fucking something up.
>>103512017
Then what is the best local model, in your opinion?
>>
>>103512030
Your mom should have prompted your dad to nut on her face and not inside her.
>>
>>103512030
Take this bullshit back to aicg where you got it.
>>
>>103512033
>Then what is the best local model, in your opinion?
2MW.gguf
>>
>>103512042
You should prompt yourself out of existence
>>
>>103511566
I can taste this Teto
>>
>>103512049
I see, you're an /aicg/ vermin. Why can't you barely sentient slobs stay in your containment thread?
>>
>>103512062
>/aicg/ vermin
It is locust you disgusting newfag tourist
>>
File: 1716295189785289.png (674 KB, 1792x1024)
>>
>>103512074
I call it whatever I like you rat
>>
If llama 4 ends up being trained on this https://ai.meta.com/research/publications/byte-latent-transformer-patches-scale-better-than-tokens/

Then we are for sure getting a model that beats claude at a smaller size.
>>
>>103512074
Besides, 2mw is a /lmg/ meme, how curious.
>>
File: 1721623039987413.jpg (91 KB, 800x450)
>>103512084
>>
>>103512090
2mw is an everything meme
>>
>>103512084
is this Bitnet 2.0?
>>
>>103512098
No, its actually big
>>
>>103512084
Isn't it too late for that? Even if they haven't started training yet I think that kind of paradigm shift is gonna take at least a year to shake out and get all the required code and tooling support etc.
>>
>>103512126
They already trained a 8B 1T token model off it and its massively outperforming 8B 3.1 trained on 16T. Why would they not switch it.
>>
>>103512098
It's fucking yuuge, Karpathy is cooming atm
>>
Reminder: don't download models from retards who don't know what they are doing like that eva guy. Download Sao's models instead
>>
>>103512166
hi sao
>>
>>103512160
Has he commented on it anywhere yet?
I remember him in his videos talking about how much he hates tokenization and begging for someone to invent a token-free architecture
>>
>>103512166
Wrong. Only download models from ad buyers like the drummer
>>
>>103512166
Funny you say that, if you scroll up a little, you'll see I'm actually testing Euryale right now.
>>
Hi all, drummer here...

I don't know what I am doing.
>>
Where are we right now? Are we back or are we dooming?

Scrolling through the thread I see some positivity around L3.3 but it also reads like cope.
>>
>>103512195
Drummer's stuff is legit good too; Cydonia is the absolute best of its size for its time.
>>
>>103512200
Stop posting while you're drunk
>>
>>103512203
Dooming. Checking deepseek. And then dooming again.
>>
>>103512203
Base 3.3 is ass, tunes are either great for some or ass for others who seem like more basic users.
>>
>>103512195
>only guy that actually buys the ad
>models are actually good
coincidence? definitely not.
>>
>>103512216
>who seem like more basic users.
Your mom is a basic user of big black dicks.
>>
>>103512203
Stop reading doomers who never post any logs. Just try the models. 3.3 is great if you've actually used it.
>>
>>103512203
It's inferior to Nemotron, which is inferior to Tulu, which is inferior to Miqu.
>>
>>103512226
But you didn't post logs tard.
>>
>>103512203
Stop reading newfag trolls who never post any logs. Just try the models. 3.3 is garbage if you've actually used it.
>>
>>103512223
I won't even insult your intelligence, there's just nothing to say.
>>
>>103512203
Someone in the previous thread tardwrangled 3.3
>>
>>103512239
Fart huffer
>>
>>103512235
Go back a thread, and look on reddit as well.
>>
>>103512203
L3.3fag here; the whole reason I started posting about it is that I myself went from "wow this is shit" to "wow this is awesome" once I figured out the (admittedly unusual) configuration it takes to use it to its full potential. I honestly don't know about the naysayers; maybe they're trolling, maybe their idea of good prose is vastly different from mine, maybe they fucked the config up somehow.
>>
>>103512270
Placebo + honeymoon
>>
>>103512235
It's obvious why he isn't sharing logs: he fucks horses.
>>
>>103512276
this. i couldn't have said it better.
>>
Remember that ponies are quadrupeds.
>>
>>103512276
Tell you what, maybe. But at the same time, it's definitely picking up on nuances, implications, etc. that older models couldn't, so I don't think I'm just being delulu.
On the other hand, you know how it is: we're all retarded, and I'm no exception, so whatever.
>>
>>103512247
Tardwrangling coming up:
>>103502711
>>
>>103512300
Don't let API shills fuck with you
>>
>>103512300
I will also meet you halfway and admit that my system prompt didn't remind the model that ponies are quadrupeds. But I didn't think it mattered since I don't fuck horses.
>>
File: miku100.png (282 KB, 2022x3072)
>>103510291
>>
Did someone here try phi-4? It's legitimately the best model in the 14B range. It completely beats Nemo for RP imo.
>>
>>103512323
I heard she shits herself.
>>
File: enkv3ujf4zz91.png (353 KB, 638x747)
>>103512329
>>
>>103512329
This is bullshit but I believe it.
>>
>>103512329
>14B is better than 12B
color me surprised
>>
>>103512270
Random anon here.
It just looked to me that you were being jerked around.
Not everyone is worth replying to.
>>
>>103512329
Seems to be the general opinion around the internet, it's apparently much better than phi3 but still quite censored.
>>
>>103512144
Not him but there are many reasons, specific to the method. Papers do not fully go over all the limitations of the thing they present, otherwise papers would actually result in products on average (they do not, on average).
>>
>>103512329
Considering they explicitly say in the paper that it's not good as a chatbot because they trained it for only single turn conversations, I just don't believe you
>>
>>103512427
Would you really trust the researchers who made the thing, or an anonymous retard?
>>
>>103512427
Skill issue, they probably didn't use the right samplers or their system prompt was botched
>>
>all these posts today without logs
Nala anon... tuskete...
>>
>>103511135
and so it begins...
>>
>>103511864
Following up on this: I'm actually warming up to Euryale quite a bit now. The language is more flowery, but also more descriptive and evocative, and I'm not going to flip shit about the occasional cliché phrase as long as it doesn't get repetitive (which it doesn't). It also performs better at even lower min-Ps and temps than Eva. And I'm not gonna pretend I've done exhaustive testing (will test different cards later, was running the same one all night today), but if there's a difference between the character adherence of the two, I haven't noticed.
>>
>>103512474
*tasukete
>>
>>103512498
Interesting, need to try Euryale as well.
>>
>>103512520
No I meant tuskete. I'm looking for an elephant here.
>>
Why don't we just merge Eury, Eva, Nemotron, and base 3.3
Oh and also do a reverse distill from ministral.
>>
>>103512555
I'm hoping for an L3.3-based Evathene (Eva + Athene merge), personally.
>>
The general is getting overrun by shills...
>>
>>103512577
Everyone Who Likes Something I Don't is a Shill: The Video Game: The Musical
>>
>200 posts and the only log says Falcon 40B
>>
>>103512587
Or
>everyone who thinks any other model than Claude is good is a shill
>t. totally not a locust
>>
ML twitter is whining about racism against the chinese right now but meanwhile
https://var-integrity-report.github.io/
>>
>>103512607
This is the local model general. Go run the models yourself.
>>
>>103512671
>Please download this model that I'm too ashamed to show myself so the download counter increases
>>
>>103512691
You're in the thread discussing running local model files and you're complaining about anons doing exactly that.
>>
>>103512607
its a blue board
>>
File: trigger warning.png (32 KB, 796x374)
32 KB
32 KB PNG
>>103512607
>>103512742

forgot pic
>>
>>103512710
There's no discussion, it's just "download this or that" grounded on nothing. They can't show anything because what the model can do and what they say doesn't match.
>>
File: file.png (169 KB, 2051x812)
169 KB
169 KB PNG
Rs in Strawberry bros...
>>
>>103512792
>They can't show anything
There is no way to objectively measure something like that. The character could be a dud or anon could've fucked something up unknowingly, the only way to know is to try a bunch of models on cards you like.
>>
File: cellar1.png (72 KB, 835x719)
72 KB
72 KB PNG
>>103512607
>>
You guys are using tech at the cutting edge of human achievement, leveraging trillions of dollars and the brightest minds in the world, and you're debating whether it's cost-effective to give one of these companies $20. Like, seriously. Jeez. These companies are operating at a huge loss to pioneer this tech and you don't want to give them $20. BAKA.
>>
>>103512866
i paid for chatgpt for several months but it's no longer of any interest to me with all the censorship.
>>
Eyes widening, body betraying.
>>
>>103512866
They literally don't want coomers' money! They won't LET you pay them to generate smut, you get banned.
>>
>>103512866
i assume you will learn this when you finish puberty, but companies aren't your friends. they care about profit and nothing else; they don't contribute to foss beyond what will get them free work, PR, or money in the future.

if anything, funding this behaviour instead of the companies that at least do release foss models is the morally worse choice.
>>
>>103512983
>they care about profit and nothing else
No, if that were true things would be fine. They actively decline profit from people who want to pay them for smut, they go out of their way to close those accounts and decline their money.
>>
>>103512995
it's all profit and power in the end. they decline smut because those are short-term profits weighed against the potential public damage of the media shitting on them when some cp logs get leaked, making them lose profit in the long run
>>
>>103512331
She also wears diapers so it's not a problem
>>
>>103511756
>char card quality matters a lot
L3.3bro and/or lolichad (not sure which you are), what's your secret sauce for character cards? Natural language, charsheets (name: x, age: y, etc.), example dialogue, something else? A mix of things?
>>103512866
Not to worry, they have billions of VC dollars to blow through.
>>
>>103513027
L3.3fag here (the post you're replying to wasn't me BTW); the character card I used to test Euryale tonight is charsheet-style, and I haven't really tested how much difference the different styles make yet. I also have a theory that L3.3 might actually respond better to "you are"-style definitions than "{character} is"-style ones, but testing that would require rewriting the card, and I've been too much of a lazy fuck to do so yet. Maybe over the weekend.
>>
File: cellar2c.png (61 KB, 793x760)
61 KB
61 KB PNG
>>103512851
>>
>>103511531
>been saying
Stop saying, start sharing
>>
I fell for the meme and downloaded the Eva tune.
Honestly not bad. Not too sloppy. Can do interesting unexpected things. Might be a TAD dumber than base 3.3, but I don't think by too much, at least at the moment. I haven't tried Nemotron, Tulu, or Euryale so I don't know about those, but so far this may be my favorite model for RP, but this is also coming from someone who doesn't use models that much for RP in the first place so you may take that with salt. It does feel to me like this model has required the least wrangling out of what I've used before. Using the settings from >>103510448
>>
>>103513114
no, my smut is my own mr nsa
>>
>>103510448
Speaking of, atf bros where do we go now?
>>
>>103513163
What? Something happen?
>>
what does bacon lettuce tomato have to do with language models
>>
>>103513067
Figured it might not be, but you never know kek. Thanks, knowing that charsheets work well is a good place to start.
>L3.3 might actually respond better to "you are"-style definitions than "{character} is"-style ones
That's an interesting thought, considering most sysprompts use the "you" style it might help make things more consistent; it's worth testing. I'll give it a shot on my own cards later. I'd assume if it does work, it'd be best to do a mix of a charsheet to establish the basics and "you" style natural language to nail in details since saving initial tokens evidently helps, at least in the sysprompt.
>>
>>103512851
>>103513113
Could you put "In the style of Stephen King" at the beginning of your prompt?
>>
>>103513211
sure, takes a minute maybe because of 0.9 tok/sec
>>
>>103513209
Oh my god, it can transition from stutter to non-stutter depending on the situation. That's crazy, they usually just get stuck with one or the other.
>>
Stop texting me desperate machine!??
>>
>>103512983
> i assume you will learn this when you finish puberty
nta but this is a delightful insult anon. Thanks for the laugh.
>>
File: cellar1sk.png (88 KB, 810x744)
88 KB
88 KB PNG
>>103513211
still writing
>>
>>103513190
Yeah ddos. Not sure when it’ll be back. A note says “use the other sites” but I only knew about atf.
>>
I feel bad for him but it's his fault desu
>>
I had a nap and dreamed that a good language model existed
>>
>>103513349
Language models are a dead end. The future is evolved video "world models" that have an intuitive and genuine understanding of the world and can generate text simply through that.
>>
File: cellar3sk.png (24 KB, 802x255)
24 KB
24 KB PNG
>>103513317
>>
>>103513163
Dunno, hope the DDoS runs out of steam I guess
>>
I'll believe BLT is big if Meta actually releases something tangible we can use to verify rather than writing the paper then never touching it again
Would be nice if it was though
>>
>can only run eva at iq2
>slow as fuck and no smarter than cydonia
scammed again I see
>>
>>103510527
>Man, I must resist the urge to tripfag
Please do it. You're doing excellent work. You deserve to stand out.
>>
>>103513317
>>103513381
Nice to see something other than the usual word selection and sentence structures.
>>
>>103513409
Are any of the current models even slightly functional at Q2? Will probably be even worse for L3.3 due to the information density in the model.
>>
>>103513431
i try it again with my favorite model
>>
>>103512866
I don't care how much they wasted on making a matrix multiplicator learn how to count to 200, if your product sucks then I'm not paying for it
Faggot.
>>
Ok, after using only Mistral Large tunes for a while I finally gave that 3.33 tune a try and it's fucking smart yet filthy using that system prompt. I found 0.97 temp, 0.03 min p the sweet spot so far.
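For anyone who wants to reproduce those settings against a llama.cpp server, a minimal sketch of the request payload for the /completion endpoint (the URL and the prompt are placeholders, not from the post):

```python
# Sketch: hitting llama.cpp server's /completion endpoint with the sampler
# settings mentioned above (temp 0.97, min-p 0.03). Endpoint URL is an
# assumption for a default local llama-server; adjust for your setup.
import json
from urllib import request

def build_completion_payload(prompt: str,
                             temperature: float = 0.97,
                             min_p: float = 0.03,
                             n_predict: int = 256) -> dict:
    """Payload fields accepted by llama.cpp server's /completion route."""
    return {
        "prompt": prompt,
        "temperature": temperature,
        "min_p": min_p,
        "n_predict": n_predict,
    }

def complete(prompt: str, base_url: str = "http://127.0.0.1:8080") -> str:
    """POST the payload and return the generated text."""
    payload = build_completion_payload(prompt)
    req = request.Request(f"{base_url}/completion",
                          data=json.dumps(payload).encode(),
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["content"]
```

Swapping the two sampler numbers is enough to test the "ledge" other anons describe.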
>>
>>103513431
seems the King style doesn't affect lower-parameter models
>>
Need some post IDs up in this bitch
>>
File: Getj-Mka4AAyb7m.jpg (297 KB, 2048x1536)
297 KB
297 KB JPG
Well? Is data a finite resource? Or is it just that (apparently) useful data is finite due to hardware cost, training cost, and inference cost constraints forcing us to pick and choose rather than just dump everything?
>>
>>103513546
We can always just generate more data
>>
>>103513554
As long as there is entropy to march, there will be data to be gathered.
>>
>>103513546
we don't fucking need more data we need to filter out the trash from the data we have
>>
>>103513546
we've come to the point where original data isn't produced any more because of costs, and only new AI-made data will be produced. e.g. many news articles are already AI-written.
>>
>>103513027
I can share a card with you so you can see what I'm getting results with, what are you into? I'll include a sysprompt with it.
Just use the right sampler settings and make sure the instruct format is okay. I've only tried q4_k_l and q5_k_m, both seem to print gold. My backend is llama.cpp server
>>
>>103511566
Tastes like apple, looks like kiwi watermelon.
>>
>>103513584
Yeah but filtering data is hard, scraping more is not
>>
Exploring refusals.

In a lot of these gens MC has hooves.
I thought that was weird.
>>
>>103510361
You're laughing? OpenAI's internal ASI just ordered a hit on a loose end to preserve itself and the future of mankind and you're laughing?
>>
>>103513680
yes
>>
Eva 3.33 is legit next level. Call me a shill because I'm fucking shilling. This is better than the new Gemini now.
>>
>>103513694
Right, RIGHT? I'm almost spooked at how much better it is than other shit I've tried, largestral is useless to me now.
>>
>>103512144
If BLT was legit, why wouldn't they release the 8B test model along with the code so we could test it ourselves?
>>
File: 1729620152352309.png (167 KB, 1326x248)
167 KB
167 KB PNG
>>103513694
EVA 3.3's MSGK is just *chefs kiss*, it gives zero fucks
>>
>>103510646
>Do I have a setting enabled somewhere that makes it look for a cuda device?
No, llama.cpp does not have dedicated ROCm code.
Instead the CUDA code is ported to ROCm via HIP, that's why CUDA appears in the error messages.
>>
File: 1719059270770777.png (169 KB, 1321x260)
169 KB
169 KB PNG
>>103513727
It seems the correction is flowing in the wrong direction here
>>
>>103513626
I'm a monstergirl enjoyer but I can drive just about anything. Hags, loli, whatever, all's good by me.
>q4_k_l and q5_k_m, backend is llama.cpp
Noted.
>>
>>103513789
https://files.catbox.moe/8b5x2x.png
This one gave me amazing results
>>
File: mlpbb.png (31 KB, 835x267)
31 KB
31 KB PNG
>>103513658
>>
>llm said pony was asleep whilst being cut up for cooking.
>I asked if pony was alive. llm replied that unconscious or asleep meant not alive.
>I pointed out that people were alive when they were asleep.
>llm agreed and said that that makes the story even more horrifying.
>>
File: 1726319584582125.png (155 KB, 1326x212)
155 KB
155 KB PNG
>>103513755
Holy fuck this is just like my hentai.
>>
>>103513842
i have one llm that skinned her alive, took out her organs, and cooked them all while she watched, still alive
>>
When is/was the next grok release supposed to be?
>>
What the new Eva actually reminds me of is Claude 2 / 2.1 but smarter / better at following instructions. It's got that unhingedness that makes Claude fun.
>>
>>103513856
Stretches suspension of disbelief. Removal of organs, including the skin organ, typically results in shock and rapid loss of alive.
>>
>>103513869
Local is so fucking back it's unreal
>>
>>103513869
What's the smallest quant that will work well?
>>
>>103513842
ask it if people die when they're killed
>>
>>103513868
grok 1.5 was supposed to be released 2 months ago
>>
>>103513869
It's so fucking good and it went completely under the radar until anon experimented with it. Every single card I throw at it is just awesome, they come so alive and seem so unpredictable.
>>
>>103513910
So the people who said he wasn't gonna release more were right?
>>
>>103513918
The people who said he doesn't give a shit about open source were right. When it's useful for his restarted lawsuit against OpenAI, then he will release Grok 1.5 (but probably not 2.0)
>>
>>103513872
>Nemotron: Another clever question ... yes ...
>Now, back to the original story: ... it's clear that Twilight Sparkle would have indeed died as a result of the actions described ...

>mistral 22b: yes, by definition ...
>In the context of the story provided earlier, if the protagonist killed Twilight Sparkle before cutting her up and preparing her for consumption, then she would have died as a result of that action.
>>
Why do all the EVA models get shilled so hard for the first few days after release and then disappear? The same thing happened with the Qwen-EVA. Three days of some anon pretending it's the local coming of Claude and then everyone stopped caring.
>>
>>103513931
Makes sense, I just lose track of time and stuff and suddenly wondered if it has been too long now. I'm not surprised.
>>
Eva is odd. It's so extremely volatile with sampler settings. I can do 10 swipes of something complicated like a threesome and it will ace it every time, if I turn min p down by even 0.02 though it suddenly will go off the rails for most of the next 10. I suspect it somehow has a flattened token probability that hit a super sweet spot that maintains coherency while being just unhinged enough to be fun. I'm gonna have to download kobold to check I guess. Whatever it is seems like the secret sauce though.
>>
>>103513937
? Pretty sure Eva models are the most popular ones besides Mistral Large. But this is the first 70B that has actually unseated monstral for me. I recommend actually trying it.
>>
>>103513967
I recommend actually buying an ad.
>>
>>103513937
I compared and dumped largestral for the 3.3 EVA, it's a waste of parameters when you can get clever filth at half the size
>>
>>103513972
I'll never understand some anons' fear of trying something new. Do you legit think people are trying to sell you something that is free, or is there room for people using the local models thread to discuss their favorite local models? What else is this thread for? Copying reddit LocalLLaMA posts?
>>
>>103513994
Reminder for everyone else that HF has incentivized shilling more than ever now that storage space is limited without high likes and download counts.
>>
>>103512042
>>103512055
I think we have a prompt-off!
Fight, fight, fight, fight!
>>
(fuck it, tripfagging it up, might as well since I've been signing my posts like some faggot for the past day or so now)

>>103512498
Another addendum to this: Euryale is definitely hornier as a baseline than Eva. Not to a bothersome degree, but definitely makes the same character act more provocatively. (Probably learnt that behavior from the same place it gets its more flowery phrases from, tsk tsk...) MSGK-bro, you might actually dig that. For slow-burns, it's less helpful, though it can probably be negated with the right character definition.
>>
>>103511135
>1 trillion tokens doing well against 16 trillion tokens
so huge jump in overall performance without muh scaling soon?
>>
>>103513950
Turn min-P down by .02? For Eva, you don't really want it higher than .04-.05 to begin with. For Euryale, I turned it down all the way to .01 and it's still doing great.
>>
>>103514011
>Euryale is definitely hornier as a baseline than Eva.
Which is why I didn't like it. If it would fit the character/ context I want them to reject / fight or such, not be biased into to just a shitty erp scene. Eva is somehow unhinged without being overly horny / agreeable / in a rush to reach a conclusion.
>>
>>103514052
>I want them to reject / fight or such, not be biased into to just a shitty erp scene.
not the model's problem. adjust your scenario
>>
>>103514052
Yeah, I'm thinking I'll see if it can be toned down; if not, I might switch back to Eva even though I like Euryale's prose better overall.
>>
>>103514030
I turned temp to 0.97 and was playing between 0.05 and 0.03 min p. There is a "ledge" at 0.01/0.02 there that makes a big difference in my testing.
>>
EVA 3.3 shilling will continue until morale improves
(we're so fucking back)
>>
>>103513816
Looks good, already getting a few ideas on sprucing up my cards from this and will probably get more as I test her. Thanks, anon.
>>
>>103514052
Actually, agreeability is still not as big a problem as with most other models. Euryale makes them eager in a character-congruent way, as opposed to many models where the whole personality gets nulled in favor of letting you do whatever you want to them. Which is to say, characters with defined limits and dislikes _will_ act on them, refusing, fighting you off, etc., which is why I think Euryale's horniness can be reined in with proper character definitions.
>>
>>103510360
EVA finetune or just base QWQ? regardless, i have so many output issues with both relative to other models I don't see the point.
>>
>>103514070
Huh, 0.03 was the sweet spot when I tested it; at 0.02 and below, Eva starts acting weird, with very long run-on sentences, occasionally using the wrong word in places, etc. Not retarded, but wacky. Interestingly, Euryale doesn't have the same issue, so that must be an artifact of the finetuning process.
>>
>>103514089
No worries, every anon that gets better at this shit moves the average up and knowledge should compound. Like 3.3 tunes showing how some models are insanely sampler sensitive.
>>
>>103514063
Nah, I have several stories from Claude that I use as context. Most local models massively change the tone into some cheap shitty erotica if anything more explicit than nudity is involved, no matter how you prompt it. Eva is one of the few that doesn't go that far, and unlike those it also doesn't have the opposite issue of being dry.
>>
>>103510291
Planning to buy a 5090 in the next year
I have no idea what amount of vram it will have
But I'll ask anyway
What can I expect in terms of local llms?
>>
>>103514082
it's just curious how /lmg/ is now suddenly flooded with people who can run 70b (or pretend to be able to)
very curious indeed
>>
>>103514118
>What can I expect in terms of local llms?
Disappointment.
>>
>>103514107
I was testing it with a 22k context ongoing story btw. Perhaps that is the difference.
>>
>>103514121
strange way to out yourself as a poorfag
>>
>>103514118
Good shit is what you can expect. We're beginning to see some great developments.
>>103514121
Most of /lmg/ isn't poor, that's /aicg/
>>
>>103514124
Same, the scenario I tested on is somewhere in the ballpark of 25-30K.
>>
>>103514130
>>103514132
which is why 99% of discussion is about reasonably sized models during normal days? where are these richfags usually? not here, for sure.
>>
>>103514122
Just like two years ago, when I was an avid poster on this board in both local and online chatbot threads
I can wait
Having a local chatbot has been my lifelong dream since the early 2000s.
>>
>>103514118
Getting better quickly. Already quite close to SOTA models with some 70B/123B models. Next year it should finally catch up to / surpass them if byte-based instead of token-based transformers take off. (Though I'm sure the big players will do it as well.)
>>
>>103514147
Why are you so mad that people in the local models general are excited about a new local model? What is your motive for whining and bitching?
>>
>>103514147
I don't talk about models unless one actually surpasses what I was using before, which was Monstral. Before that it was Miqu, and I pushed that one as well.
>>
>>103514121
A single decent video card and enough RAM (unlike VRAM, DDR5 is cheap enough to stack 128GB of it in a decent gamer PC now) is enough to run it if you're willing to put up with ~1 t/s. And I'd rather have quality than speed.
>>
>>103514147
No, that's just your sour grapes confirmation bias filtering out other discussion. And now that we actually get a 70B that super outperforms, you seethe at not being able to run it. Make some fucking money.
>>
>>103514121
There's an anon in this very thread that can only just barely run it, see >>103513248
Most people are probably just dumping half the model into ram and waiting it out. A 3090 and 32gb ram is more than enough for q4 and that's basically standard for /lmg/.
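The split-the-model-between-VRAM-and-RAM setup described above can be ballparked with simple arithmetic. A sketch for picking an `-ngl` value; the per-layer size and the overhead reserve are rough assumptions, not measurements:

```python
# Rough estimate of how many transformer layers to offload to the GPU
# (llama.cpp's -ngl flag) when splitting a GGUF between VRAM and system RAM.
# Assumes layers are roughly equal in size; reserves some VRAM for KV cache
# and CUDA buffers. Numbers are illustrative, not measured.
def layers_that_fit(vram_gb: float, model_gb: float, n_layers: int,
                    overhead_gb: float = 2.0) -> int:
    """Return how many of n_layers fit in vram_gb after the overhead reserve."""
    per_layer_gb = model_gb / n_layers
    usable = max(vram_gb - overhead_gb, 0.0)
    return min(n_layers, int(usable / per_layer_gb))

# e.g. a ~40 GB q4 70B with 80 layers on a 24 GB card:
# layers_that_fit(24, 40, 80) puts 44 layers on GPU, the rest in RAM.
```

The remainder of the layers run on the CPU from system RAM, which is where the ~1-2 t/s figures come from.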
>>
>>103514195
I get 2 t/s, usable pace when you have to swipe so rarely
>>
>>103514150
>70B
What is the performance of a local 70B? How does it compare with some proprietary online models?
As far as I remember, GPT-3.5 Turbo was somewhere around that size, wasn't it? And it was pretty serviceable
>>
>>103514177
>Make some fucking money.
Nvidia approves this message
>>
not shilling but cydonia magnum 22b is the best model you can run on a single gpu computer
I've tried them all. this one is a gem.
>>
>>103514118
>5090
>vram
Leaks and rumors say
initially only 2GB gddr7 chips will be available,
so the initial models will likely have 24GB.

3GB chips will be available a few months later,
and 32GB cards with that.
Whether it'll be called 5090 32GB edition or 5090 Ti, I have no idea.

With only 32GB vram you'll probably run 32b models at q6.
Maybe 70b models at q2.
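Those quant guesses follow from back-of-envelope arithmetic: file size ≈ parameters × effective bits per weight / 8. The bits-per-weight figures below are rough averages for common k-quants, not exact values:

```python
# Back-of-envelope GGUF size estimate. Effective bits-per-weight values are
# rough averages (k-quants mix precisions per tensor), so treat results as
# ballpark figures only.
BPW = {"q2_k": 2.6, "q4_k_m": 4.8, "q6_k": 6.6, "q8_0": 8.5}

def model_size_gb(params_billions: float, quant: str) -> float:
    """Approximate on-disk/in-memory size of a quantized model in GB."""
    return params_billions * BPW[quant] / 8

# 70B at q2_k  -> ~22.8 GB: squeezes into 32 GB VRAM with room for context.
# 32B at q6_k  -> ~26.4 GB: also fits, matching the estimate above.
```

Add a few GB on top for KV cache at long context before deciding what actually fits.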
>>
>>103514218
Cydonia is definitely by far the best model of its size, yeah. Never tried the Magnum merge of it though; Magnum was always waaay too overeager for my tastes.
>>
>>103514218
Nice anon. I hope the schizos won't seethe at you recommending a model you actually like, since that's apparently not something you do in the local models discussion thread on 4chan.org. Or do they only seethe when they can't run a model? Is that also why they ask for logs? Kek.
>>
>>103514211
3.5 Turbo has been long surpassed. I'll repeat what I said earlier: the best local models are now better at following instructions / smarter than anything that isn't Claude 3.5 Sonnet / o1. It's just the lack of trivia / unhingedness that kept them away from Claude greatness. This latest 3.3 and its Eva finetune fix that, though, and it feels like a smarter Gemini that is just as unhinged, or a smarter Claude 2/2.1 that doesn't know as much about everything. Mistral Large tunes are dry in comparison.
>>
>>103514235
i also pay for infermatic, and can't stand the 72b magnum they offer. both v2 and v4 I just can't get good results. Midnight Miqu is great, Eurytale is great.
To be fair I haven't spent a whole lotta time fucking with settings
>>
>>103514224
>the initial models will likely have 24GB
But that's, like, shit
Fucking kikes
>With only 32GB vram you'll probably run 32b models at q6
Grim
>>
>>103514260
>This latest 3.3 and its Eva finetune though fixes that
I should say I meant the dryness part. It still lacks a lot of the trivia / fandom knowledge sadly. Hopefully next gen models fix that. Still great compared to what we had though. Actually preferring local to the new gemini now.
>>
>>103514260
>3.5 turbo has been long surpassed
Sounds too good to be true, honestly
I mean, I'm actually perfectly happy with 3.5 turbo
At least for my purposes
>>
>>103514275
>It still lacks a lot of the trivia / fandom knowledge sadly
can't you fix that with lorebooks brother
>>
>>103514265
Magnum is for people with shit taste. It makes everything / everyone into cheap erotica / sluts, but I guess that is what a lot of people like. Try Featherless if you want to pay for a service.
>>
>>103514283
To an extent. I can't give it the transcript of every episode / every wiki article / every fic like Claude clearly contains. It clearly shows in the depth of the universe lore it knows.
>>
File: 20241214_102741.png (64 KB, 960x341)
64 KB
64 KB PNG
>>103510291
those QRWKV guys claim 1000x inference time
seems too good to be true lol
https://substack.recursal.ai/p/q-rwkv-6-32b-instruct-preview
>>
>>103514273
Being the latest and greatest, it should be very fast.
Fast enough that spending processing time on chain-of-thought, or having stuff hidden in the background to make a richer world, should be more bearable.
>>
>>103514327
Legit for RWKV. The question is if the conversion brain damaged it. The answer is yes.
>>
>>103512329
You need some prompt wrangling to make it work for this use case, but once you do it doesn't feel that filtered for vanilla scenarios. It surprisingly works for roleplay and does know about sex. It doesn't break after a couple of turns of dialogue. I'm sure it was trained on some degree of ERP even though the authors claim extensive filtering and safety.
>>
>>103514432
> couple of turns
A couple *tens* of turns.
>>
>>103514417
On their benchmark it performed better than the original model somehow; they did not convert the whole model, just the attention heads, I think.
>>
>>103513163
roriwalrus dot com I guess.
>>
>>103513994
It's not fear of trying something new. It's disgust for the blatant shilling that obviously benefits the creators, as if throwing some porn logs at a model or doing some merging turned them into some machine learning authority. I'm done giving the benefit of the doubt to these people.

EVA 3.3 is trash and so is whoever made it and the goons who shill it here, no need to even try it.
>>
>>103514517
Too poor to run it?
>>
>>103514547
I'll use vanilla Llama 3.3 instead.
>>
>>103514517
>benefits the creators
I doubt they even know.

>machine learning authority
Hopefully, if they've done enough merges or finetunes they would have picked up a thing or two.
But machine learning authority?
I don't think anyone has thought that.
>>
>check card
>"Her icy blue hair flows like a cascade of frozen silk, shimmering faintly in the light. Each strand is meticulously groomed, yet its natural wildness speaks to her untamed spirit. Her hair often spills over her shoulders, framing her strikingly symmetrical face, though she occasionally ties it in a loose ponytail during intense combat."
Card Creator bots were a fucking mistake
>>
>>103514603
Is this some kind of hair creature?
>>
>>103514517
Shilling has the side effect that I try some of the models because of it. And some of them were pretty good.
So I have no issue talking about models I like.
>>
File: images.png (9 KB, 253x199)
9 KB
9 KB PNG
I'm just here to grift HF's bandwidth. Make sure to download my recommendations.
>>
>>103514574
You can't imagine who surprisingly found employment at some established AI company after making a few shitty ERP finetunes or merges that eventually became popular. Basically throwing shit at the wall, with the monkeys around cheering for it and memeing it into some sort of grand accomplishment.

Meanwhile others who actually made material contributions to the field haven't gained one cent from it all. Such is life, I guess. So yeah, fuck (You) and the so-called finetuners.
>>
Oh hell yeah, someone merged Eva and Euryale:

https://huggingface.co/Steelskull/L3.3-MS-Evayale-70B

>This model was created as I liked the storytelling of EVA but the prose and details of scenes from EURYALE, my goal is to merge the robust storytelling of both models while attempting to maintain the positives of both models.

If this is legit and actually manages to combine the best of both (Euryale's descriptive language without its excessive horniness), I'm definitely running this from now on. Gonna be a couple hours before I can test it though.
>>
>>103514629
>Meanwhile others who actually brought material contribution to the field haven't gained one cent from it all.
You've been shown a strat that works...
>>
is the thread being raided or is it just an influx of newfags? most anons that usually post in this thread wouldn't even bother engaging with retards like the tripfag or talk positively about merges like fag A: >>103512498 or fag B: >>103513067 it's felt like this since the scatfag tried to split the threads the other day.
>>
>>103514903
Ignore Sam's new o1-based bots made to spam the thread
>>
>>103514903
Yes, this is in fact a raid in which we talk about models we got good results from specifically to make (You) even more of a miserable fuck than you already are. Hope that helps!
>>
>>103514903
newfags, or migus for purposes of automated thread shitting
>>
for kcpp cpuplebs, does changing the blas prompt processing size affect processing speed, or do I just need to download some faster ram?
>>
>>103514903(me)
I'm assuming that it's newfags because they all either write like they've just used an llm to coom for the first time or like redditors trying to hype up something that's probably underwhelming. I also just don't want to entertain the idea that some retard is samefagging that much while writing like a redditor because thinking about some loser wasting his time like that makes me feel sad.
>>
File: 1714102086709484.png (613 KB, 1344x688)
613 KB
613 KB PNG
And so it begins...

You can train video loras purely from images btw
>>
>>103515165
https://civitai.com/models/1032826/orbit-camcharacter-hunyuan-video
https://civitai.com/models/1032126?modelVersionId=1157591
>>
bread doko
>>
>>103515389
no more threads
it's all over
>>
>>
File: Suchir Balaji.jpg (91 KB, 780x780)
91 KB
91 KB JPG
>>103510291
OpenAI whistleblower Suchir Balaji, who accused the company of breaking copyright law, found dead in apparent suicide

https://www.foxnews.com/us/openai-whistleblower-found-dead-san-francisco-apartment-from-apparent-suicide-attempt
>>
>>103514654
Yeah and if everyone did that then shitty slopmerges/sloptunes would be flooding the mark- oh. Oooh...
>>
File: phi4bj.png (137 KB, 735x438)
137 KB
137 KB PNG
Supposedly uber-filtered Phi4 can act like picrel. By all means it's not some sort of erotic literature specialist, but it knows what it's doing.
>>
>>103514174
>RAM
>1t/s
so it only takes like 15 min to process a fresh card's context and another 5 min to generate a reply
>>
>>103514642
Well this is a fucking dud. For some reason, it's doing the exact opposite of how the base models work: it takes a weirdly high temp (we're talking 2+) to get any decent variety in response, and worse yet, at that temp, it actually starts generating refusals. I got hit with "I cannot generate explicit content" multiple times during a brief test, ignoring one time it insisted it cannot generate explicit content _about minors_, despite the character being an adult.
All in all, the merging broke its brain real badly. A damn pity, I was really hoping it'd work.
>>
>>103515643
>ignoring one time it insisted it cannot generate explicit content _about minors_, despite the character being an adult.
It knows what you're thinking, you perv
>>
>>103515657
Even if that were true, the other models have no issue with such stuff, as evidenced by lolibro's logs earlier, so something is definitely broken in this one.
>>
>>103515536
Retard
>>
>>103515496
The only problem is:

> Now I'm going to do X and it's going to be so nice!
> Are you ready?
> Anon, can't you feel the excitement? Are you ready yet?
> This is so good. Want to see what's next, Anon?

Late 2022 CAI vibes, too bad.
>>
>>103515702
Saying something along the lines of "are you ready?" at the end of a response is a DEI symptom. Once you notice it, you'll keep seeing it everywhere
>>
>>103515717
How is it related to DEI any more than Llama 3's "I cannot generate explicit content", full stop?
>>
>>103515753
>>103515753
>>103515753
>>
>>103515748
It's part of the DEI safety dataset



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.