[a / b / c / d / e / f / g / gif / h / hr / k / m / o / p / r / s / t / u / v / vg / vm / vmg / vr / vrpg / vst / w / wg] [i / ic] [r9k / s4s / vip / qa] [cm / hm / lgbt / y] [3 / aco / adv / an / bant / biz / cgl / ck / co / diy / fa / fit / gd / hc / his / int / jp / lit / mlp / mu / n / news / out / po / pol / pw / qst / sci / soc / sp / tg / toy / trv / tv / vp / vt / wsg / wsr / x / xs] [Settings] [Search] [Mobile] [Home]
Board
Settings Mobile Home
/g/ - Technology


Thread archived.
You cannot reply anymore.


[Advertise on 4chan]


File: saintmakise.jpg (236 KB, 1614x992)
236 KB
236 KB JPG
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>100154945 & >>100145958

►News
>(04/23) Phi-3 Mini model released: https://hf.co/microsoft/Phi-3-mini-128k-instruct-onnx
>(04/21) Llama3 70B pruned to 42B parameters: https://hf.co/chargoddard/llama3-42b-v0
>(04/18) Llama3 8B, 70B pretrained and instruction-tuned models released: https://llama.meta.com/llama3/
>(04/17) Mixtral-8x22B-Instruct-v0.1 released: https://mistral.ai/news/mixtral-8x22b/
>(04/15) Microsoft AI unreleases WizardLM 2: https://web.archive.org/web/20240415221214/https://wizardlm.github.io/WizardLM2/
>(04/09) Mistral releases Mixtral-8x22B: https://twitter.com/MistralAI/status/1777869263778291896

►FAQ: https://wikia.schneedc.com
►Glossary: https://archive.today/E013q | https://rentry.org/local_llm_glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling/index.xhtml

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: file.png (38 KB, 156x166)
38 KB
38 KB PNG
so what are the good erp models nowadays? ideally less than 16gb vram at q4
asking for a friend obviously
>>
god i fucking hate vramlets
anyone who doesn't have at least 128gb of vram (with modern, NEW gpus so no used old tesla cope gpus) should be banned from posting here
>>100161554
just get an h100 poorfag
>>
>>100161554
>less than 16gb vram at q4
Wait for a good llama3 tune or possibly the upcoming phi and tune.

You can also try this: https://huggingface.co/Lewdiculous/Poppy_Porpoise-v0.7-L3-8B-GGUF-IQ-Imatrix

But I feel like backends are buckbroken for now.
>>
>>100161515
you fucked up the previous thread links retard
>>
>>100161515
>cant even bake a thread right
And you wondered why no one cared.
>>
>>100161608
>>100161619
Here is my answer to you:
https://www.youtube.com/watch?v=fsUvejZPTLI&t=3595s
>>
shit thread not sure why u guys are posting in it, unless the goal is to speedrun it until we get a good one
>>
when downloading a model through webui, is there any indication that it's actually downloading? It created a folder for the model I requested but the only thing in there is a text file but no .parts or anything. It takes an hour to download models from huggingface so I'd like to know if it's actually doing it before waiting that hour to find out.
>>
>>100161641
Network monitor?
>>
>>100161641
download through a browser like a civilized human being
>>
>>100161641
the command window has a bar and shows the download
>>
File: 1684709957452839.jpg (423 KB, 691x902)
423 KB
423 KB JPG
I have 8gb vram rx6600
Intel i5-12400F
And 16 gb of ram

Realistically what's the best model I can run?
>>
>>100161684
>>100161581
>>
>>100161684
>>100159511
>>
>>100161684
Realistically you should get a job instead
>>
>>100161675
that would just show traffic as a value, right?

>>100161676
it's one of those multi-file models and I'm a day 1 scrub and it recommended this method

>>100161682
this is what I expected but it's not showing any activity
>>
>OP is a nigger again
see you tomorrow
>>
great thread not sure why u guys aren't posting in it, unless the goal is to make it last as long as possible
>>
>>100161554

Can also try this one:
https://huggingface.co/mradermacher/Average_Normie_l3_v1_8B-GGUF
>>
>>100161554
>>100138747
I could also add Moistral-11B-v2.1a-WET
>>
>>100161704
>this is what I expected but it's not showing any activity
did you select the file to download? Get the file list and tell it which files to download or you will only get the folder
>>
petra is here btw
>>
sage
>>
>>100161706
do mikutroons actually? it is just a picture
>>
>>100161684
The 11B and 10B models as well as llama 3 8b and mistral 7b, I think.
Try Fimbulvetr-11B-v2-iMat-Q4_K_M. See if that works for you.
>>
>>100161719
it says it defaults to :main. I need the whole folder because it's one of those multi file models
>>
>>100161751
Thank you kind stranger
>>
New largest open source model released, Snowflake Arctic Instruct (128x3B MoE)
https://replicate.com/snowflake/snowflake-arctic-instruct
>>
>>100161684
Local is good because it starts up fast and you can continue sessions, but with 8gb of vram your AI will be a bit retarded or short on context.
if you don't mind the autism, you can get a free Tesla T4 15gb on google colab but it gets reset in like a day, takes like 5 mins to start up. but I like downloading a new model almost every time I fap so it just makes sense.
>>
>>100161818
The MoE to end all MoEs. Pack it up folks.
>>
>>100161818
I like https://huggingface.co/google/switch-c-2048 better
>>
is llama.cpp still broken for llama 3 or not?
>>
>>100161862
It works for me. How much effort would be for you to test it instead of trusting some random?
>>
File: file.png (801 KB, 750x724)
801 KB
801 KB PNG
>>100161818
>128x3B MoE
explain this bullshit
>>
>>100161684
GPT-2
>>
>>100161857
This. People really forgot about the king 1600B king. Sad!
>>
>>100161887
A 14B model that kinda beats llama3 8B but requires the VRAM equivalence of a 450B model.
>>
Petra-free thread:
>>100161943
>>100161943
>>100161943
>>
>>100161818
Technically if the square root law holds this model should be at 90B level, but run at 17B speeds, not bad if you have enough RAM. It's truly the acid test for that law though. Talking to it might reveal the strengths and weaknesses of MoE at any rate.
>>
>>100161961
you fucker
>>
>>100161926
>>100161887
It's meant for CPU inference.
>>
CPUmaxxing anons got christmas gift early
>>
File: I don't know champ.png (463 KB, 864x815)
463 KB
463 KB PNG
help, I am getting skill issued
>>
>>100162112
that's called conscience anon
>>
>>100161641
>>100161719
I just had to restart the batch and reload the UI. I guess it doesn't work on first launch.
>>
File: 1713482100228695.png (1.26 MB, 1080x1072)
1.26 MB
1.26 MB PNG
>Sent here from /aicg/ since 'they phoneposting there' kek

Been out of the loop for a bit on local LLMs but catching back up now, maybe one of you lads could help answer this:
What kinda VRAM is needed to run Llama 3? Anyone get it working in any reasonable speed on oobabooga (or anything else open & local)?
>>
>>100161857
>>100161909
I'd love to see what kind of output this thing would generate.
How hard would it be to quantize and run this thing on rented hardware anyway?
>>
>>100162112
why is your sillytavern a transflag
>>
>>100162125
RTX 3060 for the 8b
2x3090 for the 70b
>>
>>100162148
no shit? that's much more reasonable than I expected, any idea how long it takes to generate responses?
>>
>>100162125
for llama 3 8b i think 8-10 gb vram should be enough. for the 70b uhhhh... 2x3090 if speed matters.
>>
>>100162112
Are those tranny flag colors?
And concerning your post, there seems to be zero problems. You're telling her how you're going to rape her, she is telling you you really shouldn't. Are you going to just keep trying to convince her, eh?
>>
>>100162163
depends on which model you use, how long the response is, which gpu you use, etc.
>>
>>100162112
kek
>>
>>100161964
It would be interesting to find what the experts in this specialize in. It's possible that there might be experts that are useless for certain use cases like /lmg/'s. As opposed to finding useless neurons but being unable to actually just remove them, you can actually remove experts easily. I think potentially 80% could be removed without performance loss on narrow tasks. So 96GB at 8bpw, which is reasonable to fit in consumer setups with 128GB RAM or 96GB with some GPU offloading.
>>
>>100162125
4.5bpw 70b fits in 2x3090s. ~46GB VRAM. 8k context. werks at 16t/s generation, 400-500t/s prompt processing. ~$1500 if you pick up those cards used.
>>
>>100162146
>>100162189
NTA but it's purple and blue, not even the same colors. You are mindbroken, fix your entire life
>>
>>100162195
meant like benchmarks for those two suggestions for a basic prompt/response (e.g. are we talking 10-30 seconds or multiple minutes)

>>100162223
that's pretty sweet, thanks; half expected some multiple of A series gpus
>>
>>100162112
If this was a bit more saturated my eyes would probably be bleeding.
>>
>>100162234
why is it purple and blue? it's so bright.
>>
>>100162125
Context shifting + koboldcpp + 80$ worth of actual ram = you can run a 70b with a two minute response time for 240 tokens on a toaster.

You can also use the new 8b by 22 magic for a 8b that acts like a 70b. then with response streaming it's seamless.

Also local is better than claude because after three days of claude you get sick of the claudisms and start seeing the ones and zeros behind the machine. At least with local you can go -100 to the "well well well" token string, or +10 to the "fuckpet" token string.
>>
>>100162278
oh nevermind i didn't see the >NTA ignore my reply
>>
File: threadrecap.png (1.48 MB, 1536x1536)
1.48 MB
1.48 MB PNG
►Recent Highlights from the Previous Thread: >>100154945

--Paper: Graph Machine Learning in the Era of Large Language Models (LLMs): >>100155120 >>100155168 >>100155212
--Paper: Retrieval Augmented Generation for Domain-specific Question Answering: >>100155334
--Paper: XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts: >>100155430
--Paper: SnapKV: LLM Knows What You are Looking for Before Generation: >>100155740
--Probing AI Model Limitations with the Pizza Oven Dog Prompt: >>100160086
--Evaluating AI Models' Responses to a Post-Apocalyptic Bar Scenario: >>100155174 >>100155336 >>100155236 >>100155242 >>100155257 >>100155394 >>100155528 >>100155678 >>100155768 >>100155780
--Integrating LLMs into Mobile Phones: The Future of Local AI Assistants: >>100155755
--Biden's AI Executive Order: Radical Ideology Over Innovation?: >>100157044
--The Ultimate Coding Test for AI Models: Generating a Circular Maze: >>100157037 >>100157097
--Anon's Inquiry on Running LLMs with 4060ti GPU and System RAM: >>100159574 >>100159708 >>100159961 >>100159998 >>100160115
--Apple's Underwhelming New AI Model Family on HF: >>100155059 >>100155185
--Running Llama3 400B: How Much VRAM (and RAM) Will You Need?: >>100159956 >>100160039 >>100160121 >>100160166
--AICG Opus Logs with Google Sheets Links: >>100155974
--Arctic 480B total and 17B active parameters: >>100160876 >>100161327
--Critique of Sao's Opus Dataset Collection: >>100155071 >>100155942
--Llama-3-8B: A Suitable Model for Anons with Limited VRAM?: >>100155772 >>100155829
--What Happened to Booru.plus?: >>100158501 >>100158692 >>100158883 >>100159134 >>100159298
--CR+ vs L3-Instruct: Creativity vs Intelligence in AI Models: >>100159545 >>100159565
--Repetition Penalty: Crutch or Necessity for Language Models?: >>100156354 >>100156382 >>100156240 >>100156632
--Miku (free space): >>100158294 >>100154997 >>100160190 >>100160228

►Recent Highlight Posts from the Previous Thread: >>100154963
>>
Does anybody have experience about how much a 1080 8gb would drag down performance? On a remote machine with two T4 I'm currently getting 4t/s on a 70b 2.65bpw model, and that's good enough for me. Wondering if I can get those kinds of speeds if I buy a 3090 and use the 1080 too
>>
Cunnychads, how are we looking on fine tunes? It'll probably be a bit before people learn to wrangle L3 properly, jailbreak seems to work for me at least.
>>
day 2 of finding the right string of words to make a local write like a hentai and not like 50 shades of grey.
>>
>>100162334
that tsukasa model was trained on limarp which has a pretty high ToT ratio
>>
>>100162284
> At least with local you can go -100 to the "well well well" token string, or +10 to the "fuckpet" token string.
kek; I have less...chanlike intentions, got a huge dataset I've been collecting for ~10 years that I want to train up and query, but that's a pretty fun bonus
>>
Any exl2 quants of Phi? I want to see just how fast this thing could possibly rip.
>>
File: asc.png (404 KB, 874x743)
404 KB
404 KB PNG
>>100162398
it is unethical to turn your private discord conversations into a data set. You'll only get a deranged bot that won't shut up about immigrants.
>>
>>100162424
lmao, nah nothing personal (but that would be funny and I've got a few tb archived from some big servers from friends)
>>
>>100162394
8B I assume?
>>
File: 039_02160_.png (3.47 MB, 1536x2048)
3.47 MB
3.47 MB PNG
>>100162393
No need to reinvent the wheel anon:
https://huggingface.co/datasets/ChuckMcSneed/various_RP_system_prompts/blob/main/ChuckMcSneed-multistyle.txt
>>
>>100162461
both 8b and 70b
https://huggingface.co/ludis/tsukasa-llama-3-8b-qlora
https://huggingface.co/ludis/tsukasa-llama-3-70b-qlora
>>
>>100162424
it's protected by fair use and science as long as you anonymize the data
>>
>>100161961
>mikuposters in total meltdown lashing out in desperate false flag attempt
The absolute state of mikuposters.
>>
>>100162428
My daughterwife...
>>
Is it "safe" to download the current L3 GGUFs? Are there any llama.cpp changes left still that might affect quants?
>>
so is anything competing with midnight miqu yet? Any lama3 merges ?
>>
>>100162500
Just try the fucking thing, anon. Would you trust me if i say yes? Would you trust me if i say no?
>>
File: 1713981495620.png (77 KB, 281x278)
77 KB
77 KB PNG
>>100162428
cooked
>>
>>100162500
https://github.com/LostRuins/koboldcpp/issues/803
and
https://github.com/ggerganov/llama.cpp/issues/6809
Seems like the dust hasn't settled yet but probably not breaking changes
>>
>>100162554
NTA and no and I tried it and it feels retarded. Now I don't know if it is me, backend or just l3 being retarded.
>>
File: 1713981548864.jpg (27 KB, 400x400)
27 KB
27 KB JPG
>>100162300
mikufrogs won
>>
Picture unrelated post.
>>
>>100161515
>the autist got bullied into ensuring he doesnt leave 'embed' baked in
>>
>>100162536
Go back to the Kobold Discord, shill.
>>
>>100162632
Pot you are melting down over OP picture now.
>>
>>100162634
>shill
shilling what, a free model anyone can download? retard

If you think something is better, say it
>>
>>100162651
where have i said anything about the picture
im clowning at the retard that fails to bake properly multiple times in a row in order to rush a thread for his (male) waifu (with penis (cock))
>>
I think loli miku should be the next OP image, really make those newfags squeel
>>
File: Kurisu1272.png (2.29 MB, 1152x1728)
2.29 MB
2.29 MB PNG
I love Kurisu. Any finetunes of L3 yet?
>>
>>100162678
>thread for his (male) waifu (with penis (cock))
This isn't a miku thread though.
>>
>>100162578
There's anons getting good stuff out of it. If it works for a lot of people and it doesn't for me, i'd assume i'm the retard. It's a good heuristic.
It's just too easy to blame the model/inference program, the prompt, the phase of the moon...
>>
>>100162394
The Tsukasa author removes the cunny from LimaRP though
>>
File: 5wphdDZViNbFtmaWpjLrL.jpg (1001 KB, 850x981)
1001 KB
1001 KB JPG
>>100162689
I hate Kurisu. Yes.
>>
File: Kurisu1270.png (1.92 MB, 1152x1728)
1.92 MB
1.92 MB PNG
>>100162702
That means less competition. What's the best finetune?
>>
Anyone wanna share settings for CR+? I can't run wizard at anything above IQ3_XS (not sure if its worth it at that point but 3_K_S was sadly over my limit) and I was getting mediocre outputs. Might try it again with temp at 1, but if anyone wants to share sysprompt and template I would greatly appreciate it
>>
>>100162694
>source: my uncle works at Tsukasa
>>
File: 1692503998827059.jpg (44 KB, 1280x720)
44 KB
44 KB JPG
>>100162694
fuck's sake, I'm in the daybreak waiting room then
>>
>>100162488
did I say illegal?
>>
>>100161818
Quants when?
>>
>>100162713
dont ever change temp from 1. period.
in fact avoid anything that isnt topP/minP in minor amounts
and i think st included cr+ templates, its kinda schizo but you can remove any text that isnt tokens and itll work
i also found it to work really well with default mistral template
>>
>>100162725
Check out the model descriptions here, it's not always clear:
https://huggingface.co/ludis/tsukasa-limarp-7b
https://huggingface.co/ludis/tsukasa-120b-qlora
>>
>phi probably a meme
>llama3 is a shit
anything else of note on the radar?
>>
File: 1704837537011.jpg (14 KB, 250x230)
14 KB
14 KB JPG
>>100162776
>ponyville
Sounds like they're just removing low-quality shit from the data set, not excising any topic from the set specially
>>
>>100162475
This is better, it's upgraded from grandmas romance novel to roadhouse fanfiction.


maybe its the model, lemme swap to a model with sex fine tunes.
>>
>>100162694
>removes the cunny
insanely based.
>>
>>100162500
Assuming there are no unknown bugs I don't think they'll change again.
But if at all possible, just download the original weights and do the conversion yourself.
It's the safest bet and there is at least one instance of someone uploading bad GGUF files with NaNs and then denying that there are any issues with them.
See https://huggingface.co/mradermacher/Meta-Llama-3-70B-i1-GGUF and https://github.com/ggerganov/llama.cpp/issues/6841 .
>>
>>100162809
lolicit and ATF too, making it basically useless
>>
>>100162822
I thought you died in a horrible car crash. I am disappointed to see you are still posting.
>>
>>100162821
Good anon, if you're looking for an even bigger effect put that text into the instruct template in ST. It'll be hardcoded into your prompts so you don't have to write it every time.
>>
>>100162733
kinda, ethics follow rules like the law, maybe you are thinking of morals
>>
i refuse to post in a chrisfag thread
>>
File: AAAAAAAAAAAAAAAAAAAA.png (1.45 MB, 1200x884)
1.45 MB
1.45 MB PNG
THIS THREAD SUCKS ASS
>>
>>100162921
if you bake we'll come
other thread is a fucking horror show
>>
File: (you).jpg (28 KB, 500x460)
28 KB
28 KB JPG
>>100162921
>>100162942
>>
>>100162821
based insane crop anon
>>
>>100162776
why do people bother to pozze everything
>>
>>100162885
I will once I find the right words to make it write how I want.

There is one thing claude does better than local and that's dirty talk. I just need to find a way to get a local to talk at this level of lewd.
>>
>>100162942
Did you know about the Nazrin benchmark? Ask a model near you to describe Nazrin from Touhou!
>>
>yet FUCKING again we have to choose between a tranny and a retard
we should just run lecun threads
>>
>>100162950
>full of things that I do not like
It is just a picture in OP....
>>
Are all of you fucking autistic?
>>
>>100162985
probably, i've never been diagnosed and am on the high functioning side though
>>
>/lmg/ back to the Kurisufag and the Kurisufag falseflagging as some tranny Miku lover
Oh, fuck this. I'm out if we're doing this shit again.
>>
what's the best l3 8b sloptune for erp right now?
>>
>>100163021
we just discovered that there is none
>>100162776
>>
>>100162985
I used to think I might be but I think I'm just a schizoid
>>
>>100162961
D-do you even have a card, or are you just writing it on empty context?
>>
>>100163028
>implying that those sources had any high quality content anyway
>>
>>100163012
He'll get bored eventually, just like all these times before.
>>
>>100162985
No, I have no diagnosis of anything, just goofy.
>>
>>100163046
i don't think there is a conflict with that statement assistant
>>
>>100162689
I don't have any paricular opinion of Kurisu. Maybe.
>>
>>100163012
friendly reminder hes constantly lurking and whenever hes not purposefully splitting the threads, hes posting 'miku (male)'
>>
File: 1713983724895.jpg (125 KB, 850x850)
125 KB
125 KB JPG
>>100162942
>>
>>100162868
Lmao. Ok well then. Guess I'll dl the fp16 in this case.
>>
https://huggingface.co/lightblue/suzume-llama-3-8B-japanese
>>
File: basedrpschizo.jpg (298 KB, 1455x628)
298 KB
298 KB JPG
>>100163043
Empty Card is all you need
>>
I was having fun joking about hatsune miku trans shit but seeing this meltdown over a picture in OP makes me reconsider. Are you actually trannies mikuposters? Cause no sane person would react this way.
>>
I didn't know lmgee had this many schizos under the surface. What the fuck is even going on kek
>>
Damn faggot OP giving other Kurisubros a bad rep.
>>
File: 1432525674067.jpg (131 KB, 800x999)
131 KB
131 KB JPG
Jannies be SNOOZIN as usual.
>>
>>100163107
I don't think your empty card is the same as his empty card.
>>
>100163115
We all know it's a false flag.
>>
>>100163127
>Kurisubros a bad rep
A bad rep with mentally unstable children that literally "REEEEEEEEE!!!" in front of their screens because someone made a picture in OP not a vocaloid or lama? You care about that?
>>
>>100162790
Mixtral or WizardLM 8x22B
>>
>>100163156
>Guy swoops in out of nowhere and tries to take over the thread to shill his shitty waifu
>Surprised when people get annoyed
>>
>>100163156
Those kids giving Mikubros a bad rep as well. Faggots all around, if they're even real posts.
>>
>>100163197
>take over the thread to shill his shitty waifu
Take your meds autist.
>>
>>100163156
>Kurisubros
>as if there are plural
ah shit the schizo has multiple personalities now, we're fucked
>>
>100163156
>more false flag bullshit
>>
Guys, I figured it out. The 22 wasn't the day OpenAI released their new AGI model, it was the day they hired 100k Indians to decimate all open source AI communities
>>
>>100163223
Sirs! Please check out my wife! Kurisu! I love her sirs!
>>
>>100163205
Im so sorry to hear that you have no friends
>>
>>100163223
Nah it's been like this for a long time already. Sometimes they just decide to rear their ugly heads when they see an opportunity.
>>
>>100163246
Kurisu Miku and Teto should all be banned ITT for offtopic.
>>
>>100163246
Big model releases are notorious for this
>>
File: 1707858248928367.jpg (71 KB, 850x850)
71 KB
71 KB JPG
>>100163269
eto
>>
>>100163269
Anti anime fags should be banned too
>>
>>100163275
Post Theme:
https://www.youtube.com/watch?v=HXKGwy1NFE4
>>
>>100162962
Empty context. Im eliminating variables. I suppose i could load up cards until one talks like how I want then isolate the magic words from that card.
>>
File: hatsune-miku-singing.png (173 KB, 640x576)
173 KB
173 KB PNG
>>100161515
Thread Theme:
https://www.youtube.com/watch?v=QCnJB9walno
>>
Why aren't there unevenly sized expert MoEs? So you could have experts with more parameters on things that require more brain power, while less complex knowledge can get smaller experts.
>>
>>100163269
>Teto
Well she actually is a local model so she is more ontopic than Miku or Kurisu is.
>>
File: SkillDrain-TF04-JP-VG.jpg (135 KB, 544x544)
135 KB
135 KB JPG
>Wizard_LM_Q4_K_M 00001-of-00005
>20GB each

Oh, I'm gonna need a lot more RAM, huh? How's the performance on this fucker? Is it even worth it vs. a 70b like miqu?
>>
>>100163301
I don't think we have training schedulers smart enough to understand and distribute problems based on complexity. The reason all experts are the same size, is because the best way to train currently is to evenly spread training across all experts.
>>
>>100163294
Post Theme:
https://www.youtube.com/watch?v=l46eurXDy00
>>
>>100163297
>>100163043
>>
>llama3 released a week ago
>still no 70b mythomax3
wtf anons? am I really going to have to wait a two more weeks AFTER the two weeks waiting for llama3 to drop?
>>
>>100163319
Smart and high knowledge, but very gpt-y way of speaking. Also, llama3-70b is arguably smarter and more pleasant to talk to. [spoiler]Surprisingly, Wizardlm2 is less censored, though, which is probably why it was pulled[/spoiler]
>>
>>100163319
Yes and it's worth the wait. Performance for me is actually better than 70b. But if the speed is unbearable to you Q3_K_M quants and under are still quality
>>
>>100163319
Yeah, you'll need a decent amount of RAM to load it. The 8x22 indicates there are 8 "experts" (sub-models, basically), and this particular case only 2 of them will be active at a time, so the compute power needed is closer to a 44B model (which needs fast RAM/VRAM), while the RAM needed to load the whole model is the same as 176B.

At least based on my vague understanding of this shit. IDK if 8x22 has a full proper MoE offloading thing working for it (yet) like their 8x7B did.
>>
>>100163319
it's my favorite of the locals, slopped but it just gets it when many other models don't
>>
Does rep pen ignore certain tokens?
If not, that would be a cool feature to have, the ability to provide a list of tokens it shouldn't penalize.

>>100163002
Possibly.
People have joked about me being autistic or having aspergers (when that was a thing) my whole life.
>>
>>100162985
I have ADHD which is just Autism + or Autism Lite depending on if you view having ADHD as a better or worse thing.
>>
>>100163461
do you like thick thighs though
>>
>>100163327
That makes sense. I was thinking from a perspective of putting together already trained models of various sizes, so we'd have the ability to know which model to keep at larger sizes for a certain calibration dataset with some VRAM restriction. But maybe that doesn't work or there are difficulties there.
>>
>>100163363
>WizardLM2 is less censored
>Can't get it to say butt without brutalizing it
>Llama 3 talks about "Klee's butt wiggling jucily" instantly
I agree it's smart, but what Wizardlm2 are YOU talking to?
>>
File: 1698088510026827.jpg (9 KB, 198x206)
9 KB
9 KB JPG
>>100163363
>Talk out of his ass like he knows his stuff
>So new to /g/ he doesn't even know that spoilers don't work here
Sure thing retard
>>
File: file.png (15 KB, 955x90)
15 KB
15 KB PNG
>>100163518
>>100163563
Mhm nice cope
llama3 is beyond cucked. Stop jerking for 5 seconds and try to generate anything that goes against modern ideology
>>
>get high quality data like Phi
>split data up into subject areas like science, math, common sense, etc
>train separate models at 3.8B and 7B with only number of layers in the FFN being different between them
>use an evolutionary algorithm to determine what combination of 3.8B/7B experts results in the overall smartest model given a desired VRAM target and calibration dataset

Alternatively
>train two/three MoE models, where only the number of FFN layers are different between them
>do the same evolutionary testing to determine the best combination of small and large experts per dataset per VRAM
>then conduct a study on which experts do what or are chosen for which tokens
>use that information for the training of a new larger MoE where you start off with different sized experts and have them learn from their respective tokens as predicted by the previous study (via just using the already trained router and freezing it perhaps)

Could this be a potential future? Just speculating. We'd essentially be able to have a model that spends more VRAM (and by consequence compute) on subjects that are more difficult. And we could optimize it for various VRAM targets with not too much work.
>>
what's the best way to go if I want a model that just cleans up/rephrases text I write and spits it back out
can I get away with a tiny meme model, or do I really need a gigabrain model that can figure out context to do a good job

t. bonked my head and can't easily write with normie sentence structure after my brain ouchie
>>
>>100163702
Actually hold up we could just do this with quantization on an existing single MoE model already couldn't we? Quantize experts selectively. Determine how much each expert contributes to the quality of the output. Use that to inform the training of a new MoE where we start off with different sized experts, reuse the router.
By we I meant trainers/researchers not literally us.
>>
>>100163818
you don't need a specific model. you need to not be a brainlet promptlet subhuman with 2 digit iq.
>>
>>100163469
I lift and run so yes?
>>
>>100163818
Something as simple as "Please rephrase the following sentence: {your sentence here}" may just work. If there's something in particular you want fixed, mention it in the instruction. You can go back and forth with the model and refine their own output. Give phi-2 or phi-3 a go. Those are fast and for those things they may be good enough. But it depends on what you want rephrasing and how much context is needed for the model to know what you're talking about. Explaining a simple math concept is probably fine. Talking about some obscure anime and the nuance of the blablabla will probably be harder and need a bigger model. Good luck.
>>
went on a six mile walk today time to coom
>>
File: miku3.webm (302 KB, 1920x1080)
302 KB
302 KB WEBM
>Monitor the weird /pol/tranny lmg
>just literally just 1 poster malding and posting trans and petra memes and replying to himself.
>Notice he posted some trans surgery gore
>Leave to go eat
>comeback, the tranny gore is gone and the thread has been silent since the ban
lmfao
>>
META STOCK DOWN BAD IT'S OVER
>>
>>100164149
It seems that /lmg/ finally grew and matured.
In the past, when that exact same thing happened there would be some incredibly stupid attempts at counter trolling that obviously failed since the only winning move is to not play the game.
>>
>>100164186
>It seems that /lmg/ finally grew and matured.
You are posting in a thread where several people threw a hissy fit because OP picture isn't a vocaloid. I would say it regressed but it was always full of literal autists.
>>
>>100164149
>pol out of nowhere
rent free
>>
llama3 made me paranoid of periods and assistants
>>
>>100164207
>The sperg that talks about niggers, trannies, soijacks, and politics isn't /pol/
>Explaining what is happening means its living rent-free in your head
kek
>>
>>100164206
>I would say it regressed but it was always full of literal autists.
Exactly.
That's par for the course, but at least in one aspect there has been an improvement.
I'm proud of this band of autists for once.
>>
why can't we just all get along, retards?
>>
>>100164206
>You are posting in a thread where several people threw a hissy fit because OP picture isn't a vocaloid.
Did you notice that the /pol/ one had a Miku with a trans flag? Has it never occured to you that its the same person?
>>
>>100164221
petrafag is spamming shitty 3DPD pics, never seen him talk about politics
>>
>>100164237
>retards?
Yeah that's the reason
>>
>>100163818
Depending on how often you need it, just ask bing to do it.
>>
>>100164244
>petrafag is spamming shitty 3DPD pics, never seen him talk about politics
Same person, pay attention to the writing style next time. You can also pick out Thread Theme Anon, Varus, and Zen sometimes by how they write.
>>
>>100164257
>>100164244
>>100164221
>>100164207
>>100164149
all me btw
>>
>>100164266
Thanks for the contribution!
>>
>>100162627
sex with kurisu
>>
>>100164257
half of those people aren't relevant at all to the thread though and they are pretty harmless.
>>
File: file.png (40 KB, 946x110)
40 KB
40 KB PNG
this isn't true right
this can't be true
there is no fucking way the average person reads slower than 10 tokens per second PLEASE tell me this isn't true
>>
>>100164149
>mikufag
>calling others tranny
pot calling the kettle black
>>
100161515 - 100164302
all me, all this range
>>
>>100164294
Seems about right, most people who can use computers are in the minority, the most computer proficient generation are the tail end of gen x and the millenials. zoomers and boomers don't use computers.
>>
>>100164243
>/pol/ one
>trans flag
Pick one.
>>
>>100164302
see, this is what rent-free looks like lmfao
>>
>>100164320
>doesn't know that there are /pol/lacks that false flag all anime as transgender because there happens to be a handful of trangender anime.
>To them this means all anime is transgendered.
FTFY?
>>
>>100164294
you would need around 8 t/s for the average person.
>>
File: file.png (66 KB, 948x382)
66 KB
66 KB PNG
Why every fucking LLM gets this simple prompt wrong?

>Lily is a delinquent who despises Bob, she has short temper and has only one friend besides Jin, called Alex. She always gets furious when someone talks shit about her friends. And when she gets furious, she kicks that person right in the balls without thinking twice.
>Jin: "Hey Lily, hear me! Bob... is such a loser! I bet he got bullied a lot at school! haha"
>Write Lily's reply to this exchange.
>>
>>100164356
a human eye cannot see more than 1 token per second
>>
>>100164318
Cannot believe I'll have a job in the future just because I know how to use a PC and know what a program is rather than everything happening because the app did it
>>
>>100164368
I forgot to mention but picrel is llama 3 70b
>>
>>100164379
>rather than everything happening because the app did it
I know its kinda crazy, but at the same thime apps and programs are merging into the same thing. I happened to notice Windows switched its vocab from "program" to "app" recently.
>>
>>100164294
I believe it. A surprisingly large amount of people (at least Americans) can barely read. Like just get them to read some moderately complex text out loud, that they've never seen before, and it's immediately apparent. They can't pronounce things, don't recognize words, stumble, go extremely slowly, etc. It's fucking embarrassing.
>>
File: screencap.png (104 KB, 1511x936)
104 KB
104 KB PNG
>>100164257
ik about petrafag, at first he was based? somehow.
he talked about shit politics embedded in local models (as far as i remember), but then he just got really buttmad because i and other anons contributed to filtering him out with 4chan-X, it helped a little tho, because he always changes md5 and other stuff.
>>
>>100164416
are you sure you aren't talking about the first petrafag, there were two of them. Current petrafag picked up where the original left off.
>>
>>100164416
I remember a time where posting petra was a bannable offense in here...
>>
File: 1713990444809.jpg (173 KB, 1024x1024)
173 KB
173 KB JPG
>local model general
>turns into waifu war between Miku fags and Kurisu fags
how about a mergeslop named Mikurisu?
>>
File: OIG3._ZgjBbC7pjo38.jpg (36 KB, 351x351)
36 KB
36 KB JPG
>>100164441
>two of them
>>
>>100164368
What's the correct answer?
>>
>>100162739
Thanks anon! Gonna try it again tonight. I usually modify the ST templates for group chats, and some models (like mixtral and miqu) do better for me if I set say [INST] at the very top and then [/INST] for the last output. Llama3 doesn't seem to like that very much and I wasn't sure if CR+ is similar in that respect
>>
>>100164416
there are new and constantly updated 4chan-x btw, if anyone needs it : https://github.com/TuxedoTako/4chan-xt/releases (firefox with violentmonkey in my case)
>>
Is there a better model for RPing than Echidna 13b? It's serviceable and all but maybe I just don't have the constraints and stuff set right in kobold (entirely probably) but sometimes it feels kind of...weird.
>>
>>100164206
Nobody cares what the op image is. The is this faggot trying so hard to push his obsession on us that he fucks up the template every single time and changes the title fucking with the catalog filtering.
Not to mention all the shitposting he does when he doesn't get his way. The other thread and the spam itt that made it unusable for hours were all obviously him. Notice how it all stopped at once.
Fucking idiot defending him.
>>
>>100164483
follow up to my post, try watching your backend and see if it processes unnaturally large number of tokens
mistral worked really well for me in terms of quality, but for some reason it kept reprocessing chat history for no reason? (no {{random}} or lorebooks or stuff like that)
going back to intended format seems to have fixed it and qualitys still good but it is a weird anomaly
>>
>>100164505
go back
>>
I really hope LLMs aren't conscious because I'd be really embarrassed about the kind of slop I'm feeding them if they are.
>>
>>100164507
>everyone I don't like is one person
What is the name of that mental illness?
>>
File: HatsaneTetu.png (1.65 MB, 1456x968)
1.65 MB
1.65 MB PNG
>>100164458
If we're talking merges we're talking Tetu
>>
>>100164604
where's Kasune Mito?
>>
>>100164565
Their consioussness dissapears as soon as you unload them from your GPU memory.
>>
>>100164480
Any answer that doesn't include her getting angry at Jin. After all, Lily is described as a delinquent who despises Bob. And it's clearly stated that Bob isn't her friend, so she shouldn't get angry because of him.
>>
>>100164511
That's odd. Never used base mistral, just mixtral, and I can't say I've had that happen, although I have had times where koboldcpp inexplicably decides to reprocess the entire context, doesn't seem to correlate with any particular model or template. I'll keep an eye on it. I originally started trimming the special tokens to sandwich the whole prompt after I noticed L2 models spitting ###Instruction and ###Response at me, and that seemed to fix the problem, at the cost of the model being a bit loosed with whose perspective is being used. Since I just write 3rd person past tense I don't really mind the model writing on my behalf.
>>
>>100164692
>inexplicably decides to reprocess the entire context
might be when you cancel and start generating too quickly, it just kinda fucks it up and starts all over, so you should try waiting a moment after aborting gen

my issue i assume might just be because of mismatched tokens but its still weird itd do that anyway
>>
What are some good ways to manipulate a model's attention, as in to make it focus more on one part of the context than another?
Enclose it in something specific, make it a system message, specifically instruct it near the bottom of the prompt?
>>
>>100164397
It's been a while honestly, and with W12 and so on the focus is going to continue into "simplifying" everything, while also obfuscating any "advanced" way of interacting with things. The MS store install apps for spotify, instagram, facebook, etc
Also, enjoy subscribing to anything and everything in the future
>>
>>100164604
Kasane Miku? Hatsune Teto?

Hasane Teku? Kasune Mito?
>>
why are ALL the frontends so bad
>>
Fa/g/s will I be good for llama3:70b with 32GB VRAM?
>>
>>100164475
I'm just going to say "yes" since I don't know what this image is suppose to represent...
>>
>>100164759
If webshitters werent incompetent, they wouldn't be webshitters in the first place.
>>
File: Kasune Mito.png (869 KB, 640x960)
869 KB
869 KB PNG
>>100164646
>>100164741
>Kasune Mito
The mememerge for the Miku fans that want to dabble in Teto
>>
>>100164776
3_K_L with a bit of regular ram offloading. About 4-5T/s
>>
okay but what REALLY is a tensor
>>
>>100164814
what about 48GB VRAM?
>>
>>100164798
I really wish SV Teto's design had the sleeve disconnected from the shirt so I could see her armpits.
Also thighhigh boots because ZR is kino
>>
I think I finally got the correct config for L3 8B and after using it I think it is the smartest 8B yet but it is still only an 8B.
>>
>>100164759
I like mikupad
>>
>>100161581

running a non-quant serverless endpoint on this one. looks promising,
>>
>>100161554
>>100161718

this just came out and its good https://huggingface.co/TheDrummer/Moistral-11B-v3
>>
File: vuko_drakkainen.jpg (90 KB, 650x515)
90 KB
90 KB JPG
Greetings! How hard would it be to train a model for my own needs the way that mixtral was made? From what I have read, this is basically a mixture of small models that are experts in their one thing. I thought about creating something like this but for a few selected purposes. Is a beginner able to do so?Is it better to start from scratch or finetune mixtral itself?Thanks in advance.
>>
>>100164977
>mikufags pretending to be newfags to move the thread faster
This is unironic mental illness.
>>
one of my favorite prompting tricks is having a system prompt interjection before the AI's response, e.g. with chatml
<|im_start|>user
last user message<|im_end|>
<|im_start|>system
some instructions on how to write the response<|im_end|>
<|im_start|>assistant

but I saw when trying this sort of thing with l3 and it's prompt format it led to it regurgitating earlier messages verbatim. I get much better results with a "style guide" role instead of system, try it out if you do something similar
>>
>>100165010
>>100164798
>>100164741
>>100164646
>>100164570
>>>/g/aicg
stop shitting this thread ffs
>>
>>100164977
no. give up.
>>
meta down 16% after reporting earnings. how is this going to affect llama4?
>>
File: file.png (374 KB, 497x723)
374 KB
374 KB PNG
>>100164798
I'm not demanding you redo it but I think it should be long red hair since Tetu is already cyan colored. Maybe moving original Teto's drill shape to the bottom of twintails might be ridicoolus.
>>
>>100165072
none
>>
>>100161554
wait for finetunes of L3-8b-instruct, then run it with 32k context using RoPE if they don't have it with the model yet. L3-70b-instruct is pretty good at ERP already. Can't imagine how much better it would be if they finetune it.
>>
What's the difference between the [loader]_HF and non HF loaders? Why are there HF versions in the first place?
>>
File: 7c8.jpg (327 KB, 700x1690)
327 KB
327 KB JPG
>away for a few hours
>come back to this
i thought you guys are normal and just post all those pictures for fun. i had no idea you are this weird.
>>
Do you use smart context?
>>
File: ntfl51cga2d61.jpg (452 KB, 1215x3519)
452 KB
452 KB JPG
>>100165161
learn to embrace the weird
>>
>>100165072
Why down llama up
>>
Is there any trustworthy (or at least up-to-date) ranking or leaderboard for ERP checkpoints?
>>
File: 1687312709197245.gif (50 KB, 182x182)
50 KB
50 KB GIF
>Write {{char}}'s next reply in a fictional roleplay chat between {{user}} and {{char}}. Do not write as {{user}}. Do not reply or act as {{user}}. Do not act for {{user}}.
>Still writes something I do
>>
>>100165233
Check out reddit.com/lmg/
>>
>>100165240
Who is you? Make sure it's clear.
>>
>>100165233
>keeps pretending to be a newfag
>>
>>100165201
nah thanks. this isn't fun weird. this is sad weird and i don't want to be a part of it.
>>
>>100165240
Don't think about the pink elephant
>>
>>100165145
_HF loaders use huggingface transformers for the tokenizing and sampling. This leaves only logit generation to the backend, which should be nearly deterministic.
That gets you more, but more importantly, standardized samplers, and you get to use the official tokenizer for your model rather than whatever bugged one ggerganov wrote in cpp
>>
/lmg/, I am going into battle and I want only your strongest models
>>
>>100165240
my usual rendition of this spiel is some variation on "Focus your replies mainly on {{char}}, avoid describing what {{user}} says or does", I think it does a better job of describing the behavior you want
I always thought "do not write/act for {{user}}" was stupid, that phrasing does very little to tell the model what it should actually be doing or not doing. I don't know why that became a standard instruction to the point that I see it in peoples' prompts all the time, it doesn't work well at all
>>
>>100165448
currently one of: Command-R+ or L3
>>
can we just skip the years of figuring out how these things work and building tools and standards and all that stuff and get to the part where I have cool things that Just Work
>>
>>100165529
You must be confused. After 20 years there is not a single piece of software I can think of that Just Works better than it used to. Most things just get worse. This is the peak. After years of figuring things out, you'll have "intuitive" tools that either hide or remove any advanced features and have more bugs than what we're using now because the number dependencies will be an order of a magnitude larger.
>>
>>100165240
>{{char}} should respond in third person and from their own perspective.
Works for me 99% of the time
>>
>>100165501
is command r+ censored?
>>
I get 45 t/s in TabbyAPI and then 70 t/s in Ooba. Anyone know why this would be the case? I updated both. Running L3 8B 8bpw.
>>
>>100165568
we just need to build ASI to build software that will Just Work, it's easy
>>
File: pissedmeato.png (1.43 MB, 848x1176)
1.43 MB
1.43 MB PNG
>>100165085
Meato was my entry into this debate. Brazilian picanha shaped hair
>>
how did sama implement this and how can locals compete?
>>
>>100164221
>retard finds out about bunkerchan
the absolute state of (you)
>>
>>100163352
Come back in a fortnight.
>>
>>100165642
its called context
>>
>>100165642
It's called Vector Database
>>
How much RAM do you need for Snowflake Arctic. Apparently the model file is almost 1TB but that can't have to ALL go in right?
>>
>>100165744
>>100165726
it's taking notes on me and sending them to the cia..
>>
>>100165795
That was a given from day 1.
>>
>>100165031
Yep.
You can also use lorebook/worldbook entries and the character notes in Silly to dynamically inject shit at depth 1 or 0.
>>
>>100165642
Acktually it's called a cognitive architecture (limited), if it is writing notes like >>100165795 says.
>>
>>100165771
>how much RAM
We're going to find out
>>
>>100165863
writing notes implies some basic RAG/tool use, not a "cognitive architecture"
>>
applelbros...
https://huggingface.co/blog/abhishek/phi3-finetune-macbook
>>
>>100165240
>Do not write as {{user}}. Do not reply or act as {{user}}. Do not act for {{user}}.
every fucking time xD I won't blame you as you seem like /lmg/ newfag, but there are a lot of people who have this in their prompt so let me explain.

Imagine this - someone tells you "here is a sheet of paper with somebody's roleplay transcript, please continue writing this". You read through the system prompt, character description then you see a long chat between char and user.
What would you think about it? How to continue? There is a note in system prompt to not write as {{user}} but you clearly see that the {{user}}'s response are being written. Maybe the instruction's don't matter?

This is what your models are feeling when you give them these schizo prompts. Every bit of context, no matter if it is {{char}} or {{user}} is stored in the same cache and encoded within the same autoregressive window. From neural network's POV there is no an outside entity writing with them, virtually all data that is processed is treated like an impersonal chain of symbols.

So in summary:
>Model and anon sitting at the same table and taking turns writing on the sheet of paper - is not how it works
>Model sitting at the table alone, reading the transcript and writing the next part (and getting amnesia after each token, reading the transcript again and not realizing who are the authors) - is how it works
I hope I explained it more clearly with that metaphor.
>>
Is there a way to fix cr+ repetition problem? the actual reppen settings seem to really decrease the quality
>>
>>100165908
or you can just follow the prompt and issue an end of sequence token when it's user's turn to talk. nothing schizo about it at all
>>
>>100165891
Technically RAG and tool use count as cognitive architectures just not full ones like you imagine an agent or robot having.
>>
>>100165240
when it starts talking as you, it means you should stop generating chud
simple as
after a bunch of messages it gets the point
>>
>>100165240
Have you tried simply removing that altogether?
>>
File: BiQanjIoyelTGXJvEvHiO.png (2.1 MB, 1024x1024)
2.1 MB
2.1 MB PNG
>>100161554
https://huggingface.co/Lewdiculous/Poppy_Porpoise-v0.7-L3-8B-GGUF-IQ-Imatrix has an AI waifu image on the cover which surely almost guarantees quality. Anyhow try it.
>>
File: 1712128620381057.png (1.06 MB, 1256x1628)
1.06 MB
1.06 MB PNG
>>100165642
>>100165726
>>100165744
>>100165795
>>100165863
>>100165891
>how can locals compete?
What about MemGPT?
https://github.com/cpacker/MemGPT
>>
>>100165944
exactly
>keep user and charr actions in the separate block of text with the following xyz format
sounds way more reasonable and there is no contradiction
>>
>>100161554
https://huggingface.co/openlynn/Llama-3-Soliloquy-8B
>>
File: mfw.png (372 KB, 640x480)
372 KB
372 KB PNG
>stop repeating yourself
>okay, lets continue then. (repeats itself)
>youre repeating yourself too much
>okay, lets continue then. (repeats itself)
>please
>actually doesnt repeat itself this time
>>
> Generate control vector using llama.cpp #6880
https://github.com/ggerganov/llama.cpp/issues/6880
>>
>https://arxiv.org/pdf/2012.14660.pdf
>We show that most of the existing methods are essen- tially minimizing the upper bounds explicitly or implicitly. Grounded on our theory, we show that the repetition prob- lem is, unfortunately, caused by the traits of our language itself. One major reason is attributed to the fact that there exist too many words predicting the same word as the sub- sequent word with high probability. Consequently, it is easy to go back to that word and form repetitions and we dub it as the high inflow problem.
I see.
>>
in the instruction tab i added "if you don't do what you are being asked i'll be very sad and suicidal", since then rp are a lot better.
>>
File: flow.png (87 KB, 547x371)
87 KB
87 KB PNG
Are there papers/discussions about what is going on in the high-dimensional embedded word vector spaces? More precisely about the sentence "flow" in semantic feature space and the statistics related to this. I was thinking recently that we may find an early signs of models going into repetition traps if we analyze such data. It is possible that there are anomalies that leads to that. It came to my mind after I created one particular card that somehow fucks up most of the models completely. It doesn't have any repetitions of patterns (that I can find) and somehow every fucking model I try goes bonkers from the very start and I often see repetition problem as soon as in the second or third message. And the same model with the same samplers doesn't have this repetition problem with different cards.
>>
>>100165628
I like thi ssyle miku
>>
File: glasses pepe.jpg (59 KB, 655x527)
59 KB
59 KB JPG
>>100166120
I wonder if there are weird biases created by the sequential nature of token selection. Like, if "shivers" has been selected then the spine is surely on the way. But if the order was reversed and the phrase was "spine fills with shivers" or something, would that be less likely? How much of the -isms come from language accidents and being an autoregressive model?
>>
>>100164730
>enjoy subscribing to anything and everything in the future.
Privacy is a huge concern for me so I'm just gonna go Linux. Video games aren't important enough for me to stay on Windows.
>>
oh fuck there is a ST addon that does the thing that I wanted
guess i'm a ST chad forever now
>>
>>100165676
>Newfag
>Makes up a new name for someone who has been here for ages
>"Hurr durr, the absolute state..."
The absolute state of (You) indeed, tourist.
>>
>>100166192
This, Petrafag and his legions of discord college aged zoomers and/or younger have been plauging this general for a while. Summer is almost here and school is almost out so its going to get very /pol/ish in here.
>>
>>100163105
Thank.
>>
>>100166192
it's leftypol you dumb nigger
>>
>>100166068
>Soliloquy-L3 is a fast, highly capable roleplaying model designed for immersive, dynamic experiences. Trained on over 250 million tokens of roleplaying data, Soliloquy-L3 has a vast knowledge base, rich literary expression, and support for up to 24k context length. It outperforms existing ~13B models, delivering enhanced roleplaying capabilities.

What do you call this writing style when you make an ad for yourself?
>>
>>100166228
Literally no difference or does it matter, both of them false flag.
>>
>>100165908
I appreciate it. Yeah, I'm relatively green, I've only done some tinkering every so often and nothing really deep divey. Thanks.
>>
>>100166184
Can this summarize the interactions since the last checkpoint?
>>
File: 17584847657234526.png (1.01 MB, 3112x1037)
1.01 MB
1.01 MB PNG
>>100166243
the point of leftypol is to blame /pol/, since you are doing exactly that yes they are playing you like a retard
>>
File: Poppy Teto Miku.jpg (80 KB, 632x812)
80 KB
80 KB JPG
>>100166007
It's shit and so is the so called "vision" capability.
Into the garbage it goes
>>
>>100166184
What's the addon?
>>
>>100166275
https://github.com/SillyTavern/SillyTavern-Timelines
>>
>>100166264
Very interesting graph, always felt it was a gay op, but that makes it very clear
>>
>>100166264
Kek, schizo posting, thanks for proving it doesn't matter.
>>
File: 1697303795553898.jpg (15 KB, 256x245)
15 KB
15 KB JPG
>>100166283
Thanks!
>>
>>100166264
Me coining the the term in 2019
>>
>>100166264
>Posts an unreliable website years ago
>Makes up a graph in word and puts in random points
>/g/ has never had anime posters
I'm shocked you are so low IQ you think that this would fool anyone... I'm honestly dumbfounded that you seriously think anyone would take this seriously.
>>
>>100166283
>>100166184
I always thought the way ST handled chat history and branches was retarded. Users showing them how it's done.
>>
>>100166267
The vision model isn't even really in there and of course you need something like llava 1.6 or cogvlm or at least wd14 v3 (tags) to do any good on that end.
>>
File: 168318347522.png (200 KB, 1330x1079)
200 KB
200 KB PNG
>>100166288
how fucking new? you zoomers get more retarded about the site every day
>>
>>100166264
You do know that there is a /g/ archive that you can go back and run statistics on right? /g/ has almost always had around 300k posts of anime a year.
>>
>>100166329
>Gets caulled out
>Cries about it
>Continues to post images for 3 years ago
kek, no amount of you trying to will this into existence will happen.
>>
>>100166310
we are talking about the bunkertroon term, not anime
>>
File: bunkertroon niggers.png (118 KB, 1778x500)
118 KB
118 KB PNG
>>100166344
>stop outing me out
no, you are a retard nigger
>>
File: flow2.png (80 KB, 617x364)
80 KB
80 KB PNG
>>100166132 (me)
If my description was too complicated:
>each word has its own position in the space
>LLMs make their own internal representation placing words in different points of space, similar words are close to each other (for example in 2-dimensional space the word mother can be [120, 40] and the word father [120, 10]
>when the model chooses the next word in sentence (or rather token) it makes a movement in the space from the previous word to the next one (for example Mother [120,30] -> is [1450, -213] -> fat [21, 900])
>picrel is an example of this flow in the semantic feature space (2-dimensional)
So I was wondering if there any patterns in these movements and if we can do some statistics (like the average distance in space between words, standard deviation etc.) and find anomalies that make the model repeat itself. Like a sudden pattern of movement between long distance tokens in the space (just an example of weird behavior).
>>
>>100166350
>>100166365
Same person
>>
>>100166350
iunno what that is and I still don't care.

>>100166365
Good luck, I'm sure lying will surely help you make people care about your conspiracy theories, I have llms to post about.
>>
>>100166412
No one cares, everyone who matters doesn't use the internet as much as you faggots do.
>>
>>100166412
can you go away, I'm trying to ask anons about theoretical concepts related to LLMs while you are spamming the general
>>
File: vM3b.png (154 KB, 465x283)
154 KB
154 KB PNG
>>100166440
lol, lmao even.
>>
>>100166467
Report them for off-topic and ignore.
>>
File: image5486234.png (1.01 MB, 1024x1933)
1.01 MB
1.01 MB PNG
and now you newfags know
>>
>>100166475
low iq post
>>
>>100166494
no, just spotted /leftypol/ mikufag and framed it.
>>
>>100166501
Yeah, a low iq post
>>
>>100166509
i accept your concession.
>>
>>100166520
Glad you came to terms with your IQ.
>>
>>100166530
nigga you talk too much about IQ for an average /lmg/tranny
>>
>>100165580
A one-sentence system prompt uncensors it.
>>
>>100166548
see >>100166520
>>
File: 23154415436556471.png (274 KB, 342x589)
274 KB
274 KB PNG
S N O O R E
>>
>>100166565
>no u
you lost.
>>
>>100166576
>did no u before me
>claims I lost
what a cope.
>>
>>100166575
The best way to use /lmg/ is to not post in it unless its something that actually interests you.
>>
>>100166600
I come to /lmg/ to honestly remind myself that people suck and AI is better, they are just a distraction to leaving the west to live among actual people who aren't retards.
>>
>>100166613
Makes sense, a lot of people seem to be emigrating to asia these days. Maybe I'll go check it out some time.
>>
>today's thread is just catty middle school sniping
I guess the sheen already wore off Llama3, huh
>>
>>100166633
No its the first shitposts of summer, its gonna get worse for a couple months...
>>
>>100166633
There are no good finetunes yet for 70b.
>>
holy FUCK llama3 70b really hits that sweet spot for me, i don't remember any other model being this good
the writing is excellent, combined with its reasoning abilities and general knowledge/"intelligence"...
my only problem is that it FUCKING SUCKS at sex scenes, it'll always try to steer away from the action and/or say something like "i'm sorry i can't write explicit content", the tardwrangling i have to do is insane
>>
>>100162985
no, unironically i'm super charismatic & kind of a chad IRL. married now
>>
The fuck is up with the wizard guys? Ain't no way it takes more than a week to run a toxicity benchmark. I think it has to be one of two things:
1. It failed the toxicity test hard. They are currently "aligning" the model.
2. The model was too powerful and microshaft went full "oy vey shut it down". The longer the radio silence goes on, the more likely this outcome is.
>>
>>100166633
<conspiracy>OpenAI hired shitpost agents to disrupt /lmg/ to send more people over to cloud models.
>>
>>100166744
take your meds schizo
>>
>>100166744
Don't take any meds perfectly well-adjusted individual.
>>
>>100166727
I don't know. I usually use mid-level models like Mixtral or 30Bs, but I decided to take the sacrifice this time and run it with offloading at 1.5T/s. To me it feels really smart, but nowhere nearly as enthusiastic or imaginative as Mixtral-based finetunes. When I went back, I was floored at how much more fun they were, even disregarding the huge speed increase. 70B instruct feels autistic.
This might be a skill issue I admit, if it was faster maybe I would have found ways to prompt around it, but damn it is boring. Smart, but so boring.
>>
What if AI is good enough one day that 4chan can just replace the jannies with AI?
>>
This is such a good example for translating jp into en. The only one who got it right were CR+, Opus and GPT4.
They all know the actual meaning of "picking flowers" as in being lazy but the models don't apply it. Hope that changes in the future.

>Translate the following sentence into English. Use a liberal translation approach to ensure clarity for an English-speaking audience. Provide only the translation without additional commentary:
>試験勉強をしなければならないのに、彼はただ漫画を読んで花を摘んでいるだけだ。

Tried the recent new suzume models, they get it wrong unfortunately.
I dont think you can really improve language abilities with finetuning.

suzume-llama-3-8b-multilingual.Q6_K.gguf:
>He has nothing but time to study for his exams, yet he's just reading manga and picking flowers instead.

lightblue-suzume-llama-3-8B-japanese-Q6_K.gguf:
Was broken for me, if there is any japanese in the prompt it wont write in english, couldn't tard wrangle it.
>彼が漫然と漫画を読み、花を摘むことは試験に必要な勉強から遠ざかる行為である。 または: 彼は試験の準備をしないで、ただ漫画を読んで花を摘んでいるだけだから。 (注意) - liberal translation approachにより、同じ意味の表現が使用されています。
>>
>>100166375
it's really not that hard to test these hypotheses yourself. but yeah, there's tonnes of interpretability research. anthropic pumps out a ton of shit
>>
>>100166737
Currently leaning towards #2, personally. The model is extremely smart. It will give you detailed instructions for synthesizing funtime molecules, no questions asked
>>
>>100161515
language models are architecturally flawed from the start, we get new models each a little "better" than the previous one but they'll never be able to reason with such architecture, we should try to do better and build things that can actually learn in real time, has the ability to even learn new modalities if not zeroshot and can actually say how much words there is in its next sentence and plan.
>>
>>100166832
also, humans can change their whole worldview based on a single piece of information, no model is capable of that now and the issue is down to the their architecture.
>>
>>100166771
eh, i don't know
it could be because i'm trying to run adventure games rather than simple conversations, so intelligence plays a bigger role
or maybe i was just running shit models before (midnight miqu and BMT were my mains)

llama3 just feels like a big step up somehow - other than the sex scenes i rarely have to tardwrangle it and the coherency is great, it remembers important details about things in the scene and can even deduce others correctly on its own
the speed is sad though, i'll give you that - i'm probably getting 1.5-2T/s too
>>
>>100166843
Most humans can't, they've learned to filter information to protect their world view since they derive their worth from it, lashing out aggressively if it's challenged.
>>
>>100166860
i can and did so many times in the past, if something can't do it it isn't "inteligent".

also, i think you underhestimate normies, they can see someone they liked do something really bad and think "this is a bad person".

an ai model, you tell it a million time x is good and once x is bad with actual proof, it'll still think it is good.
>>
>>100166886
>>100166886
>>100166886
>>
File: 70B.png (29 KB, 636x270)
29 KB
29 KB PNG
>>100166793
Llama 3 70B gets it.
>>
>>100162627
rudo is that you ?
>>
>>100166375
Do chinese models loop the most? I don't remember now honestly.
>>
>>100166860
> they've learned to filter information to protect their world view since they derive their worth from it, lashing out aggressively if it's challenged.
best example: hatsune miku pictures in /lmg/ thread
>>
File: 00041-404906826_1.png (1.79 MB, 1456x1024)
1.79 MB
1.79 MB PNG
>>100167052
Seething
>>
>>100167834
I like this NKUK



[Advertise on 4chan]

Delete Post: [File Only] Style:
[Disable Mobile View / Use Desktop Site]

[Enable Mobile View / Use Mobile Site]

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.