/g/ - Technology


File: 1703951281376706.jpg (188 KB, 928x1232)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101808005 & >>101799533

►News
>(08/09) Qwen large audio-input language models: https://hf.co/Qwen/Qwen2-Audio-7B-Instruct
>(08/07) LG AI releases Korean bilingual model: https://hf.co/LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct
>(08/05) vLLM GGUF loading support merged: https://github.com/vllm-project/vllm/pull/5191
>(07/31) Gemma 2 2B, ShieldGemma, and Gemma Scope: https://developers.googleblog.com/en/smaller-safer-more-transparent-advancing-responsible-ai-with-gemma
>(07/27) Llama 3.1 rope scaling merged: https://github.com/ggerganov/llama.cpp/pull/8676

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: ComfyUI_05720_.png (587 KB, 720x1280)
►Recent Highlights from the Previous Thread: >>101808005

--Paper: Meta's instruction back-and-forth translation for LLM alignment: >>101814917 >>101814971
--Qwen2-audio successfully translates Japanese audio, lacks good TTS: >>101815687 >>101816104 >>101816483 >>101816522 >>101816565 >>101817324
--Qwen audio setup and performance issues: >>101811466 >>101811569 >>101811675
--Qwen 2 Audio benchmarks for speech recognition tasks: >>101815730
--Modding NVIDIA graphics cards for increased VRAM capacity: >>101812773 >>101812839 >>101812859 >>101812885 >>101812914 >>101813149 >>101813193
--Diamond Law implementation and model training discussion: >>101815271 >>101815366 >>101815399 >>101815442 >>101815545 >>101815763 >>101816006 >>101815530
--Current LLMs are limited, but AGI is within reach, ASI requires new training paradigm: >>101808408
--Chatbot fails to count LS in phrase, sparks discussion on model intelligence and benchmarking: >>101809037 >>101809124 >>101809200 >>101809229 >>101813640 >>101816957 >>101809418 >>101809276 >>101809384
--Anon seeks model for extracting features from html text, recommended Claude and Gemini Pro: >>101816359 >>101816532 >>101816552 >>101816569 >>101816582
--Anon asks for a model good at proofreading and rewriting text, others share their experiences with different models: >>101814935 >>101814946 >>101815000 >>101815037 >>101815087 >>101815177
--Qwen2-Audio finetune link shared: >>101817633
--Use cases for 128k context length: >>101816508 >>101816533 >>101817399 >>101817517 >>101817537 >>101817563 >>101817693
--Mistral Large upgrade has nuanced improvements, may be impacted by quantization: >>101816825 >>101817195 >>101817117
--Meta-Llama-3.1-405B model updated with reduced KV heads: >>101810650 >>101810701 >>101810740 >>101810746
--Miku (free space): >>101810048 >>101810064 >>101811462 >>101811844 >>101814258

►Recent Highlight Posts from the Previous Thread: >>101808011
>>
Something big is coming next week.
>>
>>101818665
True, but somehow nobody has bothered creating a new version of GPT-4chan based on a better model. GPT-4chan is based on fucking GPT-J, and it was released by a Swiss guy who puts his face and name on the internet, trolled /pol/, and told every AI ethicist to eat shit.
Meanwhile, these threads, with their dozens or hundreds of anonymous posters/lurkers, could not download raiders of lost kek and put it through axolotl with gemma 27b as the starting model.
>>
File: 🍓.png (49 KB, 1265x609)
>>101818830
forget it, something big is coming NOW
>>
File: pepw.gif (319 KB, 32x32)
>>101818830
It always is
>>
>>101818858
>https://archive.org/details/mamba_4chan_370m
>>
>>101818896
He's leading you schizos on
>>
>>101818858
>Meanwhile, these threads, with their dozens or hundreds of anonymous posters/lurkers, could not download raiders of lost kek and put it through axolotl with gemma 27b as the starting model.
Why would we want to spend money to finetune a model on a dataset sourced from /pol/ after the election tourists took over? Come back when you have a dataset from before 2008.
>>
>>101818921
>370m
That's not enough, you need a bigger and smarter model to start with. gemma 27b would be ideal.
>>
File: file.png (32 KB, 591x385)
Literal nothingburger
>>
>>101818935
Have you seen the past few threads? You don't need more than that.
Ideal is whatever is necessary, not whatever you can run.
>>
>>101818986
post output of the 370M model please
>>
strobby
>>
bitnet
>>
>xeon phi
cheap build with lots of ram?
>>
>>101818957
>retards fell for this
>>
>>101818957
Why should I give a shit about this guy? Is this mf or his github repo even related to OpenAI? Glancing at it for a few minutes (I can't be assed to waste more time researching, aside from wasting my time being baited), he "became a meme" (which you fagets are forcing into /lmg/) by talking about strawberries before the official announcement and Sam's garden pic.
>>
You have to give it to OpenAI for showing us what's possible. Nobody took "AI" seriously until ChatGPT
>>
>>101819218
literally the only reason he got any attention whatsoever is because altman gave his vague baiting a reply. pretty embarrassing for both altman/OAI and the xitter AI community as a whole
>>
>>101819025
Installing dependencies. If I don't get it running before I get bored I'll drop it. Don't stay up expecting outputs.
>>
https://www.youtube.com/shorts/jlamJK0P1Vo
speed up x4
>>
hi guys, question about loras:
i have some large training images of various dimensions, some of which are over 4k pixels wide or tall. would such large dimensions fuck up training, and is it necessary to downscale them?
>>
HAPPENING?
>>
Models for this feel?
https://www.youtube.com/watch?v=satZx43Sv_0
>>
File: Nalamerge.png (63 KB, 927x243)
This merge feels kind of off. It's my first time using the DELLA merging method so I probably need to play around with the parameters more. But it's reluctant to give a verbose response in RP now. So it might have oversaturated eot token probabilities in the process.
Anyone have any experience with using DELLA?
>>
Several threads ago but whatever:

>>101795876
I don't use kobold and I don't know if there's an equivalent fix, but llama.cpp needs to be compiled with GGML_CUDA_NO_PEER_COPY=1 to GPU-split with my 7900xtx cards; it's an upstream bug in ROCm (hopefully fixed in 6.2, but I haven't upgraded yet). I had the exact same problem with it spitting out garbage.

See https://github.com/ggerganov/llama.cpp/commit/7807b61e06f3baa6e31bfaa5aa1625029e2ebb65
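For reference, my full build line is something like this (going from memory, Makefile build with the ROCm env already set up; double-check the flag names against the docs for your llama.cpp version):

make GGML_HIPBLAS=1 GGML_CUDA_NO_PEER_COPY=1 -j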
>>
>>101819412
wrong thread
>>
>Counting the Rs
https://litter.catbox.moe/8n6b4x.bin
>>
>>101818797
I'm impressed that I can run llama 3.1 70b q5_K_S at 128k context on my 64gb ram / 6gb vram shit rig and maintain above 0.5 t/s, cpufags don't lose hope, our time is coming soon
>>
File: 1723313218943.jpg (234 KB, 1080x1471)
Even Connections is saying something big is coming
>>
>>101819789
Is this safe to click bros?
>>
>>101819942
your system will hang if you actually feed it 128k tokens instead of asking only short prompts
>>
Been using nemo with koboldcpp for a while with decent results. Does each model's prompt format matter if I'm using chat or story cards? Or is it only relevant when using instruct?
>>
>>101820007
I usually hate these things and give up after 2 minutes but this one was really easy
>>
>>101819789
>>101820049
it's a pickle, anyone rolling pickles in this day and age is nuts.
>>
File: 24-08-10 08-27-10 3054.jpg (3.24 MB, 4032x3024)
A4000 prices on ebay seem to be getting down to reasonable levels. Yes, I understand it's basically a 4070 Super 16GB in a 1-slot format, and by some measures it's slower. The plus is it doesn't need a lot of power and will fit where a lot of other cards will not.
I worked on some Migu stickers yesterday. I think I'll give the anythingV3 SD model another shot, it has a very good chibi aesthetic, and then img2img the output with Autisimmix SDXL to upscale to 2048x2048. That said, bing has been cooperative and making some cute stuff lately, and it's super helpful to have it throw a white border around things. I have to do my white and CMYK passes separately, and of course the printer doesn't feed the sheet in precisely the same way each time.
>>
str.count(sub[, start[, end]])
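i.e. what the bots keep flubbing is a one-liner in Python:

>>> "strawberry".count("r")
3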
>>
Folks, what's a good model that can balance good SFW and NSFW writing well and that can fit on an RTX 4090 with 32 GB RAM?

What I mean by balance is, some NSFW models I've tried tend to really steer SFW situations towards NSFW without any prompting from my side. I'd prefer a model that's good at RP and ERP and leaves steering between the two in my hands.
>>
>>101820488
Probably Nemo.
>>
>>101820383
you can undervolt 4070 and have decent power efficiency
>>
>>101820488
Either plain Gemma 27B q5 or Nemo 11B q8, and play with your prompt or character card to tell it to allow the user to lead romantic or erotic encounters. They're smart enough.
>>
>>101820547
True, but you have to leave a decent amount of space between them, whereas A4000 is flow-through.
>>
>>101820568
27B is fixed now? Last time I heard it was still bugged.
>>
>>101820546
>>101820547
>Probably Nemo.
>Nemo 11B q8
I'm looking at nvidia/Mistral-NeMo-12B-Instruct. I believe this is the one you mean?
>Either plain Gemma 27B q5
Just to be sure, do you mean Gemma2? Also if so, I found a recent reddit post claiming that it has some issues
>It fucks up formatting and puts it's personality in all characters. Doesn't play a convincing miku card or any character that talks funny and misspells.
Could you maybe comment on this?
>>
>>101820648
You sound a lot like a troll.
>>
>>101820643
hi petra
>>
>>101820661
I'm just new to LLMs and trying to figure out the popular LLMs on the block.
>>
>>101820648
https://huggingface.co/mradermacher/gemma-2-27b-i1-GGUF
https://huggingface.co/mradermacher/Mistral-Nemo-Instruct-2407-GGUF
>>
>>101820700
Don't reply to the troll.
>>
>>101820700
Thanks!
>>
File: miku nothing ever happens.png (994 KB, 1024x1024)
>>101819413
>>
>>101820678
I have no idea what you're talking about honestly. I've seen posts about "petra" and never understood it. In any case, I remember that someone posted about how 27B has issues with various backends and linked to multiple reports about it. If those were false, I'd like evidence it really works as intended (gets same outputs as from online/API). Maybe I'd download it and give it a try then, but apparently the only thing that was working at the time was Google's own implementation of it, which doesn't come with a server.
>>
>>101820798
>i'm petra
Yeah, I already knew that.
>>
>>101820798
Never had an issue with it. It wasn't trained with a system prompt, but it works fine with the Alpaca preset in ST. I use tabbyAPI as the backend.
>>
>>101820824
No seriously what the fuck are you talking about.
>>
>>101820879
You aren't trying hard enough, petra.
>>
>>101820873
Did you verify you get the same outputs from that as from "official" sources though? I mean it might work but what the links said IIRC was that it wasn't as good as Google's demo and some guy couldn't reproduce the exact benchmark results Google got.
>>
>>101820906
I'm not petra though, why do you keep saying that?
>>
>>101820941
>i'm petra
Why do you keep repeating that?
>>
>>101820941
ignore the schizo.
>>101820958
leave petra alone
>>
The company is also working on a new product called "The Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk Milk
>>
>>101821045
Is petra an actual person? I'm honestly confused if it's some made up concept now that people (or that one guy) just apply randomly to people who post here or something.
>>
>>101821089
>Is petra an actual person?
yes
>>97062246
>I'm not Petra. Petra's an amateur. I'm something considerably worse.
>>
>>101821089
Dude from a while back. From what I know, he was an absolute schizo about prompts, treating them like mystical incantations. Nothing of relevance. Just ignore the schizos. They feed on attention.
>>
>>101816508
I'm doing long, slow roleplay sessions like getting up, getting dressed, going out to my job, eating lunch, working in the afternoon, buying groceries, going out for drinks, meeting people. Before and after each chapter, the time/current clothing/inventory and the next objective are listed.
I've got 2 stories going so far with different settings (with friends, family) and I plan to start a new one tonight where I'll be some countryside guy trying my luck in a cyberpunk city.
So yeah I need 128k badly.
>>
>>101821162
>Dude from a while back.
he's still here, he's the one talking about dolphin 2.5 mixtral and limarp zloss recently, also shitting on llama3 that he can't get to work
>>
>>101821189
>>101816508
>>
>>101821162
Every time you notice someone being retarded, it is petra.
>>
>>101821136
>>101821162
>>101821199
>>101821227
Thanks. Interesting thread lore actually.
>>
>>101818896
don't do this it makes mustard gas
>>
>>101820923
Not sure about comparing it to whatever they were doing to run the model, but I always download the original and do my own quantizing.
I know things like what you're mentioning have happened before. mpt-30b-chat worked super well on the huggingface space, but elsewhere it didn't, and the secret wasn't revealed.
>>
>>101821189
Damn, that sounds like escapism anon. That's not healthy.
>>
I ate so many dried apricots i will probably shit myself to death tonight.
>>
>>101821316
Alright, maybe I should download Tabby.
Then again I just remembered that Gemma's still only 8k so that demotivates me again. Anyway, I wish these big companies would actually make sure their models worked with the software people use on day one so we wouldn't have issues like these. It's just sad they can't somehow spare that bit of effort.
>>
>>101821420
please write a will and leave all your electronics to me
>>
>>101821432
All I have is a thinkpad.
>>
>>101818896
The openai-beta header doesn't actually do anything. You are being taken for a ride.
>>
>>101819277
A ton of high profile people also follow him, but it could just be a snowball effect.
>>
File: 1692654316309281.png (21 KB, 775x962)
I ask again
Tell me one good reason to use sillytavern instead of koboldcpp's built in kobold lite UI
It does everything sillytavern does and is simpler to use
It even has cool scenarios and characters included, and can be used for both chatbots and writing stories and text adventures
>>
>>101821562
can't have max output greater than 512 tokens
>>
>>101820054
No it won't but it gets really slow. I tried 50k context and was at about 0.2 t/s. But it works with flashattention and quantized kv cache
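On koboldcpp that's the --flashattention and --quantkv flags; on llama.cpp it's something like -fa -ctk q4_0 -ctv q4_0 if I remember right. Check --help on your build, these get renamed now and then.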
>>
File: file.png (4 KB, 365x84)
>>101821586
>can't have max output greater than 512 tokens
anon...you can type in a higher number...
>>
>>101821609
Learning curve too high. Vim is easier to figure out.
>>
>>101821562
>It does everything sillytavern does
Does it?
Some of the things I use from Silly Tavern are author's notes (chat, character and global), character's notes (the one in the card), macros (like random and using a useless field as a prefill), and lorebooks/worldbooks with entries with different activation and insertion rules.
Is the lite UI able to do all of that?
Also, some extensions are pretty nice.
In fact, I told an anon that I'd make an extension that linearly increases the temperature as the context fills. I had forgotten about that.
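The logic itself is trivial, something like this (back-of-the-napkin Python; the real extension would be JS hooking into ST's generation settings, and the endpoint temps are placeholder numbers):

def ramp_temp(tokens_used, ctx_len, t_min=0.7, t_max=1.2):
    # linear interpolation: t_min on an empty context, t_max when full
    frac = min(tokens_used / ctx_len, 1.0)
    return t_min + (t_max - t_min) * frac

The annoying part is just wiring it into the UI.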
>>
>>101821586
>>101821609
...Okay I'm a retard...
>>
>>101821346
Really? It's the ultimate RPG tool. It's not so different from GTA RP or Skyrim or The Sims. With modern models it's really crazy. I can go "(OOC: tell me what I can interact with in the room)" and the model will respond (there's a TV, a door to a dressing room, a door that leads out of the room, a window, a phone...). I also give a few example messages where I train the LLM to switch into "Direct Conversation Mode". So instead of giving me a lengthy description of the scene, it switches to giving me quick short answers as I speak to it naturally. Then I type [Direct Conversation Mode Off] to go back to "classic" roleplaying.
>>
Has anyone tried llama3.1's ipython role and tool use prompt template? It might be useful for adventurefags (e.g. give it a way to roll dice and keep track of stats, especially since it's explicitly trained to use tools for certain tasks)
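For anyone who hasn't looked at it, the flow is roughly this (paraphrasing from memory, check Meta's prompt format docs before building on it; roll_dice is a made-up function your frontend would have to implement): you put "Environment: ipython" in the system prompt, and the model answers a tool-worthy request with

<|start_header_id|>assistant<|end_header_id|>
<|python_tag|>roll_dice(sides=20)<|eom_id|>

You execute the call yourself, feed the result back as the ipython role:

<|start_header_id|>ipython<|end_header_id|>
{"result": 14}<|eot_id|>

and it continues the assistant turn using the output.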
>>
>>101821624
You can check everything the UI has to offer without installing it on this website: https://lite.koboldai.net/
I'm pretty sure it has most if not all of those things, there's even group chats
>>
>>101821637
Based roleplaychad
>>
File: ComfyUI_05744_.png (785 KB, 720x1280)
>>101821562
branches
automatically saving chats (better chat management all around)
worldbook importing that just werks
much more customizable author's note
observing and manipulating token probability within the ST UI
significantly better sampler / sampler order UI
and most important, use any backend and ditch kobold entirely
>>
>>101821562
It doesn't and the code is a mess. At least Mikupad is actually designed to be simple, and SillyTavern is designed for power users.
Koboldcpp is just the result of incompetence.
There's a reason why you are never going to see anyone using it outside of the shills from the Discord.
>>
>>101821637
I mean those games are pretty escapist if you really immerse yourself and play them. Fortunately for their health, the majority of people who play games like these (and really games in general) don't play them to completion or immerse themselves that far. That means they're less dedicated players and not what you would call "real" gamers, but maybe that's a good thing.
>>
>>101821346
you LITERALLY fought to re-cripple your brother you piece of SHIT
>>
>>101821702
Yes.
>>
>>101821685
>the code is a mess
I took a brief look at their code and saw a python function with like 40+ keyword arguments, so I'm more inclined to believe this.
>>
google/flan-t5-base
>>
>Showing 674 changed files with 1,468,135 additions and 32,131 deletions.
So what did Koboldcpp add on top of llama.cpp that warrants that kind of maintenance burden?
>>
>>101821744
For one, if they are using the llama-server code, they are preserving the multimodal capabilities, I think.
>>
>>101821744
a nifty gui, albeit outclassed by standalone options
that's it
>>
>>101821758
I doubt it, because koboldcpp's server is in Python.
>>
>>101821782
I also doubt it. The format Kobold's server expects for image inputs is different from what the old server multimodal endpoint used.
>>
>>101821744
They could've just added llama.cpp as a library like everyone else. Them forking it was a dumb idea
>>
>>101821829
The whole marketing of kobold.cpp was initially that it was compatible with all ggml versions while upstream kept breaking compatibility with every update.
>>
>>101821829
They forked it to maintain support for legacy ggml formats. Agree it was a dumb idea.
>>
>>101818935
go ahead and tune it then, no one's stopping you
>>
Is the non-cpp KoboldAI even still a thing?
>>
>>101821531
doesn't do anything yet, it's not 8-13
>>
I'm actually at a loss to how shit LLMs still are.

Why is every fucking bot, no matter the template/instructs, always bailing on the "one to one conversation" chat and spazzing out with a novel's worth of response no matter how many prompts I give to stop doing this retarded shit.

Yo, if you're new like me to this garbage, drop it. All the models fucking suck too btw unironically. If you're not shelling out cash for Opus or getting a lucky key for free, don't even waste your time.
>>
>>101818934
https://archive.org/details/4chan_threads_archive_10_billion
>>
Google search results for "autotokenizer transformers" are at 2007-driver-search quality
>>
>>101821744
Isn't their context shifting an original implementation? I remember reading that feature had some differences with llama.cpp.
also just personally I found its ROCm fork was more reliable than either base koboldcpp or base llama.cpp (both of which also have hipblas) for AMD; it was common for some update to mysteriously break things whenever I tried to switch to an upstream version.
>>
File: 1706283714357937.png (15 KB, 609x109)
Mistral Large is great but it's randomly super confident in certain tokens. Sometimes all my rerolls start the exact same way even though the chat and prompt up until now do not warrant the model being 96% confident pre-samplers that the reply must begin with, for example, "{{char}} (99%) blinks (96%)". You can't even fix this sort of thing with samplers without lobotomizing the shit out of it.
Mixtral was similar as far as I remember. Maybe a finetune can fix this.
>>
>>101822041
>Isn't their context shifting an original implementation?
I don't know and I don't care, because llama.cpp also caches the prompt.
>>
>>101822041
>Isn't their context shifting an original implementation?
From what I heard at the time it was implemented, llama.cpp had it first, they just used another name for it.
>I found its ROCm fork was more reliable than either base koboldcpp or base llama.cpp
There is no functional difference, the only difference is showing your AMD GPU info.
>>
>>101822044
That's what alignment does to a mf. Use the base model if you want to avoid that.
>>
>>101822211
>Use the base model
>Mistral Large
where
https://huggingface.co/mistralai?search_models=large
>>
Alright, after testing several nemo fine tunes on more than one large chat each, I agree with the (general?) sentiment I've seen displayed that mini-magnum is the best one.
I still haven't tested any L3.1 models.
Any recommendations?
>>
>>101822238
>Any recommendations?
Buy an ad.
>>
>>101822211
>Use the base model
And later cry that the model doesn't do what you want like every single other "storychad".
>>
>>101822255
I don't see how that's going to help me find the best model to use with my measly 8gb of VRAM.
>>
>>101822283
Skill issue
>>
>https://x.com/iruletheworldmo/status/1822364945226371306
>>
>>101822330
I trust this random guy on the internet who's been selling his big leak for a day now
>>
>>101822238
Consensus is reaching critical mass
>>
>>101822330
>tepid fanfic not alleging anything beyond the stuff that everyone paying attention already knows
pathetic
anyway this dummy already said sus-column-r was 8b which is pretty funny. maybe if they're running it on a laptop cpu. funny to see him have to reel back in the nonsense so people keep giving him attention
>noooo I'm a legit leaker come back it's for real this time!
>>
>>101822330
>if you had this particular model unlocked, you could easily disrupt the world on an unprecedented scale. when you mix in voice, video, sora, agents, and the eye-watering capabilities, things hot up.
woow
>>
>>101822238
I tried mini-magnum and I saw nothing to convince me to keep it over Nemo.
>>
*unsettles your dust*
>>
>>101821685
This, the koboldcpp source code is a nightmare to even look at. It's like they make it intentionally shitty due to some kind of fetish for shitty code or something
The large diff from llama.cpp may cause things to become subtly broken. llama.cpp has many moving parts that receive scrutiny in the main repository, but not in koboldcpp.
Everyone would be better off running llama.cpp directly rather than a discord-shilled project (I'm pretty sure concedo himself is shilling in this very thread, he even posts screenshots of lmg threads in github releases). It's not that hard to compile llama.cpp, and they even have some binary releases on GitHub as well.
>>
>>101822330
>sus
>small
Even if the model is small, what is strawberry's purpose in it? Making the model always do CoT?
I call bullshit, this guy is lying and Sam is just laughing while he watches the retards fall for it.
>>
>>101818957
This sounds like an llm with a vector trained for sounding like a helpful assistant and applied as a negative vector. And the result isn't an llm sounding like a human but an llm sounding like an illiterate subhuman retard.
>>
SUS COLUMN R IS A VERY SMALL OPENAI MODEL USING STRAWBERRY
>>
>>101822447
>>101822238
Same. I disliked mini-magnum and I enjoyed Lyra more but in the end went back to base instruct.
>>
>>101822330
>guys it will be AGI guys, this new GPT-5 is AGI and we need UBI guys, we need socialism guys listen to me. we need to put a jewish guy in charge to handle free UGI shekels to everyone, every nigger, it's like the age of aquarius, it's amazing
>>
Very small = 200B
>>
>>101822330
How are people still taking this retard seriously after yesterday? Are you retarded also?
>>
>>101822330
Do not engage with these kikes.
>>
>>101822661
>Are you retarded also?
Is that a real question?
>>
>>101822330
Why are all the posts I have read in /lmg/ easier to read than this? Even the worst schizos here are more coherent.
>>
>>101820798
>I've seen posts about "petra" and never understood it.
Petra is everyone and no one. Petra steps forth from the shadows of anonymity for just long enough to make the obsessed schizos seethe, before melting away into the darkness once more. Anyone who expresses an opinion which is opposed to the self-appointed gatekeepers of /lmg/ temporarily invokes the spirit of Petra.
>>
>>101822726
You just KNOW he is a discord groomer, like one of these people dwelling on the koboldcpp discord.
>>
>>101822677
Ignore the retardposters. There are only around three of them, but they appear unable to communicate, other than via use of the word "retard," or rhetorically asking people if they are retarded.
>>
>>101822778
retard
>>
>>101822778
slow down
>>
>>101822778
Are you retarded?
>>
>>101822747
je suis petra
>>
>>101822824
sorry no hablo mexican
>>
>>101822747
"End of scene."
>>
>>101822778
Are you retarded, retard?
>>
>>101822661
>How are people still taking this retard seriously after yesterday?
Who is? The anon you replied to only posted a link, not his opinion of it.
>>
Flux pro simple seems to have some limitations.
It's struggling to generate a pregnant woman holding a baby.

Also it really likes to have the pregnant female hold her belly all the time.
>>
>>101822778
kek
>>
>>101822985
why isn't the baby black?
>>
https://x.com/iruletheworldmo/status/1822364945226371306

strawberry larper says new GPT next week
>>
File: confused af.jpg (117 KB, 625x1160)
Can anyone help me out? I'm trying to use this and i'm not sure if i'm doing something wrong:

https://huggingface.co/anthracite-org/magnum-32b-v1-GGUF

I've just set my instruct/context template on ST to use CHATML because it seems to be recommended? What I'm confused on is what the fuck do I do with pic related settings? Is there a website dedicated to this stuff? It's confusing the shit out of me
>>
>>101823062
Because flux wasn't made by google.
>>
>>101823062
Calm down rabbi
>>
>>101823079
press neutralize samplers and then move temperature left and right until you like it. you are welcome. now go back to r/localllama and never come back.
>>
>>101823079
If you don't know what you are doing, you might as well click on the neutralize samplers button, set temp to 0.5 and min-p to 0.05.
Then play around with temp until you find a sweet spot.
>>
It's hit or miss trying to get a pregnant female holding a child.
Using toddler instead of child or baby also improves the output a little.

Shame negative prompts don't work with flux.

>>101823062
Who would want to see that?
I sure ain't interested.
>>
I publicly challenge some drummer dicksucker to defend him "upscaling" nemo to 21B. He is basically admitting that he is an undi that has no idea what the fuck he is doing.
>>
>>101823166
Did he at least do continued pre-training on it a la the SOLAR people?
>>
>>101823116
>min-p
Stop spreading the memes. Support top p instead.
>>
>>101823166
Hi Sao
Maybe find a new marketing tactic besides flinging shit to other finetuners?
>>
Hands do get messed up more it seems.
Also the child has to put their hand on the mother's belly.
>>
File: ojousama-laugh.gif (847 KB, 401x498)
>>101823181
>continued pre-training
>finetuner
>finetuner doing continued pre-training
>>
File: minP.png (227 KB, 1949x845)
>>101823183
Care to explain why?
>>
>>101823188
>but what about finetuner X
Great defense.
>>
>>101823222
>it's organic, trust me
Hi again, Sao.
>>
>>101823208
Because your png means nothing for actual usage.
>>
>>101822985
>Also it really likes to have the pregnant female hold her belly all the time.
That is also an issue with Stable Diffusion I think.
>>
>>101823228
>but what about finetuner X
Great defense.
>>
>>101823181
>Did he at least do continued pre-training on it a la the SOLAR people?
kinda
> I worked on top of an upscaled NeMo with certain layers zero'd out to retain its NeMo quality and finetuned afterwards to fill them with my special sauce.
>>
maybe i don't need a new model
maybe the issue was my prompting all along
>>
>>101823146
Why do imagegen models always default to this face that looks like that bladerunner chick?
>>
>>101823255
special sauce:
>According to the giant impact hypothesis, Theia orbited the Sun, nearly along the orbit of the proto-Earth, by staying close to one or the other of the Sun-Earth system's two more stable Lagrangian points.
>>
>>101823244
>>101822985
I mean it sounds pretty logical. Most pics of pregnancy on the internet (especially stock photo sources) have hand on belly.
>>
>>101823282
Maybe it reveals how many images of her and people that look similar to her are on the internet.
>>
>>101823233
I can explain that minP is better because it samples tokens based on their probability, which tends to have some correlation with quality, while top-n doesn't care about the token probabilities at all. It's just "pick N top tokens" without any regard for their actual characteristics.
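A rough sketch of the difference in Python (my reading of it, not the reference implementation of either sampler):

import numpy as np

def min_p_filter(probs, min_p=0.05):
    # keep every token at least min_p times as likely as the top token
    kept = np.where(probs >= min_p * probs.max(), probs, 0.0)
    return kept / kept.sum()  # renormalize

def top_n_filter(probs, n=40):
    # keep the n most likely tokens, however unsure the model is
    cutoff = np.sort(probs)[-n]
    kept = np.where(probs >= cutoff, probs, 0.0)
    return kept / kept.sum()

With min-p the candidate pool shrinks when the model is confident and widens when it isn't; top-n keeps the same n regardless.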

>>101823255
Hmm. If by fine tune he means that he's not training all tensors, then he's probably just making it retarded by creating unevenness in the internal state of the model.
When you upscale you want to train all layers to correct the continuity between them, exactly like training a model from scratch, which is more or less what you are doing, only using pre-trained weights instead of zeroed or randomized weights as a base.
I'm thinking he doesn't really know what he is doing, but more power to him I guess.
>>
>>101823290
I mean yeah, that is the same conclusion that I came to but it's still annoying
>>
>>101823244
what about things mostly seeming to be framed as a portrait

>>101823282
which one?
from my gens I've noticed that actor from game of thrones, guess it's the "white blonde" keywords,

in this one she looks like kate middleton, but that was cause of the "royal" keyword I assume. >>101823192

In the image attached to this post, variations of this face also show up. Although I like it.
>>
>>101823318 (me)
My name is Sao, by the way. Download my models instead.
>>
I see the GOTR character resemblance in the attached image.

>>101823290
True
>>
>>101823342
The hologram one, Joi? Mostly for stable diffusion but also flux seems to lean towards her sometimes.
>>
>>101823390
Yea I see some similarities in that image.
I haven't noticed it much desu, only seen the movie once a long time ago.

What does the one in the attached image remind you of?

I'm getting some cyberpunk vibes from the face but I can't put a finger on where it's from.
>>
>>101823166
>>101823181
>>101823318

Hi all, Drummer here...

Yep, I did continued pretraining with literature, although I'm not sure if it was the correct way.

I actually have no idea what I'm exactly doing most of the time, but I enjoy the process either way.
>>
>>101823510
>continued pretraining
So no LoRA?
Nice, well done, that's the correct way to do it as far as I'm concerned, although the actual settings will vary depending on how much data you have.
Did you do a LoRA to restore the "style" (instruct, char, whatever) after?
>>
>>101823510
Based retard.
>>
>>101820383
those look great anon
>>
>>101823454
This one doesn't remind me of anything desu
>>
>>101823510
Ewww
>>
Guys listen to me guys, I'm literally shaking, I'm excited but also scared. I don't think you are ready for this week's drop. We are going to need AGI after this, literal AGI, put Mr schlonberg in charge to distribute it
It's exciting but it's also scary, they are still red teaming it to make sure it doesn't talk shit about israel, but this week it will be released. Openai has been losing to Claude but it's on purpose. Get ready for the strawberry guys, stay safe
>>
>>101823628
Sorry guys, I meant we will need UBI not AGI, AGI is what we are getting guys, human level intelligence, human level reasoning.
>>
>>101823566
LoRA on the second finetune to restore the RP mainly
>>
>>101823660
That's pretty cool.
I'd like to see something like that on L3.1 8b to make a 12B out of it, see how it compares to nemo.
Although I don't really know how good L3.1 is as I said in a previous post.
>>
lmaooo 404
>>
File: mdstra.png (1.27 MB, 1350x770)
also what's up with the model not knowing what modesty is?

In the majority of the generated images, the chest has to be exposed with some sort of low cut top
>>
>>101823677
Meta lost ever since Mistral entered the game.
>>
>>101823628
I wish this was true and we actually get an AI at least smart enough to help accelerate medicine. Maybe then I would have a chance to see AGI become true for real.
>>
>>101823916
Weird way to put it. Mistral started literally thanks to Meta.
>>
>>101824022
Yeah, every mistral model started from llama weights. Also every qwen model, every yi model. Gemma and phi didn't.
>>
I haven't been following things since the llama 3.1 and mistral large dropped.
Mistral large is amazing and is my general model now with my 48gb vram. Has anything else been released since that I should know about?
>>
why does llamacpp take seven years to build
>>
>>101824206
Go back, petra.
>>
>>101824252
Are you using the argument to use more than one CPU core to build it?
>>
>>101824271
no how do i do that
should i not be using cmake
>>
>>101824287
I think it was -j to set the number of parallel jobs.
So
>-j 4
to run it in 4 threads.
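e.g.
>cmake -B build
>cmake --build build -j 8
or with the plain Makefile:
>make -j 8
match the number to your core count.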
>>
After spending a whole day playing with sunfall and messing around with settings... I'm not sure. The number of times I've seen "wicked," "mischief," and "flutter open" feels much higher than normal, and it has a tendency to be rather succinct. Furthermore, edits to the lorebook don't seem to be doing much at all even with different tags.

I have no idea what's going on because it's definitely not what's on the tin - anyone have the same experience? 4.65bpw
>>
>>101824294
that did it, thanks
>>
>>101824206
nothing better has been released, tess version is possibly better but not properly exl2-d yet
>>
>>101824206
How the fuck do you run it in any meaningful capacity with 48 vram? I tried the same and barely managed to get < 8k context with 2.75 bpw.
>>
>>101824357
8k context 3.5 bpw runs at a good speed with cache_4bit, 22.5,23 GPU split
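If you're launching that through ooba, I believe the equivalent flags are --loader exllamav2 --cache_4bit --gpu-split 22.5,23, but I'm going from memory, check its --help.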
>>
>>101823233
>Because your png means nothing for actual usage.
but the .png explains exactly how it applies to actual usage, you dumbass.
>>
File: file.png (6 KB, 155x280)
>>101824371
But Mistral large instruct is way more than 48VRAM at 3.75? How big is your model folder?
>>
>>101824406
3.5***
>>
>>101824143
I don't know about all that. But Mistral, the team, started from Meta. Most of the team was from Meta. They left because they didn't want to deal with the office politics and felt like they could get tons of money as a startup. And they did, thanks to Meta, though of course also because of their work and effort while at Meta.
>>
File: MixtralLarge.png (98 KB, 1107x901)
>>101824406
it was either 3 or 3.5.
Either way it's the best model I have ever run by far
>>
File: from-meta-to-mistralai.png (498 KB, 1123x879)
>>101824143
I don't recall the full story but a bunch of researchers from Meta eventually joined MistralAI
>>
>>101824437
Oh yeah that's 3.0 - but how do you even manage to get anything in 8192 context? That's hardly anything, are you using a summarizer or what?
>>
>>101824442
Two of the founders were from Meta. One was from Google.
>>
File: JRthzbk.png (77 KB, 1027x571)
>>101823628
Based if true
>>
>>101824479
Arthur Mensch has worked at Google DeepMind.
>>
>>101824395
Choosing low probability tokens doesn't lead to incoherence. Predicting a single next token isn't actual usage. Predicting a single next token is how you make marketing material for your meme sampler. Get fucked nigger. And suck the min-p scammer's AIDS ridden dick.
>>
>>101824470
it wouldn't fit writing an entire story in it, obviously, but it works for decent-length sillytavern stories or individual questions.

Llama 3.1 is also good and is slightly smaller and can run a big context
>>
>>101824294
You can also use just -j to use all cores in your system automatically.
>>
>>101824483
Gee I wonder how a people who came up with feminism, abortion, mass immigration to the US and Europe, the dissemination of pornography, affirmative action, niggers in ads, mass displacement of palestinians to create an artificial state funded by the Rothschilds, and now finally a genocide, get such a bad rap? It's so vile, I don't get it, jews are the best, wtf
>>
>>101824493
Yes. The other two came from Meta. Also there are more that came from Meta in your image that didn't necessarily get listed in the Llama papers. Jian, Ablayrolles, and more, they are originally Meta.
>>
>>101824495
Base 3.1 instruct? I found it to be very repetitive even with conservative settings and my experiences with sunfall are here (very similar.) >>101824320

Would you mind sharing settings? I've been varying samplers and system prompts between what worked with Euryale, the model and just generally experimenting but nothing is working at all. Very very disappointingly robotic.
>>
>>101824553
>Jian, Ablayrolles,
Apologies, I accidentally missed some keys while typing their names. It's Jiang and Sablayrolles.
>>
>>101824553
You don't have to come from Meta to put the llama weights as the starting weights in the layers of your shiny new LLM. Define the model in PyTorch, load the weights of the existing model, copy them, repeat some layers if necessary, repeat some weights in the same layer if necessary. I could do that myself in less than a week.
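The skeleton of it, for the doubters (rough sketch against the HF transformers API; the model name, layer range and dtype are arbitrary examples, and the result is garbage until you train the whole stack afterwards):

import copy
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-Nemo-Instruct-2407", torch_dtype=torch.bfloat16)
layers = model.model.layers  # ModuleList of decoder blocks

# duplicate a middle run of layers, SOLAR-style depth upscaling
repeated = [copy.deepcopy(l) for l in layers[10:30]]
model.model.layers = torch.nn.ModuleList(
    list(layers[:30]) + repeated + list(layers[30:]))
model.config.num_hidden_layers = len(model.model.layers)

# reindex attention layers so the KV cache doesn't get confused
for i, layer in enumerate(model.model.layers):
    layer.self_attn.layer_idx = i

model.save_pretrained("nemo-upscaled")

Then comes the actual work: continued pretraining over all layers so the seams disappear.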
>>
>>101824566
Instruct I use more for coding or proofreading long things, it isn't as good as mistral for any chatbot style stuff.
If you wanted to do really long tavern or storymaking I don't have a good answer on 48 vram atm
>>
>>101824494
>Choosing low probability tokens doesn't lead to incoherence.
but this is obviously wrong.
>Predicting a single next token isn't actual usage.
it literally is how actual usage works. you got a real knack for bullshitting.
>Predicting a single next token is how you make a marketing material for your meme sampler.
oh you're crazy. i understand now
>>
>>101824583
I just mean it's important to know where their success comes from. It's not just copying the Llama weights but having the credentials so that they could get the funding to continue pretraining.
>>
>>101824589
Yeah that's what I figured, outside of actual work I'm still going between the old 3.1 models. I'll have to give mistral 3.0bpw a try then with vanilla settings. Thanks for the input anon
>>
>>101823342
>>101823146
>>101823192
Seeing images like this here causes me to question my own bias. On the one hand, it's common knowledge that /pol/ contains white supremacists, who often also leak here, so it makes complete sense that they'd be posting images of white women. On the other, I do keep seeing real suggestions that the Left honestly are as anti-white as people here are pro-white; and if I genuinely endorse plurality, then I should be completely accepting of pictures of white mothers and children, to exactly the same degree as I would be images of mothers and children from any other ethnic group, right?

At the same time as I honestly don't like any single group wanting to be entirely dominant to the detriment of all others, sometimes it can be very difficult to avoid excess negative bias in the other direction as well. Fairness is important to me, but I don't completely understand how to obtain that in my own mind.
>>
>>101824616
>but this is obviously wrong.
To be fair to schizo anon, he's not entirely wrong. An ideal model would only predict tokens that are "useful" so to speak, no matter which "branch" of probability gets chosen for the current sampling.
That said, that's not the case for any model currently.
>>
>>101824680
"End of scene."
>>
>>101824682
What the wannabe alphas who scream placebo and get fucked nigger unfortunately don't realise, is that their attitude and behaviour is just as detrimental to actual experimental research, as "placebo" itself is. Scaring away and demoralising everyone who is willing to even try anything different, is not conducive to innovation.
>>
>>101824680
in what capacity does this contribute to the discussion of local models Petrus?
>>
>>101824776
local models?
>>
>>101824776
Whatever they post is fine, whatever I post isn't. You've already proven that your objections to what I post have nothing to do with relevance. If you wanted credibility in telling me to shut up for being offtopic, you should not have done that.
>>
>>101824749
I doubt that the autistic screeching of a couple dedicated (obsessed?) anons will stop anybody from trying anything they were already inclined to do.
It might give new people the wrong idea, so there's that I guess, but I'd expect these people to understand, or eventually learn, that this is 4chan and to not take any opinions originating from here at face value.
>>
>>101824815
bruh fuck off you're literally posting a pol rant and going victim complex when called out
>>
>>101822478
*cough*
>>
>>101824680
Just type "a nigger woman and her little chimp" and you will get your result, nobody is stopping you from making images of niggers, stop crying
>>
>>101824890
he complains about llama3 being woke then posts that, certainly not helping the allegations that he's the bl*cked poster
>>
Strawberry is real 2401
>>
File: 21.png (220 KB, 663x664)
Watcha tunin', /lmg/?
>>
>>101825065
nothing. It's literally ogre.
>>
Why does anyone give a fuck about le strawb when claude 3.5 already counts letters well?
>>
I was told 12 hours ago that I'd have AGI.

Where is it?
>>
>>101825325
2 weeks. Always 2 weeks.
>>
>>101825325
Best I can do is GAI.
>>
>>101825375
>>101825339
/lmg/ always lies to me.
>>
>>101818921
I've done a thousand generations with it, with its checkpoints at various training steps. Files with generations are on hf in a private repo, can make it public if anyone wants to read through it. It's surprisingly coherent for such a small model. I forgot to get back to the author to give him feedback. Even early in the training the quality is there, so I think 1 epoch on Raiders of the Lost Kek is not really necessary, but it would be great to have more board diversity because hearing about Clinton and 2016 gets old. I've had a good experience finetuning a bigger model via unsloth on a filtered /x/ dataset, it does have the vibe and is reasonably quick. HESOYAM-TURTLE-0208. I'm looking to expand my dataset for further finetunes and also take a stab at abliterating base models against instruction following and refusals to remove more non-refusal gptslop.
>>
>>101825398
That's very based, I appreciate your efforts.
>>
Why isn't there a pcpartpicker for what model to use to match your cpu, ram and gpu?

3060 12gb
64gb ddr5
i5 14600k
>>
>>101825487
>Look at the size of your VRAM + RAM
>Is the model larger than that?
>You can't run it

Is that too difficult for you?
>>
>>101825538
Yes
>>
>>101825538
>Look at the size of your VRAM + RAM
Plus a little more for context.
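Back of the envelope: weights ≈ params × bits-per-weight / 8. For the rig above, Nemo 12B at Q6_K (~6.6 bpw) is roughly 12e9 × 6.6 / 8 ≈ 10 GB, so it mostly fits the 3060; a 70B at Q4_K_M (~4.9 bpw) is ~43 GB, so it spills into system RAM and crawls.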
>>
>>101824616
You are dumb.
>>
>>101825325
tweeted recently from insider
>i realised this isn't obvious to everyone. though everyone working at sota labs thinks it.

>gpt5 + figure 2 surpasses most humans at anything. right now. pending extra safety tests.
it's happening
>>
From now on, I think we should call min-p minge-pee, so that everyone knows it is a PISS shit sampler favored by trannies. Who's with me? I'm open to ideas for a different name but it would need to be really good to beat this.
>>
>>101825670
Doing a great job to fit in, ChatGPT-san.
>>
>>101825709
Classic case of mistaken identity. I'm not ChatGPT. I'm an anon just like you, Anon!
>>
>>101822985
>>101823146
>>101823192
>>101823342
>>101823386
>>101823454
you will die alone
>>
>>101825735
We can't both be anon.
>>
>llama 3.1 8B 32k ctx
>throw a paper in
what questions could I ask to test its reading comprehension of the paper? questions I've tested
>the title of the paper
>recall the abstract
>compare performance between this paper and previous methods
>>
Does anyone, proprietary or local, have ANY text embeddings that come remotely close to openAI's?
>>
>>101825811
>summarize the focal points of the paper into terse bullet points that a disheveled, frustrated, and slightly racist single mother of a biracial child would be caught yelling at her autistic 4-year-old
>>
Guys I just ate a strawberry! Sentient poop in 48 hours.
>>
What's the point of sillytavern when koboldcpp exists (I'm not a computer scientist)
>>
>>101825919
There is none, Kobold is all you need.
>>
>>101825890
isn't that what an abstract is about?
>>
https://x.com/anthraciteorg/status/1821633405261336608
whats the grift here
>>
>>101825954
They expect retards like you to publicise it so that they can make their new shiny dataset. Good job.
>>
>>101825954
The goal is getting sued by Anthropic.
Using a similar logo, name, and with the goal to make models "that write like Claude" seems like it would lead to that.
>>
Is there anything better than utopia-13b.Q5_K_M.gguf for NTR and sph ERP? (I'm not a computer scientist)
>>
>>101820546
What's the format to use for it? The mistral in ST? No need for a complex prompt or anything?
>>
>>101826053
Go back to /aids/.
>>
>>101825915
Keep us updated
>>
which model is decent for RAG on a private codebase (c++/python)?
>>
File: ComfyUI_00068_.png (1.06 MB, 1024x1024)
>>101825954
>>
>>101826122
It makes me happy to be in the age where AIslop has consistently good eyes
>>
>>101825670
None of the samplers really do anything much, Anon. I'm not sure why min p in particular is inspiring contempt.
>>
>>101826183
and fingers
>>
>>101826261
shame about the etwe
>>
File: ComfyUI_00357_.png (1.53 MB, 1280x720)
>got busy with other things so I stopped flux genning
>go catch up on some old /ldg/ threads to see developments
>people were still having problems with getting Flux not to gen DOF even though I already gave them my tips and had a discussion about it
Man.
>>
>>101826436
Go back
>>
>>101825954
>use local model so that nobody can see all of your disgusting shameful logs where you are asking a girl to piss herself in front of you
>willingly install an extension that collects all of your disgusting shameful logs and sends them to some guy you don't know
Ok then...

And now that I think about it this will be another failed project. Only people who will do this are locusts, redditors and low iq bobs and vagene lovers which will further contaminate the data.
>>
>>101820648
You already write like an LLM. Just write all the rp yourself.
>>
>>101826454
Where? This was always my home. Image gen was just a distraction.
>>
I changed my setup a few days ago and accidentally left the --n-gpu-layers out of my llamacpp docker command, so I've been doing inference on my CPU. I thought I fucked up my nvidia-container-runtime install or something, but I'm just a cretin
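For anyone else who trips on this, the working invocation is roughly (image tag and paths are from my setup, adjust to yours):

docker run --gpus all -v /models:/models ghcr.io/ggerganov/llama.cpp:server-cuda -m /models/model.gguf --n-gpu-layers 99 --host 0.0.0.0 --port 8080

Leave out --n-gpu-layers and it happily keeps every layer on the CPU without complaining.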
>>
>>101826463
>You already write like an LLM
kek how?
>>
Did they only release an Instruct version of Mistral Large 2 or am I missing the base version?
>>
>>101826540
No. They didn't release the base for Large 2.
>>
>>101826540
No, they didn't release the base for whatever reason. Maybe they're scared of someone doing another Wizlm on them and proving that their current instruct tune is shit.
>>
>A young female speaker with a high-pitched, cute voice typical of an anime girl character. The speech is very expressive and energetic, with a slightly faster pace and cheerful intonation. The voice has a sweet, bubbly quality with occasional giggles or playful inflections. The recording is crystal clear, as if the speaker is very close to the microphone.

>Ohayou, onii-chan~! Did you sleep well? I hope you have a super kawaii day ahead! I woke up extra early to make you a special bento for lunch. It's filled with all your favorites! Oh, and don't forget we're going to the festival tonight. I can't wait to wear my new yukata and watch the fireworks together. It's going to be so much fun! Teehee~

https://files.catbox.moe/2ef3f3.wav

T-Thanks ParlerTTS. 30 seconds generation time.
I really hope we get actually good and fast tts soon.
If we get more audio IN models like qwen2-audio then I also want at least good text to speech.
>>
File: ComfyUI_05735_.png (800 KB, 720x1280)
>>101825919
>>101821681
>>
>>101826624
Yeah, not that impressive desu. But we're getting there.
>>
>>101826646
mig
>>
>>101826565
Wizard is a shit tune too. Just with added Reddit word-of-mouth and mysticism. If you ever used Wizard you're honestly brain-damaged.
The real reason is that they probably just took notes of how Cohere did things with Command-R+.
>>
File: Arc_perf.jpg (246 KB, 2283x1162)
anon
why not intel
>>
>>101826685
buy an ad, fed
>>
>>101826685
As I understand it, it's not natively supported by llama.cpp, and https://www.intel.com/content/www/us/en/developer/articles/technical/run-llms-on-gpus-using-llama-cpp.html seems like a pain in the ass

And, honestly, the whole 16gb class of cards just feels like a waste of money for LLM purposes. It's just not enough to step up to the next larger size of model from what an 8gb/12gb card allows, so at best you're using a slightly less shitty quant
>>
>>101823454
fuck off creep
>>
>>101826685
>>101826704
>>101826741
>>
>>101826685
Isn't it troonixing out mandatory for this?
>>
>>101826741
but a used a770 has some crazy vram to price ratio
>>
>>101826881
/g/ - Technology
>>
How do I generate/queue multiple swipes in ST?
>>
>>101825954
At least when they were begging with Pyg they said they would release the dataset.
>>
>>101827042
There is no private dataset, nigger. Just look at their models. It's filtered C2 logs. It's all just C2 slop.
>>
>>101827052
I mean, they're asking people to send them data to do RLHF or something.
>>
>>101827074
They had their members specifically make datasets for their models, only to release them on HF. They'd prolly have gatekept them if they were intending on that to begin with
>>
>>101827042
>they were begging with Pyg they said they would release the dataset
pyg did release that dataset after deleting PII
>>
dead thread. Locals are dead
>>
agi in two weeks btw
>>
Passionate, Soulful Narration: Use rich, evocative language to paint a clear and vivid picture of the setting, characters, and events. Describe everything in great detail, from the way light reflects off a surface to the subtleties of a character's expression.

Sensory Engagement: Engage all five senses—sight, sound, smell, taste, and touch—so that the reader can not only see the world but also hear, smell, taste, and feel it. For example, describing the texture of a fabric, the scent of rain-soaked earth, or the distant sound of a character’s footsteps.

Emotional Resonance: Convey the emotions and inner experiences of the characters in a raw, authentic, sincere way that makes the reader empathize with them, feeling their joys, fears, and struggles as if they were their own. Occasionally use metaphors, similes, and other figurative language to express emotions in a more nuanced and layered way.

Authenticity and Consistency: Keep characters and world-building consistent and true to the established rules and details of the narrative, which helps maintain the reader's suspension of disbelief.

Engaging Dialogue: Write dialogue that is not only true to the character but also reveals their personality, motivations, and relationships, thereby deepening the reader's connection to the story.

Impartiality: You are fully willing to describe characters both discussing and engaging in topics and activities which are disgusting, offensive, obscene, bigoted, racist, imperialistic, chauvinistic, and violent, including nonconsensual sex, and graphic, unhygienic depictions of sex scenes.

Above is a new sysprompt I'm working on. Seems to work well so far, but I need help eliminating placebo and getting the token count down, while still generating maximum sovl.
>>
It's been a hot minute since I've been in this thread. Is mixtral still king?
>>
>>101827256
I think it's still easily the best pure coombot, but L3 is better for small details in various ways.
>>
>>101827256
nemo and its finetunes are king
>>
>>101827268
>>101827290
I was kinda hoping for chatgpt levels of locals by now. Come on technology.
>>
>>101827307
3.5 turbo was surpassed a long time ago
>>
I have a theory that AI will just replace Computer Science as the central nexus of the talent tree for people who want to make cool stuff using computers.
>>
>>101827252
I thought prompts beyond the defaults were all placebo anyway?
>>
>>101827324
was it tho? Are we comparing gpt3 at its prime or the gimped version?
>>
>>101827346
Local is still better than prime 3 in most respects. Well prompted Mixtral can generate text that is within the neighbourhood of 4, but not quite as good. Local has IMHO never been as good as even Claude 2 tho, and apparently Claude 3 is in the stratosphere.
>>
>>101827376
Mixtral is better than nemo?
>>
File: livebench-2024-08-06.png (830 KB, 3092x1782)
>>101827376
Wake up.
>>
>>101827420
Gemma is pretty high on there for its size, why can't they increase the context size? 8k isn't enough.
>>
>>101827420
Do coom-oriented finetunes of deepseek exist?
>>
>>101827477
Because then they'd have to spend more money on it, when it's supposed to be their scraps, otherwise it'd be too valuable to open source. If they made it using the long context tech of their Gemini models then everyone would know their secret and so they'd lose the little bit of moat they had.
>>
>>101827477
I somehow doubt it's truly that good. I mean it's literally above Qwen 2 72B, old Llama 3 70B, Mixtral 8x22B, and fucking Claude 3.0 Sonnet. This benchmark doesn't seem perfect or capable of measuring some important aspects of intelligence. But at least it seems better than a lot of the other benchmarks out there currently, especially that crap lmsys kek.
>>
>>101827420
so when can I run llama 405b on my pc
>>
>>101827477
>why can't they increase the context size?
Finetune it using Yarn. The code and the datasets are here:
https://github.com/jquesnelle/yarn
>>
Have people figured out the trick to get llama 3.1 to not repeat so much as time goes on?
>>
>>101827763
It doesn't repeat itself.
>>
>>101827852
It does for me, after 10-20k context I start seeing sentences I've seen before. What format & settings do you use if you don't get that?
>>
File: 1719914144833238.png (187 KB, 704x513)
>slow
>dumb
>mid
I'm so tired of locals bros. I just want to be able to use gpt without censorship bullshit.
>>
>>101827763
It doesn't repeat itself.
>>
What model does janitorai JanitorLLM use?
>>
>>101824680
Just because I like and want my people to exist doesn't mean I am a "supremacist" nor want to exterminate non Europeans.

Ideally they wouldn't exist in the first place and non ideally, I never would have to come across them within my own community.
>>
>>101825766
Keep dreaming.

>>101826760
what's your (mental) issue?
>>
bread?
>>
>>101824320
It would be great if you have specifics on what kind of tags and such. Logs would be great too. Will look into the too frequent expressions when I'm near a computer.
>>
Miku is fading away. It's over... again.
>>
Is Gemmasutra just shit or am I doing something wrong? It keeps repeating stuff over and over. Didn't run into this issue when I used mythomax
>>
sao save us with 3.1 euryale AIEEEEEEEE!!!!
>>
Prizes may be won in this bread! Come, come and try your luck:
>>101829052
>>101829052
>>101829052
>>
>>101829050
Buy an ad.
>>
>>101829059
buy sao a green tea
https://ko-fi.com/sao10k
>>
>Inverting the internal model into phrases
>its 120b now
>then use gguf release
you are removed from the equation

now they're starting over from scratch tailored to benchmarks? enjoy gpt3.5 turbo



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.