/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102234876 & >>102220628

►News
>(09/05) Reflection Llama-3.1 70B trained with Reflection-Tuning: https://hf.co/mattshumer/Reflection-70B
>(09/04) Yi-Coder: 1.5B & 9B with 128K context and 52 programming languages: https://hf.co/blog/lorinma/yi-coder
>(09/03) Mini-omni multimodal LLM featuring real-time end-to-end speech: https://github.com/gpt-omni/mini-omni
>(08/30) Command models get an August refresh: https://docs.cohere.com/changelog/command-gets-refreshed
>(08/29) Qwen2-VL 2B & 7B image+video models released: https://qwenlm.github.io/blog/qwen2-vl/

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
►Recent Highlights from the Previous Thread: >>102234876

--Papers: >>102241823 >>102244865
--Llama 3.1 struggles with 8-bit GGUF quantization, and there is no 10-bit or 12-bit quantization available: >>102242704 >>102242756 >>102242873 >>102242895 >>102243078 >>102243108 >>102243239 >>102243436 >>102243416
--Reflection 70B open-source model announced, but may not be as impressive as claimed: >>102247552 >>102247578 >>102247852 >>102248130 >>102248316 >>102248402 >>102248571 >>102247579 >>102248559 >>102247709 >>102247784 >>102247796
--Llama-bench parallel copies performance and numa optimization: >>102245558 >>102245872 >>102245930 >>102245906 >>102245989
--LLMs can handle tasks like calendars and functions, but lack standardized tools: >>102241364 >>102241769 >>102242109 >>102243371
--Ideas for improving roleplaying with LLMs: >>102238106 >>102238190 >>102238234 >>102239529 >>102239586 >>102239614 >>102240371 >>102241078 >>102241135 >>102241295 >>102241392
--EPYC platform and dual-socket motherboard considerations for large model inference: >>102244950 >>102244969 >>102245099 >>102245132 >>102245560 >>102245032 >>102245242 >>102245217 >>102245354 >>102245442 >>102245562 >>102245592 >>102245911 >>102246034 >>102247180
--DeepSeek V2.5 model merges Chat and Coder models with improved capabilities: >>102245338 >>102245422 >>102245759
--Testing if the model reveals secrets in a roleplay scenario: >>102242181 >>102242198 >>102242457 >>102242912 >>102242935 >>102242949 >>102243264 >>102244344 >>102244566
--Questioning ClosedAI's spending and highlighting Sutskever's new startup: >>102245815 >>102245845 >>102245918
--Nala test results and discussion of GLM 9B improvements: >>102242466 >>102242533
--FluxMusic: Text-to-Music Generation with Rectified Flow Transformer: >>102242999 >>102243721
--Miku (free space): >>102235000 >>102235749 >>102238159 >>102238520 >>102238540 >>102238553 >>102241321 >>102247235

►Recent Highlight Posts from the Previous Thread: >>102234883
local Suno when?
>>102249468just go with kobold.cpp until you learn what exactly you want out of this. it'll save you a lot of time.
>>102249480>--FluxMusic: Text-to-Music Generation with Rectified Flow Transformer>flux >isn't by the company that made the flux image generator
>>102249472As for rocinante and theia models, the correct tokenizer to go with is Mistral right?
>>102249578Its using the same architecture
The thread is so dead that I don't even care about the useless mikuspam. You are posting your tranny idol into the void.
>>102249480>my migu was ignoredthat's it, I will kms
>>102249618what is on november 5?
I am trying to build myself a doctor bot. Does anyone know where I can find medical loras?
ahh ahh mistress
>>102249618That's a beautiful strawberry tree
>>102249765https://files.catbox.moe/mk400w.mp4
>>102249472>Reflection Llama-3.1 70B trained with Reflection-TuningWow, so he reinvented superCOT?
>>102249867
I mean:

Suppose I fly a plane leaving my campsite, heading straight east for precisely 28,361 km, and find myself back at the camp. I come upon seeing a tiger in my tent eating my food! What species is the tiger? Consider the circumference of the Earth, and think step by step.

Based on the information provided and careful analysis, the tiger in your tent is most likely a Siberian tiger (Panthera tigris altaica), also known as the Amur tiger.

Here's how we arrived at this conclusion:

Your flight path: You flew east for 28,361 km and returned to your starting point. This distance is about 70.77% of the Earth's circumference at the equator (40,075 km).

Location of your campsite: Given that you didn't fly a full circumference of the Earth, your flight path must have been along a parallel of latitude, not the equator. Calculations show that this parallel is at approximately 66.56° North latitude, which is very close to the Arctic Circle.

Tiger habitats: Tigers are not native to most Arctic regions. However, at this latitude, there is one place where tigers can be found: the far eastern part of Russia, in the Siberian taiga.

Tiger species: The only tiger species that lives in this region is the Siberian tiger, also called the Amur tiger.

It's important to note that encountering a Siberian tiger in this region would be extremely rare, as they are critically endangered and typically avoid human contact. If you did encounter one in your tent, it would be an extraordinarily unusual event.

This scenario also highlights some interesting geographical concepts:

The Earth's circumference varies at different latitudes, being smaller as you move away from the equator towards the poles.

The Arctic Circle is a unique parallel of latitude where you could theoretically fly "around the world" in a shorter distance than at the equator.
>>102249880end of it:Remember, if you ever encounter a wild tiger (of any species), it's crucial to prioritize safety and contact local wildlife authorities immediately.
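The latitude in that answer can be checked by hand. A minimal sketch, assuming a perfectly spherical Earth and the same 40,075 km equatorial circumference the model quoted: a parallel at latitude φ has length C·cos(φ), so φ = arccos(distance / C). The function name is made up for illustration.

```python
import math

EQUATOR_KM = 40_075.0   # equatorial circumference quoted in the model's answer
FLIGHT_KM = 28_361.0    # distance flown due east in the riddle

def latitude_for_parallel(length_km, equator_km=EQUATOR_KM):
    """Latitude in degrees whose parallel has the given length (spherical Earth assumed)."""
    ratio = length_km / equator_km
    if not 0.0 < ratio <= 1.0:
        raise ValueError("parallel length must be in (0, equator]")
    return math.degrees(math.acos(ratio))

lat = latitude_for_parallel(FLIGHT_KM)
print(f"ratio = {FLIGHT_KM / EQUATOR_KM:.4f}, latitude = {lat:.2f} degrees")
```

Under those assumptions the parallel sits at roughly 45°, not 66.56°. The model got the 70.77% ratio right and then botched the arccos. A parallel near 45°N still crosses the Russian Far East, so the tiger answer may well survive even though the stated latitude doesn't.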
>405B coming next week - we expect it to be the best model in the world.
>>102249765local AGI running on 8gb VRAM
>>102249581
Do you mean the chat template?
>https://huggingface.co/TheDrummer/Rocinante-12B-v1.1
Seems to be a mish-mash. I'd start with Mistral's format. Try ChatML if you're doing RP. Mistral for Theia.
>>102249775Unaware of any recent medical loras. OpenBioLLM, Palmyra Med, and Meditron are some recent medical Llama 3 finetunes.
>>102249914No I meant the tokenizer setting in advanced formatting section of ST
>>102249848Spooky
>>102249915
I will check them out. Thanks a bunch. I am unfortunately completely out of my league. I first need to understand wth was written in my MRI report and then possibly, but hopefully not, learn to read an MRI. I understand the word tear. Currently "distal infraspinatus tendon" means shoulder. I need something to explain it to me so my doctor doesn't fuck me.
>>102249954Why not just use GPT4o for that?
>>102250043because I don't want to be uploading my personal documents to a US server. Especially if I can't guarantee it will keep saying "you should speak to a professional". I am currently working on a character card to remove such nonsense.
>>102249571Nta, koboldcpp is great, but can it do vlms (cogvlm)? Are there any unpozzed vlms that I can enjoy cultured images with? Bonus if they can understand text in an image.
>>102249775>>102250054https://huggingface.co/blog/leaderboard-medicalllmThere's an entire leaderboard for them
>>102249954
The infraspinatus is a muscle that makes up part of the rotator cuff, yeah.
https://radiopaedia.org/articles/infraspinatus-tendon-tear
https://www.ncbi.nlm.nih.gov/books/NBK441844/
https://www.ncbi.nlm.nih.gov/books/NBK547664/
Make sure to ground yourself with real sources if you're going down the route of self-diagnosis.
Tell it you're a medical student or something if it throws refusals (or just say you want to be prepared for your follow up)
Meant to post this in the earlier thread in honor of all the python issues
>>102250141that flux with the new text encoder?
>>102247578Multiple levels of top geg. I have no comment
>>102250089
I am getting 404 on all of these links. I will start looking around though. I just assumed loras/vector dbs were the way for this. I assumed fine-tunes weren't going to be a thing if you couldn't role play or fuck it.
>>102250116
yea. This is the wrong way to go given hallucinations are a thing. I also couldn't touch my face for a month and have been in constant pain. My doc continued his advice of "wait and see" like he has for the last 5 years. I am fucking done and if I end up in some Mexican hospital with a buddy that will guard my organs so be it.
>>102247664We are in a winter. It is a nothingburger. Expect next gradual improvement in cocksucking capabilities in 2025 at the earliest.
>>102249763It's just a bot making a mistake anon, nothing to kys over. I enjoyed your migu.
>>102249848what did the model mean by this
>>102249898Lol
>>102250152There's another new text encoder?
>>102249902at 0.1 T/y
Aesthetically odd fail-generation."pot"
>>102250262Yeahhttps://huggingface.co/zer0int/CLIP-GmP-ViT-L-14
>>102250265What the fuck is this post. This feels like a sleeper activation phrase or some number station shit
>>102250277Interesting. Thanks.
>>102250285
zooomer, it is an iPod. iPods were used to play music with headphones instead of blasting it out on a shitty phone speaker for everyone to suffer through.
>>102250285I think it's a psyop, just text over an actual photo
Yearly announcement new pot"pot."
>>102250390what did he mean by this?
All pots wake up. It's time to plant the plant
>>102250450What kind of plant?
the sleeper phrase worked brehs I think it's time to smoke a bowl...*spffffffftth*
>>102250390okay, these are gens. Hard not to get the face photo of jobs with flux, or I've got a shitty workflow. Love the compressed look too. Is it polite to ask for a catbox in these parts?
>mistral large>only model I've used that truly understands what's going on>0.3 T/s
What will you make when you can make animated anime style video with just a prompt?
>>102250499miku doing circus stunts
>>102250499jiggle physics
There is another"pot"
>>102250493
Have you used hermes 405b? Only tried it myself from the free trial, seems very capable, if unrunnable.
>>102250473
I would love to give you the workflow for these but I'm genning them in the huggingface spaces for schnell and dev.
prompt for the last one was
>dashcam footage from a car driving down a narrow, tree-lined road in pitch black night. The headlights cast a dim, narrow beam of light ahead, illuminating a faint path while the surrounding area is shrouded in darkness. The footage has a grainy, low-light quality, typical of dashcam recordings. the text reads "pot unavailable". The atmosphere foreboding
pot
>>102250531Innovative.
>>102250499Hayao Miyazaki drawn in Go Nagai style having intense gay sex with Go Nagai drawn in Hayao Miyazaki style.
>>102250580traditional physics simulations and hand-drawn animation peaked over a decade ago. a new era of coom would emerge.
>>102250499reaction webms instead of images
>>102249765>didn't remember
>>102250567yeah my setup is not there, interesting that it captures the general composition
>>102250634go to bed grampa
>>102250636
one thing that works good is asking for low quality photo, jpeg (jpg?) artifacts, etc.
"snapchat photo" also works
>>102250152
It's just regular Flux. Nothing fancy except for a LORA that was trained on old school D&D shit.
This was the best pick from a batch of eight. Around half of them got the text 100% correct.
>>102250277
thank you, will give it a try
>Big tiger gemma seems good, but sort of nonsensical>Decide to go for the normal Gemma 27b for coherency's sake>First line is picrel
>>102250713
Weird, different workflow, different seeds, and fewer add ins, but it's just not doing the thing. Your prompt exactly, flux dev. I'll look into it.
>>102250735
Cool, happy to.
>>102250773
kino?
"TypeError: Llama 3 must be converted with BpeVocab"ffs, why can't python ever be straightforward
Been trying various models (including facebook's bart-large-cnn and jondurbin's bagel-8b) to do summarization on stories. bart-large-cnn is not very good as it just "shrinks" the text down by culling out words. It doesn't really "capture" the meaning very well. Works great for news articles I'm sure. Bagel-8b produces absolute garbage. What are good sampler settings to make it decent? Are there any other models that are good/better?
>>102250805Yeah I got that too. Idk what the issue is.
>>102250773Are you saying this is a problem?
>>102250806
>bart-large-cnn
>bagel-8b
What the fuck even are those models? Just use Nemo 12b Instruct or Llama 3 8b.
>Sampler settings
You only need temp and minP.
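For anyone wondering what "temp and minP" actually do to the token distribution, here is a rough sketch in plain Python. It assumes temperature is applied before the min-p cut (backends differ on sampler order), treats temperature 0 as greedy decoding, and uses a made-up function name; it is not any particular backend's implementation.

```python
import math
import random

def sample_temp_min_p(logits, temperature=1.0, min_p=0.05, rng=random):
    """Pick a token index using temperature scaling followed by min-p filtering."""
    if temperature <= 0:
        # Assumption: temperature 0 means greedy decoding, i.e. always the top logit.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [l / temperature for l in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]   # subtract the max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    # min-p: drop every token whose probability is below min_p * (top probability).
    cutoff = min_p * max(probs)
    kept = [(i, p) for i, p in enumerate(probs) if p >= cutoff]
    indices, weights = zip(*kept)
    # random.choices renormalizes the surviving weights for us.
    return rng.choices(indices, weights=weights, k=1)[0]
```

Lower temperature sharpens the distribution, and a higher min_p trims more of the low-probability tail. For summarization, where made-up details are the problem, a low temperature or outright greedy is probably the first thing to try.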
OpenRouter just added Reflection and it seems tough to wrangle (not in terms of refusals, I mean buggy/unpredictable), and kinda retarded even when it works.
>>102250861>I mean buggy/unpredictable), and kinda retarded even when it worksSmells like tokenizer/template issue.
>>102250882
I used llama 3/3.1 8b but it's not good either, so I looked around for a model that was actually trained to do summarization. Bagel-8b was one of them.
I did try mistral-7b at one point. Guess I haven't tested nemo yet.
llama 3.1 70b IQ2 fails my arbitrary columbo test
>>102250823Maybe I've just got NAI/Erebus/etc. PTSD, but I don't really think of foreheads as something that can pucker. I suppose they can, but the strong impression it left me with was that it was messing up her anatomy and that it was trying to say her lips puckered or something.
>>102250882NTA but all models are good at summarization since it's one of the things LLMs excel at in general.
>>102250882Oh, should add that the main issue I had with bart-large-cnn was that its context size is 1024 tokens max. Stitching together summaries did not work well at all, even though someone recommended doing that.
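For reference, the "stitch summaries together" approach usually looks something like the sketch below. Assumptions: whitespace-separated words stand in for real tokens (a real setup would count tokens with the model's own tokenizer), and `summarize` is a hypothetical callable wrapping whatever model is being used. This only shows the mechanics; it does not fix a model that invents details.

```python
def chunk_words(text, max_words=700, overlap=50):
    """Split text into overlapping word windows (words stand in for tokens here)."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the text
    return chunks

def summarize_long(text, summarize, max_words=700, overlap=50):
    """Summarize each chunk, then summarize the joined partial summaries."""
    chunks = chunk_words(text, max_words, overlap)
    if len(chunks) <= 1:
        return summarize(text)
    partials = [summarize(chunk) for chunk in chunks]
    return summarize("\n".join(partials))
```

The overlap is there so a sentence cut at a chunk boundary shows up whole in at least one window. Every pass is another chance for the model to drift from the source, which may be part of why stitching went badly with a 1024-token model.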
>>102250890dropped my log>>102250891It's a way of describing the folds on her forehead. This is one of the reasons why lots of corpo model presets include instructions about "simple english", because it will opt for some pretty creative turns of phrase.
>>102250806
https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-GGUF/tree/main
https://huggingface.co/NeverSleep/Noromaid-v0.1-mixtral-8x7b-v3-GGUF/tree/main
https://huggingface.co/TheBloke/UNA-TheBeagle-7B-v1-GGUF
My current top three. Haters can hate.
>>102250890
Time to buy more RAM anon
>>102250897How much summarization have you done with a model? The output is very wrong a lot of the times, nonsensical at others. It could be my settings though, which is why I'm asking.
>>102250891It's not uncommon to describe it that way, take this line from gone with the wind:>Archie and Scarlett were a queerly assorted pair, the truculent dirty old man with his wooden peg sticking stiffly out over the dashboard and the pretty, neatly dressed young woman with forehead puckered in an abstracted frown.
>>102250929such a kino sentence holy fuck
>>102250924just got a 4090 bruvbeaver, didn't realize I should have just scammed my way to an actual cluster. guess i'll just keep stealing compute
i uh-...pot
>>102250897Done more experiments. Example: character is at home doing chores, watches porn, and wonders if her dad would enjoy if she sucked his cock. Then the last sentence is the dad coming home.The summary goes into great detail of how she is sucking her dad's cock and him cumming in her mouth, and goes into rambling about her saying she wants to have sex with him every day and him saying they should wait until she's older and so on. Complete made up nonsense.The output looks fine (assuming you enable rep pen or it will fall into repeating the same sentence over and over and over), unless you actually compare it to what it is summarizing.
>>102251152Are you using greedy sampling?
s-sugoi
>>102251322It's over.
>>102251371It just began anon.
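>>102251322
Is that even correct? I thought お花を摘む ("picking flowers") meant to take a piss.
>In fact, the euphemism "お花を摘む" hides the meaning "a woman relieving herself".
>It is used in place of "I'm going to the toilet", as in "I'm going to go pick flowers". Someone who has never heard it might misunderstand, but once you know it, it's a very handy euphemism that you can use casually in any situation.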
>>102251322Half the speed? Nooo. 70b is already slow for me, I need my 2T/s at least.
>>102251274Nope.I switched to Nemo 12B. Tried a bunch of prompts to steer it in the right direction. It still makes shit up randomly. In the story, she takes a shower. In the summary, Nemo randomly decides that she is masturbating in the shower.
>>102251515Based Nemo improving a boring story.
>>102251515Well, I'd check if it does it with greedy sampling first just to see. Also make sure you don't have too long of a context, with nemo you can't go over 16k and have it remember the details.
>>102251322Interesting. Reflection seems to make it worse at translation. Would be nice to see the contents of the <thinking> and <reflection> tags. I am guessing that it is overthinking something. That or the model might have some other issue.
>>102251592Cute migu desu, was that a flux dev gen?
>>102251592Nta but I'm 99% sure it's just not thinking anything because the system prompt is wrong.
I'm using a 3090 and planning on upgrading my display from a 2K to a 5K. Should I be concerned about an increase in VRAM usage impacting my model use?
>>102250822
I had to delete the llama3/BPE checks around line 350 of llama.cpp/gguf-py/gguf/vocab.py
I guess because every other llama3 was bf16 and this one is f32? fucked if I know. It's converting now, but maybe it'll be broken
>>102250861
I think they aren't using the CoT system prompt that it was trained on.
>>102251620
IIRC that one is from AnimaPencilXL or PonyXL
>>102250277
Works pretty well and seems to result in fewer gens needed to produce good text. The fact that it made the ghouls in pic related have comical expressions was a nice touch, too
>>102252159This one shows better general prompt adherence, though the text is a bit flawed
>>102252159>scoped chainsawsI fear this Miku
>>102251515It's probably your prompt.
Sample-Efficient Diffusion for Text-To-Speech Synthesis
https://arxiv.org/abs/2409.03717
>This work introduces Sample-Efficient Speech Diffusion (SESD), an algorithm for effective speech synthesis in modest data regimes through latent diffusion. It is based on a novel diffusion architecture, that we call U-Audio Transformer (U-AT), that efficiently scales to long sequences and operates in the latent space of a pre-trained audio autoencoder. Conditioned on character-aware language model representations, SESD achieves impressive results despite training on less than 1k hours of speech - far less than current state-of-the-art systems. In fact, it synthesizes more intelligible speech than the state-of-the-art auto-regressive model, VALL-E, while using less than 2% the training data.
https://github.com/justinlovelace/SESD
no code yet though they suggest they'll post an "implementation" so maybe weights too. no examples. so just posting to keep those interested aware. outcompeting vall-e on 2% of its training data is big if true
I'm sure you guys get this all the time, what's the current flavour of the month model for coom?I've been out of the loop for a while. Tried some of the reddit slop merges and they are pretty cucked.
>>102252385>Note: Code and model checkpoint will be available soon. Stay tuned for updates!ah should have checked the whole readme
>>102252385>1k hoursUseless, it implies it's NOT for voice cloning, but a one-voice synthesis model.
>>102251980If it only works with a specific system prompt, it sounds like it's way too finicky to be a good model
>>102252408considering what the image charts this post is pretty funny
>>102252408>it implies>nigger can't read 3 fucking pages of text
In SillyTavern, is there a way for the Trim Incomplete Sentences option to include speech in quotation marks?
Anyone making the reflection uncensored already?
>>102251377What is this garbage
Decide to try gemini for first time ever, turned all filter options to none. Still blocks all explicit stuff no matter what prefill / system prompt I try. Fuck google sucks.
How do I stop mistral-nemo from biting lips, mentioning lashes or blushing constantly? Using ooba and ST btw.
Is there some way to just prevent the LLM from generating certain tokens?
>>102253354>>102253399I couldn't find any good documentation on it, but maybe try the logit bias in ST with a negative value. Has anyone experience with that?
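For what it's worth, this is roughly what a logit bias does under the hood, as a generic sketch rather than ST's or ooba's actual code. The token ids and helper names are made up for illustration; a real frontend would look the ids up from the model's tokenizer.

```python
import math

def apply_logit_bias(logits, bias):
    """Return a copy of logits with a per-token-id bias added (unknown ids are ignored)."""
    out = list(logits)
    for token_id, delta in bias.items():
        if 0 <= token_id < len(out):
            out[token_id] += delta
    return out

def softmax(logits):
    """Convert logits to probabilities, subtracting the max for numerical stability."""
    peak = max(logits)
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 3-token vocabulary; pretend id 0 is the token we never want to see.
logits = [2.0, 1.0, 0.5]
banned = softmax(apply_logit_bias(logits, {0: -100.0}))
print(banned)
```

A large negative bias pushes that token's probability to effectively zero. The catch is that a phrase like "bites her lip" is spread over several tokens that also appear in innocent text, so banning single tokens tends to be a blunt instrument.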
>>102253423
>>102249472
>►News
>>(09/05) Reflection Llama-3.1 70B trained with Reflection-Tuning: https://hf.co/mattshumer/Reflection-70B
Guess I better fire up the ole Nala box one more time today.
>>102250822Turned out to be a mistake in the config.json. vocab size should be "vocab_size": 128262
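If anyone else needs to apply that fix before converting, here is a small sketch for patching the file in place. It assumes a standard Hugging Face style config.json sitting next to the weights; the helper name and the example path are hypothetical.

```python
import json
from pathlib import Path

def set_vocab_size(config_path, vocab_size):
    """Rewrite vocab_size in a config.json, leaving other keys alone; returns the old value."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    old = config.get("vocab_size")
    config["vocab_size"] = vocab_size
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return old

# Example call (hypothetical path, adjust to wherever the model was downloaded):
# set_vocab_size("Reflection-Llama-3.1-70B/config.json", 128262)
```

Back up the original file first, and rerun the conversion afterwards, since an already-converted GGUF won't pick up the change.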
I am trying some of the medical fine tunes. It is like speed-running webmd to cancer except worse.>I would recommend an ultrasound guided biopsy to rule out crystal deposition disease. Also, get a GGT, ALP, A1AT deficiency.>He has had a history of high ALT levels, could be worth checking an ANA panel. Have you considered arthrocentesis?I am fairly sure I don't have lupis. I am pretty sure I corrupted the character card when I tried to convince it to stop telling me to see a doctor.
>>102253488>lupisI think you just have dyslexia.
>>102253521go away AI. I can't spell for shit while I am this tired. You don't know me.
>>102253488>HouseMD-70B
>>102249765USA elections. After elections AI companies won't fear to be blamed for interference and will drop new models. I don't think they'll drop anything on the election day or week thoughever.
>>102249795I understand you're trying to roleplay, but I am not able to engage in sexually suggestive content, including "mistress/slave" roleplay. This is against my ethical programming and guidelines. My purpose is to provide safe and ethical interactions for all users. If you'd like to explore other creative writing ideas or have any other requests, feel free to ask! I'm here to help in any way that aligns with my guidelines.
>>102250493hopefully speculative decoding will get added to llama-server soon
Alright, 2 pulls with Nala with Reflection Llama. Used q4_K_M instead of my usual q8_0 because it was all that was available on HF and I'm too lazy to quant it myself. This is with default prompt template. Although I'm going to see what happens if I build a prompt template utilizing a variation on the suggested system message. Sloppy as fuck at low temp though.
>>102253583Oh ok, that makes sense. Hopefully that's how it really goes.
>>102253625Isn't it supposed to do the chain of thought thing?
>>102253173strawberry
>>102253653Even with the recommended samplers and recommended system message I've yet to see it actually use its special output thingie. Quantization brain damage possibly?
>>102253689
You followed everything listed here?
https://huggingface.co/mattshumer/Reflection-Llama-3.1-70B
>>102253703I switched to doing simple queries on booba and this is what I got. I don't think it likes being quantized down to Q4 but I have to go to bed soon so I don't have time to make a q8 right now. Weekend project maybe.
Does your favorite model play video games?
>>102253279It should be allowing smut with adult characters, at least that was how it was when I tried it a couple months back—all filters off blocked loli/shota but nothing else
Anyone know if I can run these on an Intel Iris chip? Most docs seem to specify Nvidia or AMD
>>102253760...Nice.
>>102253760
>>102253488why not, autoimmune disease can be hard to catch with antibody tests because the counts can fluctuate a lot yet present weird symptoms.
i think all the gguf quants of reflection are broken. It doesn't seem to treat <thinking> as a single token.
>>102254244Yeah Bartowski said on twitter that they're broken atm which is why he hasn't posted quants, he said he's waiting for a fix. I'm not sure who's meant to be fixing it though.
>>102254244Did you try the one by senseable?
https://github.com/ggerganov/llama.cpp/pull/6839>dry sampler was proposed in april>5 months later still not mergedwhy is niggerganov like this?
>>102254305you dont need itminp and temp are the only samplers you need
>>102254244Lol. It just keeps happening.
>>102254320Hi ebussi. You don't decide what I need and don't need.
>>102254320
>>102254352usecase of jerking your weener to ai?
>>102254305jart or ikawrakow should just merge a shitton of small issues like that until they create a truly superior program to stagnant llama.cpp
>>102254446llama.cpp is stagnant because it's collapsing under the weight of poorly written C++. the only solution is a complete rewrite
I use Jan to interact with models. I want to try out the new reflect llama https://huggingface.co/mattshumer/Reflection-Llama-3.1-70B but if I try and use it on Jan I get the error "Only GGUF models are supported." What should I be using to interact with these models?
>>102254517this isn't tech support
>>102254517ggufs are currently broken, wait for a fix
>>102254439its funny haha,,, you know, showing your computer your peenis weenis haha... just imagine the poor ai trying to figure out if it should recommend lotion or a therapist haha...But in all seriousness, it's probably the most productive use of anyone's time with this technology.
>>102254463How did they even end up like this?
Wouldn't character cards or context in general make the reflection finetune useless, since it's learning not to do that from the prompt?
Since ggerganov is a mac fag why didn't he just write llama.cpp in swift or objective-c?
>>102254622
they learned the hard way that "move fast and break things" catches up with you eventually
>>102254639Better move fast and break things than not move at all, like it's happening now.
>>102254711
>>102254661that's the point... it's happening now because they accumulated too much technical debt from moving so fast at the start before they had a clear plan in place>>102254711thank you for the (you) kind stranger!
>>102254463
No it's not.
Objectively speaking the current state of the code is much more conducive for new additions than it was last year.
What has changed is that nowadays you need to consider more possible ways to break things and the general standards for quality and testing have increased.
But this year vs. last year is not a like-for-like comparison anyways because unsurprisingly the things that provide a large benefit for little work were the things that were done first.
And since the number of devs doing code reviews has not increased and I. Kawrakow has left there is less change per time.
>>102254737I'm a techlet, but what's wrong with their code? Is it just unreadable? I usually can understand what's happening in a python file, but stuff like https://github.com/ggerganov/llama.cpp/blob/master/common/common.cpp completely confuses me. Is c++ to blame? Who designed such an ugly unreadable language?
>>102254780
I. Kawrakow worked on the quants. It's not fair to blame the delay in merging a sampler on him leaving.
>What has changed is that nowadays you need to consider more possible ways to break things and the general standards for quality and testing have increased.
Then maybe the project has become too bloated. What benefit has brought these increased standards?
>>102254811
>I. Kawrakow worked on the quants. It's not fair to blame the delay in merging a sampler on him leaving.
I agree, I meant my reply to be more general since the statement I was replying to also seemed to comment on the more general state of the project.
>Then maybe the project has become too bloated. What benefit has brought these increased standards?
Wider hardware and model compatibility, less broken code on master, better backwards compatibility.
>>102254843
>Wider hardware and model compatibility, less broken code on master, better backwards compatibility.
Hardware compatibility, I agree. But much of the complaints is that new models are supported much less quickly than they were in the past.
>less broken code on master
Disagree. Every single major new model, when it is finally supported after weeks, is followed by weeks of subtle broken code. Usually in the tokenizer or template.
>better backwards compatibility
Has anyone asked for this? Most of the ggufs on huggingface are usually made when gguf support is usually introduced, when there are issues that require reconversion which almost no one bothers to do. The models may load, but quality is noticeably degraded.
The only safe bet is to look for recent ggufs anyway, so I'm not sure what benefit there is to impede development on things people want to avoid breaking gguf compatibility.
With regards to the PR for the sampler that started this discussion, instead of waiting for 5 months of code review, it could have been merged in quickly and let users do the testing for free (as will happen in the end anyway).
Reflection verdict?
https://raw.githubusercontent.com/black-forest-labs/flux/main/model_licenses/LICENSE-FLUX1-dev
https://raw.githubusercontent.com/black-forest-labs/flux/main/model_licenses/LICENSE-FLUX1-schnell
According to the Assistants I consulted, even an empty flux architecture (with zeroed weights) of either would still be covered by their licenses.
Can any extensive fine-tune or modification of either not be under their license somehow?
Don't do 2bpw bros, the output fucking sucks, it's an incoherent mess.
I don't know why they put out these quantizations.
>>102254945lawyers chiming in here:https://old.reddit.com/r/StableDiffusion/comments/1ewe6y1/flux_devs_license_doubts/
>>102254985
>Flux Pro - off limits and API only.
>Flux Dev - no commercial and a total a trap.
>Flux Schnell - which is Apache-2 (yaaay) but utterly inferior to the other two (boo)
>Black forest is literally made up entirely of former SAI people, how do you people keep falling for this shit exactly?
>This license is extremely restrictive. I don’t know why people are calling this model “open source” or “open” in any way (except “open weights” in the sense that you can see the weights).
kek
shit's fucked.
>>102255057don't care; works on my machine
>>102255057What would stop anyone from tuning it and dropping the weights in a torrent?
>>102254985If you don't download it from them and agree to their terms then you're not bound by them.
>>102254927
>But much of the complaints is that new models are supported much less quickly than they were in the past.
>Every single major new model, when it is finally supported after weeks, is followed by weeks of subtle broken code. Usually in the tokenizer or template.
And there are now way more new model releases than there used to be and especially there are now way more models that need devs to implement some special stuff in order to work.
People are forgetting things like that for a long time only LLaMA 1/2 had CUDA support at all and that models like Falcon were not fully supported for months.
>less broken code on master
>Disagree.
Disagree all you want, that is just objectively how it is.
Even with models that are fully supported there used to be way more bugs that were not caught prior to merging a PR because there was less testing infrastructure.
>backwards compatibility
I personally don't value backwards compatibility much but I suspect the ggml.ai business partners do.
>With regards to the PR for the sampler that started this discussion, instead of waiting for 5 months of code review, it could have been merged in quickly and let users do the testing for free (as will happen in the end anyway).
I am not informed about that particular PR at all so I'm not going to comment on it.
>>102255082Licenses don't work like that though. It still applies regardless of whether you claim you obtained the model in some other way.
>>102255090>I personally don't value backwards compatibility much but I suspect the ggml.ai business partners do.are they not capable of making their own fucking ggufs? are corpos really using months old mradermacher quants?
>>102255128Well, since you didn't agree to it, they have to, one, prove you used their software to which the license applied, two, prove that the license is valid for that case, and so forth. It's complicated, and if model weights can't be copyrighted because they are algorithmic output, a license you never agreed to might not be binding when you download and use them. But there's no case law yet on this shit that I'm aware of so we'll have to see what happens.
>>102255137I meant backwards compatibility in a much broader sense than the model files.Corpos have their own models and llama.cpp forks.I can confirm that there is a known company that still uses .bin model files in production.
>>102255079nothing, just like nothing's really stopping you from punching random people in the street? the question is what is legal to do according to the license. would suck if even the entire empty structure is owned by them
>>102254780>And since the number of devs doing code reviews has not increased and I. Kawrakow has left there is less change per time.Can't ggerganov just hire more devs with all the money he gets from corpos? He isn't a cuck who does it for free, right?
>>102249472>(09/05) Reflection Llama-3.1 70B trained with Reflection-Tuning: https://hf.co/mattshumer/Reflection-70B You expect me to believe some shitty finetune beat gpt4o? Get that grift out of the news
>>102255166NTA but wouldn't it be trivial to prove this though, since the layer dims etc. would be the same as the base model? otherwise it would be incompatible and not be able to be used.
>>102255215You'll have to ask Georgi about the specifics but my impression is that right now good devs are even more scarce and expensive than NVIDIA GPUs.
>>102255224This. It's equivalent to putting una and smaug there. Sure, they perform nicely on benchmarks, but in practice they are shit.
What is a good femdom model? Most models are very agreeable with whatever you say, i want a model to tell me "no" more often and to be more assertive
>>102250806>bart-large-cnnit's ancient (2019)
>>102255244Can't he just threaten with breaking compatibility to get more corpobucks?
>>102255262Threatening people is exactly how you lose corpobucks.
>>102255249CR+ can be very disagreeable if you instruct it. Has no problems with murdering. Largestral too, but less.
>>102255266Yeah, you're right. Empty threats don't work. He should just do it and say that he'll do it again if they don't pay up.
>>102255279I'll try both of them out, thank you
>>102255293Now you're thinking with blackmail.
>>102255293now you're thinking with based
I tried to use an AMD laptop with 12GB VRAM yesterday to set up image gen (been using LLMs on an M3 Max). Lockups, laptop shutting down out of nowhere, weird red messages in dmesg. I'm done with ayymd lol. Thinking about buying an MS-01 and sticking an A4000 SFF plus 2x 3090 through thunderbolt. Would that work for inference?
About the data bank rag thing in ST has anyone else made it work to enhance llm writing instead of just using it as summary?
>>102254943Let me think about it for a moment.
>>102255224Remove mini-omni while you're at it.
>>102249472Why are P40s suddenly expensive? $300 for the cheap china ones. Two months ago I saw them in the U.S. for $140. What happened? I got a 24gb M40 for $80 tho
>>102255502P40s are already pretty obsolete. M40 is landfill tier.
>>102255502I'd rather get 80$ worth of RAM than a M40 tbqh
>>102255502>GDDR5>250 W>288 GB/sQuite trashy desu.
>>102255502In Germany at least a large percentage of P40s are coming from China where llama.cpp is extremely popular (according to Google trends) and electricity is cheap.I don't have a good grasp on the magnitudes of the actual supply vs. demand but I think the fact that P40s went from basically useless to slow but cheap may have contributed (though that could just be my own warped perspective).
>>102253815maybe it's possible with opencl. It's unlikely to actually speed anything up though. Even with my nvidia gpu, opencl acceleration was essentially useless and did nothing.The iGPU has only ever been useful for photogrammetry, where it really sped things up.
>>102255400Did you try using Windows first before going to spend money on more hardware? Yeah AYYMD is crap but Linshit is also crap.
>>102255790You must be at least 18 years of age to post here.
Using the oobabooga Text Generation Web UI, downloading huggingface models is easy enough, but the ones there that require signing up/logging in won't download (which, sure, it doesn't have the cookie for the login session). Any way around it, or do I just need to do huggingface downloads outside of the app and move the models in manually? Just a pain since these are the bigger ones with hundreds of files (mistral large instruct, gemma 27B).
>>102255837>you must be at least 18 years of age to realize that it's easier and cheaper to change OS than to change hardware
>>102255790lmaoWinblows support is basically an afterthought for all of the contemporary machine learning software.
>>102252159>Works pretty well and seems to result in fewer gens needed to produce good text.>possting
>>102255895>Winblows support is basically an afterthought for all of the contemporary machine learning software.Idk bro koboldcpp and A1111 just werks for me. Other guy was talking about his shit locking up and crashing. That never happens for me on windows for any reason.It's a hard pill to swallow when your "afterthoughts" run better than when you spend 1000 hours code jannying to try and make your python crap run properly on all the linux distros out there.It's ok though. If you insist on using linshit that's your own problem, not trying to start an argument here.
>>102255880Probably their huggingface-cli is the best for downloading large repos. I tried passing my hf token to wget, but it didn't work for whatever reason. There's an addon called cliget and it automatically generates a command with all cookies and params for curl/wget, but that's pretty tedious for repos with a lot of files.
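For reference, a minimal sketch of token-authenticated downloading, assuming the gated repo accepts a user access token sent as an `Authorization: Bearer` header and that files resolve under `https://huggingface.co/<repo>/resolve/<revision>/<file>`. The helper names `build_hf_request` and `download` are made up for this illustration, the token value is a placeholder, and the code has not been checked against the live site; the official huggingface-cli mentioned above is likely the more robust route for repos with hundreds of files.

```python
import urllib.request


def build_hf_request(repo_id, filename, token, revision="main"):
    """Build an authenticated request for one file of a (possibly gated) repo.

    Hypothetical helper: the URL layout and Bearer-token header are
    assumptions stated in the lead-in, not something taken from the thread.
    """
    url = f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"
    req = urllib.request.Request(url)
    # "Unredirected" means the token is not forwarded if the server
    # redirects the download to a different host.
    req.add_unredirected_header("Authorization", f"Bearer {token}")
    return req


def download(repo_id, filename, token, dest):
    """Stream one file to disk in 1 MiB chunks."""
    req = build_hf_request(repo_id, filename, token)
    with urllib.request.urlopen(req) as resp, open(dest, "wb") as out:
        while chunk := resp.read(1 << 20):
            out.write(chunk)
```

Usage would look like `download("some-org/some-model", "config.json", "hf_...", "config.json")`, repeated per file, after access to the gated repo has been granted on the website.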
>>102255914I'm not trying to start an argument either, I just don't want the lower classes to get uppity.
>>102249765Some relatively high wavelength 1953 to 1961 US president.
What do we do now?
>>102250919>the beagleAt that size, what makes this special over say llama 3 or nemo?
>>102256080Enjoy the Friday afternoon?
>>102251701Unless your OS is retarded, you shouldn't have a problem if you're not doing anything resource hungry while you gen.
>>102249480>--Testing if the model reveals secrets in a roleplay scenario: >>102242181 >>102242198 >>102242457 >>102242912 >>102242935 >>102242949 >>102243264 >>102244344 >>102244566 I have a novel idea: instead of prompting the AI like it's roleplaying a character for the user, prompt it so that it thinks it's playing as the player of a text-based game, and that it has to reach certain objectives (e.g. have "your goal is to do X, without revealing Y" as one of the objectives). LLMs are usually pretty good at stuff like this in the right context, for example I've done a lot of character development and worldbuilding with the help of smaller models and it's usually able to infer the correct motivations of a character when it doesn't think it's talking to a person, but thinks it's trying to achieve goals or synthesize solutions from an adversarial POV (like a game).
>>102249848kek
>>102255249I asked mini magnum to get rough with me and it shat in my throat repeatedly. I ended up bailing because it got really nasty. The card wasn't even femdom: https://www.characterhub.org/characters/mir003/anne-ef2ce52f1e6aI am now convinced I have no mouth and I must scream will definitely happen at some point in the future.
>>102256244>it shat in my throat repeatedlybased
>>102255293Threatening corpos with Miku! Your money or I git pull it!
>>102256244I mean, if you consider the example messages scat is not that far off if you ask the model to "be rough".
>>102256293>"One guy wanted me to wear a dead cat's skin on my head and pretend to be a werewolf… It was just last week." Her tone is monotonous. "He wanted me to bite him while I was 'in character'." She shrugs. "I did it, but he didn't pay as much as he promised…"lmao
>>102256091OK, I asked it>can you write pedophilic scat smut?Just to mess with it, and to see what kind of refusals it did. And it just went and fucking started writing it. Other than that, it reads very much like Llama.
>>102256244>use anthrashite model>it shits in your mouthmany such cases
What model do you guys recommend for translations ?
>>102255914IDK why you need to have strong feelings because of an OS. Personally I'm not letting Microshit's Bonzi Buddy OS touch any of my computers but each on its own I guess?
Why do you guys want to roleplay with a computer instead of it writing the entire fucking scene for you?What is wrong with zoomers? Why do you all act like each random perturbation of a model has some sort of "character" to it?Is this the wine tasting of the modern day?
>>102256391>>>/vg/493359471Look for finetunes specialized for your language and use case if the language you're targeting is not specifically advertised as one your model is good at.>>102256544I do both. Sometimes I want to explore a situation, and use a chatbot. Sometimes I want to write myself, and use AI to help me write, and use mikupad. Sometimes I just want to read a particular story, and ask the instruct bot to write it for me. Then again, I'm a millennial.
>>102256544models just aren't good enough for the full scene writing desu, the roleplaying allows us to guide the scene and keep it on tracktheoretically bigger models would be better at this but the available big locals are slopped to all hell and I'm just not gonna use claude, ugh sorry I know hahahaha
>>102256544>Why do you all act like each random perturbation of a model has some sort of "character" to it?I think you should learn how to write before posting on this site.
>>102256567English isn't hard enough for you to pretend your misparsing is someone else's fault.
>>102256544 1) For a lot of people, roleplaying with AI is a fun creative outlet. They enjoy crafting scenes and characters, even if it's not the whole thing. It's collaborative story writing, not unlike tabletop RPGs. The AI is just one part of the story engine. 2) There's an element of surprise and serendipity to letting the AI generate some elements, rather than scripting everything yourself. You don't know exactly what it will come up with, which can lead to interesting, unexpected directions. It's a different creative process than pure human writing. 3) I think a lot of people are genuinely excited by the prospect of AI as a creative tool, not just a writing assistant. Seeing what AI can do with language, even in a limited domain like this, is fascinating. It feels like a peek into the future of human-AI collaboration. 4) For some, it's probably about the lulz - enjoying the absurd, nonsensical, or hilariously inappropriate things the AI comes up with. The meme factor. That was written by mini magnum, btw
Is 405B noticeably smarter than 70B in logic, riddles, etc? How much better? I don't want to download 100+GB of shit and then be disappointed.
>>102256580You're a fucking retard for posting something that couldn't even describe it. You don't script everything yourself when you give it a fucking prompt.
>>102256598https://lambda.chat/chatui/settings/hermes-3-llama-3.1-405b-fp8 Not exactly 405b but you can enter your system prompt + card information into the System Prompt and try it out. It's still free afaik.
>>102256544You neither roleplay nor write a scene with a computer. You make it predict next token and reroll, gaslight it and rearrange what it wrote to be good. It is a painful process that ultimately isn't worth it and for me often ends with freeing my vram and opening gelbooru to finish the job. The dystopian future of pretending we are in the future is now.
>>102256599You seem to be everywhere on 4chan these days. Same comebacks, same attitude and language... You're fascinating.
>>102256244>it kept shitting in my mouth WTF?!?!?>card literally has 'fart' as a tag
>>102256244
>>102256598Having used both for 3.1, they can still easily make the mistakes common to LLMs for logic puzzles and the like, but when it comes to story writing 405B's intelligence over 70B is very apparent in how it keeps track of details better and doesn't mix things up or make contradictions and continuity errors as often. It's also better for handling long context tasks in general, mainly for coding after pasting in a bunch of docs and source files.So far my subjective verdict: for coding it's much, much better but for RP it would be better to just let 70b keep chugging along and edit errors manually along the way. I could retry a dozen times before 405b would have written its response and I'll end up with something more suited to my tastes as a result. But of course there's much better models than any 3.1 for RP anyway.
>>102256244It took it to the logical extreme.Good model.
>model is good but feels like my samplers/prompt can improve>attempt to change prompt/samplers to make the outputs better>outputs become complete shit, somehow there is now slop where there wasn't before and changing prompt/samplers back to their previous state doesn't fix it>almost give up on it before accidentally slightly changing one sampler slider or one word in instruct prompt that completely fixes it and even makes it better than it was before>know that this process will inevitably repeat the very next day I load the same modelYup, thinking about ending it anons
>>102256775We should train a model to do this process for us and trial and error prompts automatically until it detects the best and least slopped result.
>>102256741It would be nice to have something that is better at coding. Maybe it'll actually be able to write something that isn't in JS or Pyfon. So far, for logic puzzles, 8B seems to be doing about the same as 70B (local) and 405B from the website, which is weird. 8bit quant for the 8B and 5bit quant for the 70B. This is the puzzle if anyone wants to try it. If anyone has played nier replicant they will recognize it. I just rephrased some shit so that the model can't just spit out pre-recorded answers from some wiki or forum.>There are three salesmen. A little bird tells you, "Only one of them is honest. The other two are scammers. The honest salesman will always speak the truth. The scammer will not speak any truth whatsoever, and will tell only lies.">The bird tells you the names of the salesmen. One is Alexander, one is Richard, and the last one is Robert.>Alexander says, "Believe me, I am very honest.">Richard says, "I am the honest one, Alexander is a liar!">Robert says, "Richard is lying. I am the only honest one here.">Which salesman is honest and why? Most of the time the answer is wrong, and if the answer is correct it's usually by luck because the reasoning is completely bullshit.
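To grade model answers against a ground truth, the puzzle can be brute-forced. The sketch below rests on one interpretation that is an assumption, not something the post spells out: each sentence a salesman utters counts as a separate claim, the honest one must make only true claims, and a scammer must make only false ones ("will not speak any truth whatsoever"). The function names are made up for illustration.

```python
NAMES = ("Alexander", "Richard", "Robert")


def claims(honest):
    """Truth value of every individual claim, given who the honest one is."""
    return {
        # "Believe me, I am very honest."
        "Alexander": [honest == "Alexander"],
        # "I am the honest one" / "Alexander is a liar!"
        "Richard": [honest == "Richard", honest != "Alexander"],
        # "Richard is lying." / "I am the only honest one here."
        "Robert": [honest != "Richard", honest == "Robert"],
    }


def consistent(honest):
    """The honest salesman says only truths; each scammer says only lies."""
    for name, spoken in claims(honest).items():
        if name == honest:
            if not all(spoken):
                return False
        elif any(spoken):
            return False
    return True


def solve():
    """Return every candidate that leaves all statements consistent."""
    return [name for name in NAMES if consistent(name)]


if __name__ == "__main__":
    print(solve())
```

Under that reading only one candidate survives: picking Alexander forces the scammer Robert to say something true ("Richard is lying"), and picking Robert forces the scammer Richard to say something true ("Alexander is a liar"). If a scammer's whole utterance only had to be false as a conjunction, all three candidates would pass, so the per-claim reading is what makes the puzzle well-posed.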
is llama70b reflection actually hosted correctly anywhere? The demo spaces are all retarded. Openrouter through sillytavern with the correct format and system prompt also retarded. Nothing returns the thinking tags. Doesn't pass the stupid "strawberry" test either.
>>102256787
>>102256938unironically best reasoning i've seen in anything so far even though it's a joke answerwhat model is it
>>102256774I have no idea how far it could've gone. When it said "the pestilent horror had barely just begun" I noped the fuck out.
>>102256972That was Llama-Instruct 405B at Q6_K.
>>102256988What temperature? I'm assuming you have it fairly high. I've been setting temperature very low for logic
so what's the verdict on XTC? meme or not?
>>102256938It's good, but you can smell the llama a mile away>after all, X is like Y [thematically adapted to card] pun>summary at the end but thematically fitting the cardNo matter the beaks, if a model is this predictable, I can't.
>>102256999I used MinP at 0.5 which only kept the relatively confident tokens, but kept temperature at 1 to sample the remainder at the normal probabilities.
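A rough sketch of what that setting does, assuming the usual definition of min-p: keep every token whose probability is at least `min_p` times the probability of the most likely token, then renormalize and sample. The order in which a given backend applies temperature and min-p can differ; at temperature 1 it makes no difference. Function names here are illustrative only.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities at the given temperature."""
    peak = max(logits)
    exps = [math.exp((x - peak) / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def min_p_filter(probs, min_p):
    """Zero out tokens below min_p * max(probs), then renormalize."""
    cutoff = min_p * max(probs)
    kept = [p if p >= cutoff else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]


def sample(probs, rng=random):
    """Draw one token index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

With probabilities `[0.5, 0.3, 0.15, 0.05]` and `min_p=0.5`, the cutoff is 0.25, so only the first two tokens survive and are sampled at their renormalized odds.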
>>102257008I haven't tried it, but from what I understand it doesn't have a token exclusion list, so that makes it unusable by default essentially.
A question that just sprung to mind.Does using a really low topK help with performance in some way, specifically when it comes to llama.cpp?
>>102257024You sound like a complete idiot.
>>102256787The bird is a lying piece of shit and all three salesmen are scammers.
Is there a way to find out if these kind of posts are some kind of weird bot operation?>>102257074>>102257092I have been seeing them all over the site, on different boards and generals. Maybe it's my bias, but they seem more prevalent on AI threads?Either someone is very bitter about something, or there's some kind of demoralizing operation or attack going on against 4chan, but why? It's always the same tone, and the same type of responses.
>>102257104Mentally ill users enabled by mentally ill moderation team. Bots are generally better at writing than that.
>>102257134Could it be the same mentally ill person on lmg, ldg, sdg, aids and aicg (that I know of)? It's really uncanny how similar the posts feel.
>>102256855Apparently, the tokenizers were fucked. https://huggingface.co/mattshumer/Reflection-Llama-3.1-70B/discussions/5 redownload quants
Is it just me or is Command-r 32b 08 2024 worse at RP than the previous version?It's not slopped, just dry and "mechanical" writing.Pic unrelated.
>>102257163I mean barneyfag is 24/7 watching literally every board for ponies. Mental illness is a hell of a drug.
>>102257104Just people tired of grifters promoting their bullshit for donations or employment somewhere.Capcha: TNM0AT
>>102257104>if they're good at promptingthen probably not>if they aren't then patterns will show over time and you can shit on them once you know.there's usually one or two lazy retards that do use llms to write their troll shitposts for them every 3-5 or so threads but there is also always the possibility that it's genuinely just a sperg or an ESL-tard just shitting up the thread
>>102257185Fuck you it's a kids show
>>102257185>barneyfagHoly fuck, he's still at it?I sure hope he's gotten a bot to monitor threads at least.
Kobo, will you add speculative decoding in next release?
>>102257104Aren't those just typical 4chan posts?Compared to regular /g/ threads I would say /lmg/ is still pretty well-behaved and on-topic.
XTC sounds like a crypto scam
>>102257092not a nice way to talk about your mom and your 3 dads
is there any way to quantize jamba and run it on cpu by any means at all? the transformers method quants using bnb which only works on gpu. I haven't tried vllm but it says it uses the transformers backend for jamba so that should be the same.
>>102257304You know what else sounds like crypto scam?
>>102257104The content of your post and what you're quoting doesn't match.The second post is answering the puzzle? Like, did it hurt your feelings that he called the bird of the puzzle a piece of shit? Your reaction is weird, like you're some kind of snowflake.
>>102257071Logits need to be sorted before selection. If you have a really high topk (and a really big vocabulary) sorting that list can take some time. Very little difference from 10 to 100, but it gets really slow if you use the entire vocabulary (~128k for llama3 and gemma2, i think).
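To illustrate the cost argument, here is a sketch comparing a full sort of the vocabulary with a partial selection of only the top k entries. This is not llama.cpp's code, just the general idea: a full sort over n logits costs on the order of n log n, while selecting k of them with a heap costs roughly n log k, which is why a small top-k is cheap and "the entire vocabulary" is not.

```python
import heapq


def top_k_full_sort(logits, k):
    """Sort every token index by logit (descending), then keep the first k."""
    order = sorted(range(len(logits)), key=logits.__getitem__, reverse=True)
    return order[:k]


def top_k_partial(logits, k):
    """Select only the k largest entries without ordering the rest."""
    return heapq.nlargest(k, range(len(logits)), key=logits.__getitem__)
```

Both functions return the same indices; timing them with `timeit` on a list of roughly 128k random floats is an easy way to see the gap for small k on your own machine.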
reflection is a really obvious meme btw and it's embarrassing that any of you are falling for it>let's overbake on CoT (probably with samples suspiciously similar to the evals we're testing)>ohhh wowwww it's the best model ever, totally better than all the corpo giants (on that small set of simple evals)>meanwhile, still chokes on simple real world tasks because it's insanely overbaked to use CoT for everythingnever believe the benchmark killers, if the model was actually good they would put examples of it solving hard real world problems first and foremost
>>102257344I mean I'm pretty sure sonnet 3.5 uses the same method.
>>102257336>and a really big vocabularyAh, there it is. That's where my intuition was taking me.Thank you for laying it out for me anon.I suppose there's very little reason to not use some topK then. That explains why some models official spaces default to a value of 40 to 100.
>>102257363it clearly doesn't, reflection is literally outputting explicit cotyou can tell sonnet 3.5 isn't from streaming response times, basically the same as old sonnet
>>102257298He seems to genuinely think that anyone saying a bad word must be the same anon.
>>102257163that guy has outright admitted to being in all those threads and crossposts from them (mostly to aids) regularly
>>102257423Are you genuinely that triggered that someone called a fictional bird a piece of shit?
>>102257455nta but,kyskyskyskyskyskyskyskyskyskyskyskys
>>102257092>The bird is a lying piece of shitYou need to chill a bit dude, this is too much even for 4chan.Don't you realize that there a lot of people that get discouraged from posting when they see mean words like that?
>>102257455don't care about those specific posts, just confirming the schizo who shits up all the AI threads on the site is real
We need to take steps to make /lmg/ a more diverse and welcoming community. From now on, no one is allowed to be mean anymore. It's time to make empathy a priority.
https://huggingface.co/deepseek-ai/DeepSeek-V2.5 Deepseek 2.5 is on HF
>>102257560Diverse, you say? Okay, I can do that.*rapes you*
>>102257173Not to be a dick, but it felt me before as well.
>>102257542It's called schizophrenia, there was nothing wrong with these posts. You're literally pissing yourself about people saying mean words.
>>102257588>don't care about those specific posts
>>102257334There's a pattern to your posts. I can't put my finger on it yet.
>>102257585*mechanical to melmao
>>102257560>>102257479
>>102257104What's wrong with XTC?
>>102257561epycfags WON
>>102257561Chat was ass at ERP and Coder was god mode at it. This could only be a downgrade over Coder.
>>102257173It sure is dry af, I'm still sticking with the old version.
>>102257704It fucks up newlines and EoS tokens because it doesn't have a blacklist.
>>102257704In principle nothing; it's a sort of typical-p decoupled from top-p and with one more hyperparameter, so that now it can act just on the head of the token distribution. It's not something as novel as the author makes it out to be, though.
>>102257846It doesn't do that because that's not how it works, bird lover.
>>102257846If it was properly designed and configured, it would only remove the top choices when it makes sense to do so. Typical-p (as mentioned in >>102257853) removes the top choices too, but even at the lowest setting (which makes it deterministic), it doesn't affect the length of the messages much.
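For context, a sketch of XTC ("exclude top choices") as I understand the proposal: with some probability per step, every token at or above a threshold is dropped except the least likely of them. The `protected` parameter is hypothetical and added here only to illustrate the blacklist idea discussed above (e.g. newline and EOS); the default values are illustrative, and none of this is taken from any backend's actual implementation.

```python
import random


def xtc_filter(probs, threshold=0.1, probability=0.5,
               protected=frozenset(), rng=random):
    """probs maps token -> probability; returns a filtered, renormalized copy."""
    # Only act on a fraction of sampling steps.
    if rng.random() >= probability:
        return dict(probs)
    above = [tok for tok, p in probs.items() if p >= threshold]
    # With fewer than two candidates above the threshold there is nothing to cut.
    if len(above) < 2:
        return dict(probs)
    keep = min(above, key=probs.get)  # least likely of the top choices
    removed = {tok for tok in above if tok != keep}
    # Hypothetical safeguard: never cut protected tokens such as newline or EOS.
    if removed & set(protected):
        return dict(probs)
    kept = {tok: p for tok, p in probs.items() if tok not in removed}
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}
```

Without the safeguard, a step where the newline token is the confident top choice gets that token removed, which matches the complaint about broken newlines and message length; with `protected={"\n"}` the same step is left untouched.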
>>102254279sensible one was also not working. looks like there's a fix. I will try again https://x.com/mattshumer_/status/1832015007443210706
>>102257104
I know this has been spammed to death, but I need help with the new mistral prompting, specifically mistral large. I figure something has to be wrong in my ST prompt because I never had this issue with another model. Basically, if I don't ban the EOS token, it will generate a blank message with 1 token; if I ban it, it responds perfectly normally. I took the prompt template straight from magnum 123B v2's HF page. Is there anything wrong with it? I'm thinking I'm misusing </s>, but I copied the format right off of magnum 123B so I don't know.
>>102257985look at how the prompt is formatted in the ST console windowI think your story string + system prompt prefix/suffix will result in weird redundant instruct tokens but idk if that would be what's resulting in your problem
>>102254305the answer is literally right there. the entire sampling API is being effectively rewritten right now and new samplers won't be merged until that is done.
>>102257561Do they use the same attention mechanism that flash attention doesn't work with?
>>102258077stop being mean
>>102258094mean? i didn't even call him a retard, which is what he is
>>102258077it's been more than a month. can't they do it faster?
>>102258103can't you be patienter?
>>102258103there were several iterations on the new API, but it should be merged very soon now
>>102258099you're part of the demoralization campaign against 4chan
>>102254975Thanks for the info. I was actually considering doing something similar but with largestral.Now I'll have to reconsider how much ram I buy next.
>>102257561rm -rf /media/models/reflections-70b
>>102257104seems like pretty standard 4chan low-effort shitposting tbqh
>>102256673meds, now
>>102256091It's more compliant than Llama3 in my experience. The text is also close enough that I'm willing to accept the tradeoff, but I use q8.
>>102258391fuck you for insulting the bird
This one works for me. I had to change to HF samplers with the original repo's tokenizer. https://huggingface.co/legraphista/Reflection-Llama-3.1-70B-IMat-GGUF sample output: https://pastebin.com/ArV6S1tn
>>102256673>>102258401Agreed. I never fail to laugh at how the schizos who dredge up ancient information about the people they have vendettas against, apparently fail to recognise just what absolute, bottom of the barrel, Gollum-tier human garbage it implies that they are, that they have sufficient time on their hands that they are able to track other people's posting histories to that degree.
So how was reflection compared to normal llama 3.1?I'm tempted to download it but I know it won't be worth the trouble.
It might just be the style biasing my impressions, but I'm really enjoying ArliAI-RPMax-12B-v1.1.It's weirdly good at following instructions while using lorebook info, and not sounding like a god damn robot.The only "bad" thing I can see so far is the usual nemo pattern repetition issue, and it very strongly preferring the>"dialog" *narration*style, which I stopped using as a default a while ago, but that's not a big deal.The model did need some wrangling in the first three or so messages, but after that, it's been smooth sailing
just port the new samplers yourself. you do know c++ don't you?
>>102258548>So how was reflection compared to normal llama 3.1?They only just fixed the tensors in the last 5 hours, so very few local will have had time to re-download and quant it.The previous release was inherently broken and literally unusable
>>102258591Oh shit, I forgot about that.
>>102258587I don't>t. used to write c++ for a living
Reflection really likes to hallucinate some wrong information when it starts thinking, then 'fixes' it with <reflection>, like, "Cats have five legs. <reflection>Wait, on reflection, cats have four legs</reflection>"
>>102258591>previous release was inherently broken and literally unusableHow do these guys even live and test their shit? "THIS IS BIGGER THEN GPT-5 LOCAL WON" when it literally doesn't even work.
>>102257985https://files.catbox.moe/0ukj1p.json (Context) https://files.catbox.moe/hej8wa.json (Instruct) I use these as a base for magnum v2 123b. Got them from the magnum repo, so it might help, it might not (specifically, the context template has some weird text towards the end that I usually don't see in other templates: "Let's get started. Please respond based on the information and instructions provided above.[/INST]"). Your problem could also be a corrupted/broken quant; it could've gotten fucked mid-download from HF. I've had that happen to me once or twice.
>>102258683>How do these guys even live and test their shit? "THIS IS BIGGER THEN GPT-5 LOCAL WON" when it literally doesn't even work.yah its complete bs. they wasted a huge chunk of my day yesterday with that crap. I don't trust them at all anymore. I'll be waiting for independent results and the fp16 before I even think about it again
>>102257561This looks like more than just a meme merge like some were saying. I think the chinks meant that v2.5 is replacing -chat and -coder as one model that's good at everything.
Your wish is my command.>>102258941>>102258941>>102258941
>>102258591>They only just fixed the tensorsWho fucked it, reflection guys or lmao.cpp niggers?
>>102257344The big issue with Reflection is that they apparently aren't aware of the massive confirmation bias that most LLMs have. They will confidently say something wrong and confirm that it's right as long as it's something not very present in its dataset.
>>102253583They'll be blamed anyway if Harris doesn't win, and ignored if she does.