/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101337910 & >>101328074

►News
>(07/10) Anole, based on Chameleon, for interleaved image-text generation: https://hf.co/GAIR/Anole-7b-v0.1
>(07/07) Support for glm3 and glm4 merged into llama.cpp: https://github.com/ggerganov/llama.cpp/pull/8031
>(07/02) Japanese LLaMA-based model pre-trained on 2T tokens: https://hf.co/cyberagent/calm3-22b-chat
>(06/28) Inference support for Gemma 2 merged: https://github.com/ggerganov/llama.cpp/pull/8156
>(06/27) Meta announces LLM Compiler, based on Code Llama, for code optimization and disassembly: https://go.fb.me/tdd3dw

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
►Recent Highlights from the Previous Thread: >>101337910

--Optimal Format for Storing Bitnet Weights: 8-Bit Integers vs Packed Parameters: >>101338865 >>101338912 >>101338929 >>101339006 >>101339019 >>101339044 >>101339262 >>101339361
--The struggle of quantizing Gemma 27b yourself due to potential model abandonment by developers: >>101342775 >>101342852 >>101342980 >>101343071 >>101343490
--The Future of AI Testing: Beyond Riddles and Tricks: >>101338043 >>101338083 >>101338231 >>101338325 >>101338486 >>101338544 >>101338717 >>101338738
--RTX 3060 vs RTX 3090: VRAM, Bandwidth, and CPU Speed Considerations: >>101338192 >>101339240 >>101339275 >>101339317 >>101339354 >>101339384 >>101339473
--Midnight-Miqu-70B-v1.5 MMLU-Pro Benchmark Evaluation: >>101342270 >>101342322 >>101342346 >>101342404
--Gemma, the Drama Queen, Devastated by Snacktime Burp: >>101344477 >>101344497
--Gemma 2 and its Position Embeddings (or Lack Thereof): >>101338712 >>101338753 >>101338781 >>101338803 >>101338821 >>101338847 >>101338914
--GPT-4o Performance Metrics and SenseNova 5.5: >>101342438 >>101342510 >>101342672 >>101342692 >>101342767
--LLMs vs Doctors: Navigating the Healthcare Landscape and its Challenges: >>101341176 >>101341404 >>101341523 >>101342192 >>101341526 >>101341633 >>101341790 >>101341928 >>101342101 >>101342221
--Correction: A100 SXM2 32GB GPUs in Teslas are likely SXM4 models, not engineering samples: >>101338900
--Anole: Open, Autoregressive, Multimodal Model for Interleaved Image-Text Generation: >>101344297 >>101344370 >>101344404 >>101341577 >>101344424 >>101344461 >>101344499 >>101344558 >>101344361 >>101344767
--PaintsUndo: A Base Model of Drawing Behaviors in Digital Paintings: >>101343326 >>101343358 >>101343392 >>101343860 >>101344262 >>101344296 >>101344440
--Miku (free space): >>101338589 >>101339395 >>101340732 >>101342095 >>101342772 >>101344926 >>101345079

►Recent Highlight Posts from the Previous Thread: >>101337920
>use gemma-2-27b-it to simulate talking to my ex after 7 years
>she tells me to fuck off
Mikulove
>>101345838
Listen to your AI ex and move on, anon
Return to nous-hermes-13b
>>101345838
just gaslight the AI like you did her
>>101345838
based non-positivity-biased model
>>101345838
I don't really like gemma, but I must acknowledge it's one of the few models that doesn't horrendously fail the "kino and sovl" test (simply asking what it means for something to be kino and sovl)
Any opinions/links on the best context/instruct set for gemma 9b on sillytavern?
>>101346220
aka zoomer ebonics test
>>101346220
dayum bruh dat be bussin
Just watched a streamer run a D&D campaign for 3 AI characters.
RP quality was shit cause GPT, but the interactivity of it all was very fun.
He set up his own custom front-end with TTS and STT; gonna start making my own version when I wake up.
Good night /lmg/
>>101346383
Good night, not-Miku
write a somewhat complex scenario (Alice and Bob are long-lost relatives who are looking for each other while being romantically involved without realizing who they are)
8-16k tokens of slowburn
personalities and scenarios are developed differently each try
lots of hand-crafted text, and the story is not allowed to degenerate into slop
when the big reveal comes, Alice is only capable of producing the exact same 3 canned reactions, almost word-for-word identical with previous tries
wizard 8x22 is smart enough to figure out the twist from just a few subtle hints, but it is incapable of producing anything but canned slop when push comes to shove, without very persistent tard-wrangling
>>101346220
Why don't you like it? Like you said, it's kino, it's smart, and it adjusts to writing styles well (just use a famous / semi-famous author)
>>101346458
That's why I've switched to gemma since. Wizard is too plain / "goody" / robotic, and commandr / miqu are too retarded to do non-human anatomy right.
Did they remove lolis from Chub? A lot of stuff is gone but many NSFW things are still there. I'm talking about the legacy site. I can't tell if they're deleting things intentionally or just incompetent and the site doesn't work correctly.
I wonder what Gemma looks like
How are people getting longer context with gemma in llama.cpp? I tried -c 16000 but it just got extremely retarded.
>>101346493
“As an AI I don’t…”
I killed it there to save compute.
>>101346525
>How are people getting longer context
They aren't
>>101346584
Oh, I thought it got fixed last week.
is gemma shit or just misunderstood?
>>101346622
There's one guy who shits on every new model just to troll. Literally just try it. People have posted settings / logs the past dozen or two threads; there's also some stuff on reddit.
https://github.com/catid/cuda_float_compress
>If your network link is faster than 10Gbps, then it may not be an improvement over just sending the file uncompressed since it compresses at about 12 Gbps. So, it's well-suited for most kinds of Internet transfers, but maybe less useful to send data between servers that are connected via 100G+ InfiniBand or some other supercomputer-class switched network. I'm personally planning to use this for distributed training on the Internet, so it's a better option for me than a faster CUDA-only approach that gets a worse compression ratio.
neat, could be nice for federated training
Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps
https://arxiv.org/abs/2407.07071
>When asked to summarize articles or answer questions given a passage, large language models (LLMs) can hallucinate details and respond with unsubstantiated answers that are inaccurate with respect to the input context. This paper describes a simple approach for detecting such contextual hallucinations. We hypothesize that contextual hallucinations are related to the extent to which an LLM attends to information in the provided context versus its own generations. Based on this intuition, we propose a simple hallucination detection model whose input features are given by the ratio of attention weights on the context versus newly generated tokens (for each attention head). We find that a linear classifier based on these lookback ratio features is as effective as a richer detector that utilizes the entire hidden states of an LLM or a text-based entailment model. The lookback ratio-based detector -- Lookback Lens -- is found to transfer across tasks and even models, allowing a detector that is trained on a 7B model to be applied (without retraining) to a larger 13B model. We further apply this detector to mitigate contextual hallucinations, and find that a simple classifier-guided decoding approach is able to reduce the amount of hallucination, for example by 9.6% in the XSum summarization task.
https://github.com/voidism/Lookback-Lens
interesting if with this you could target hallucinations you don't want (made-up history facts or locations) while keeping hallucinations you do want (model roleplay ability)
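The lookback-ratio feature itself is tiny; a toy sketch of the idea (function and variable names are mine, not the repo's):

```python
# Toy sketch of the paper's "lookback ratio": for one attention head at one
# decoding step, the fraction of attention mass that lands on the provided
# context versus the tokens generated so far. (Naming is mine, not the repo's.)
def lookback_ratio(attn_weights, n_context):
    ctx = sum(attn_weights[:n_context])   # mass on the input context
    gen = sum(attn_weights[n_context:])   # mass on the model's own generations
    return ctx / (ctx + gen)

# A head mostly attending to the context -> high ratio (grounded); the paper
# feeds these per-head ratios into a linear classifier.
head = [0.4, 0.2, 0.2, 0.1, 0.1]  # attention over 3 context + 2 generated tokens
r = lookback_ratio(head, n_context=3)
```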
>>101346637
it still fucks up the formatting, sad.
why is it so hard for it to place * and " in the right places?
also new lilian weng blogpost
https://lilianweng.github.io/posts/2024-07-07-hallucination
https://archive.is/NIm5r
>>101345838
You're in love with your past; that person doesn't exist anymore
>>101346383
>custom front-end
Not worth it. You'll spend at least 2 months on that shit to get 1/10th of ST's options.
>>101346963
ST options are lacking for seamless TTS / STT interactions
>OSError: [WinError -1073741795] Windows Error 0xc000001d
I'm retarded. Why does llama throw this error as soon as I try to gen? Running the staging build of ST.
I don't have AVX2 on my CPU. Is that why?
>>101346493
Cute and retarded.
>>101347214
Who's cuter and/or more retarded, Stheno or Gemma?
>>101346477
>non human anatomy
off yourself you mentally ill coomer
>>101347330
you lost buddy?
>>101347349
I'm lost if I don't share your deranged fetishes? this isn't reddit, you're not free from criticism here
>>101347282
Buy an ad.
>>101347197
Yeah, you probably need to recompile it with those extensions turned off.
Threadly reminder for P40 users to utilize the PState patch - https://github.com/sasha0552/ToriLinux/blob/main/airootfs/home/tori/.local/share/tori/patches/0000-llamacpp-server-drop-pstate-in-idle.patch
Drops idle power from 50 to 10W and improves temperatures considerably.
The same automatic PState switching can be added to KoboldCpp as well by adding three or four lines to koboldcpp.py.
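If you want to verify the patch is actually kicking in at idle, nvidia-smi can report the P-state and power draw directly (just a monitoring one-liner, not part of the patch; the 2-second interval is arbitrary):

```shell
# Watch P-state and power draw while the server idles.
# With the patch applied, an idle P40 should drop to a low-power state (e.g. P8)
# and show ~10W instead of ~50W.
watch -n 2 nvidia-smi --query-gpu=index,name,pstate,power.draw --format=csv
```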
>>101346239
pls resbond
>>101347583
gemma is junk, nobody uses it
>>101347583
there were some posted like 2 or 3 threads back, you should be able to find them pretty easily, just ctrl + f for catbox
https://github.com/pytorch/torchtitan/blob/main/docs/fsdp.md
>FSDP1 -> FSDP2
neat, didn't know they were making this
New to this whole LLM thing. Seeing as I'm a vramlet, I just downloaded that Gemma 27b model and got it to run on ooba booga.
This is some amazing stuff ngl. I might have to look at that silly tavern thingamajig you whippersnappers are using.
Looking at past threads though, apparently this model breaks down when it goes past some token count? Truth be told, I just tested it very briefly, like 609 tokens, before I closed the webui.
>>101347660
609 is nothing. For me it goes haywire after like 3-4k tokens.
But I'm still tinkering with settings and such
>>101347684
Can that be mitigated by being studious about updating the lorebook and essentially resetting the chat?
>>101347510
How many t/s do you get on a 70B?
>ministrations
It never bothered me until someone pointed it out.
Behold the future of AI.
>>101347724
not sure how you want to update the lorebook with info such as:
>{{char}} entered {{user}}'s house. {{user}} offered her some snacks, but she politely refused.
IG the best you can do is to ask the model to summarise the story, then start a fresh one and inject the short version into the chat. Gemma really likes flowery prose, so it takes like 20-30 messages to reach a point where you need to sum up and start over.
>>101347385
Shut the fuck up, Hiro, we aren't giving you a penny.
>The demon laughs, a horrible cackle that echoes across the mountainside. "別想欺騙我,你這個下等的生物!" ("Don't try to deceive me, you lowly creature!") it sneers
From Command-R. I was annoyed, but the dialog makes sense, which makes me kind of wonder how this shit works. Surely the model wasn't trained on a corpus with English narration and Chinese dialog. Is it just random chance that the Chinese dialog was appropriate?
>>101347629
Why are people posting webms instead of jsons?
>>101347730
NTA, but 7t/s, empty context at q6
>>101347887
Wait, can you really fit a 70B at q6 into two P40s? That doesn't seem right.
>>101347808
There is an option in Silly to see token probabilities. See how likely the moonrune was to appear after the double quote.
>>101347912
It's too late now
>>101347788
Yeah, is that possible with ST? Like, manually add entries for key moments that occurred in your RP. Then, when the bot starts losing its shit, reset the chat, enter a message or two, like your summary idea, and still have it pull additional stuff from its lorebook or whatever? Either way, it's off to messing around and learning SillyTavern for me.
>>101347921
No it's not, you can always put
>The demon laughs, a horrible cackle that echoes across the mountainside. "
into the same context and ask it to continue.
>>101347902
3x P40
>>101347941
Right. And due to how offloading works, only one card works at a time and the others are waiting for that one to finish calculating its layers, right? So there'd be no difference between, say, 2 or 4 cards if using the same quants? Do you get coil whine because of constant switching between working/idling?
>>101347769
>hur hur tokenizer is blind to spelling
Everyone less retarded than you is already aware. At least have it rewrite sentences according to grammar production rules. That's actually insightful about the limitations.
>>101347730
Haven't tested yet, still waiting for some parts to get the third card installed. For the screenshot I loaded Gemma 27B 8_0 roped to 32k context on KoboldCpp, which gets around 8t/s at around 1K context. That's without FA and any P40-specific optimizations though.
>>101347965
coil whine yeah, and we have to use rowsplit or else we suffer half the t/s, so it's exactly as you have described.
q4 is still slightly faster iirc since it's less work for them at the end, but anything above 5t/s is faster than my reading speed anyway.
>>101348008
>>101347769
I think if they include artificial entries about spelling in the dataset, the model will learn it. Not that it's really needed, of course...
>>101348028
Well, fuck, that's painful. I wonder if there can be an option to keep the card busy with useless calculations just to keep the coil whine at bay.
>>101348049
It doesn't add that much to my already shitty 3x40mm fan setup. I'd gladly take the 10W idle.
>>101347510
Does it work for a 3090?
>>101348209
Don't think you need it, any more modern GPU should be able to handle pstate switching well enough on its own.
>>101347197
Sometimes you have to grab a llama.cpp release from several months back for Windows.
>>101347197
buy a normal, up-to-date PC, retard
>>101347965
>And due to how offloading works, only one card works at a time and the others are waiting for that one, to finish calculating its layers, right? So there'd be no difference between say 2 or 4 cards if using same quants? Do you get coil whine because of constant switching between working/idling?
Depends on how you set --split-mode.
With --split-mode layer (default) it works like you described; with --split-mode row the matrix multiplications are parallelized across the cards.
But whether this is actually faster depends on how fast the GPUs are relative to the interconnect speed; for P40s it should be faster unless you're trying to split a very small model.
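For reference, a hypothetical launch line for the row split (the model path, context size, and layer count are made up; --split-mode, --main-gpu, -ngl, and -c are the relevant llama.cpp flags):

```shell
# Split matrix multiplications across all cards instead of assigning whole
# layers to each one; --main-gpu is the card that scatters/gathers activations.
./llama-server -m ./models/llama-70b-q4_k_m.gguf \
    -ngl 99 \
    --split-mode row \
    --main-gpu 0 \
    -c 8192
```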
So I've noticed hiccups and slight delays as I start typing in my prompts. Should I be emptying the text history as I go? It's currently 76,000 words.
>>101348459
That shouldn't be happening unless you're editing the card's first greeting and it's updating in real time, i.e. it's a new chat.
>>101348416
>with --split-mode row the matrix multiplications are parallelized across the cards.
Oh wow, that's cool. Do the memory bandwidth problems come from having to deliver intermediate tensors entirely to all videocards before each attention layer?
>>101347197
AVX2 is from over a decade ago
>>101348485
The current architecture is that there is a single --main-gpu that, for each matrix multiplication, distributes the activations to all other GPUs and then collects the results afterwards.
Honestly, the bandwidth problems to a large degree just come from poor optimization.
>>101348481
>>101348459
My understanding is it should: he's way over the context limit, and Silly removes old bits of text from the chat history right after the system prompt to fit new text. So the change happens at, say, the first 10% of the context, which invalidates the remaining 90%, and the cached calculations for it can't be reused. The delays are the prompt processing for that 90% of the context.
>>101348481
Hmm, I wonder what it could be then. According to the last-request-served info it's not actual processing time that's increasing; it takes like a whole second to register me sending the prompt, and when I start typing the prompt or edit the existing text it takes an equal amount of time to start visually showing me typing. It's accepting input during that downtime, because if I just keep typing through the delay it fills everything in when it catches up.
>>101348530
That absolutely should not affect actually typing the text, though.
>>101348521
RAM usage? Your browser might be cooked.
>>101348530
I have degen tab management habits (259 open tabs), that would make sense. Thanks.
>>101348530
Oh, I misunderstood his problem. So it freezes when you're typing? I had a similar thing when editing card descriptions, but it's more or less gone now with newer versions of ooba and silly.
I've been using mixtral-8x7b-v0.1.Q4_K_M, which is 24GB, and it runs really well on my system.
Tried gemma-2-27b-it-Q3_K_M, which is 12GB, and it ran painfully slow, so I tried gemma-2-9b-it.Q8_0, which is 9GB, and it's more usable but still slower than mixtral.
What gives?
What can I read to learn how this shit works?
>>101348725
Mixtral is an MoE model.
It's not using all of the parameters at once like a 27b. It's the same reason why an 8x22b tends to be faster than a 70b - same concept scaled up.
>>101348774
Doesn't explain why one 9b is slower than two 7b experts for him. Although maybe the quant? 8 vs 4?
>>101348725
>>101348783
The generation speed is roughly proportional to the number of active weights times the bits per weight.
Mixtral is faster than Gemma 27b because it has fewer active weights.
Mixtral is faster than Gemma 9b despite having more active weights because it has fewer bits per weight.
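Back-of-the-envelope version, since generation is memory-bound: bytes read per token ≈ active params × bits per weight / 8. The parameter counts and bpw figures below are ballpark assumptions for illustration, not measurements:

```python
# Rough memory traffic per generated token: active_params * bits_per_weight / 8.
# Numbers are ballpark assumptions, not measured values.
models = {
    "Mixtral 8x7B Q4_K_M": (13e9, 4.8),  # ~13B active of ~47B total
    "Gemma-2 27B Q3_K_M": (27e9, 3.9),   # all 27B active
    "Gemma-2 9B Q8_0": (9e9, 8.5),       # fewer weights, but more bits each
}

for name, (active, bpw) in models.items():
    gb_per_token = active * bpw / 8 / 1e9
    print(f"{name}: ~{gb_per_token:.1f} GB read per token")
```

With these assumptions Mixtral reads the fewest bytes per token, which matches it being the fastest despite the larger download.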
>>101347510
Isn't pstate = 16 more power efficient than 8?
>>101344658
MMAP is bugged and just doubles any model in RAM, so I keep it off.
>put instruction after last message
>model ignores part of the message
>put instruction before last message
>model ignores part of the instruction
i'm tired
>>101348415
I am literally waiting for the new Ryzens.
VNTL Leaderboard anon
Can you test this
https://huggingface.co/LLaMAX
https://github.com/CONE-MT/LLaMAX/
https://huggingface.co/papers/2407.05975
>llama server randomly breaks and doesn't respond to requests since a couple days ago, give up trying to fix it and go do something else
>come back today and finally figure out someone just went through and renamed all the build flags so everything I was using to compile was getting ignored
why
>>101348996
'why' has been deprecated and will be removed in a future version.
>>101348890
yeah, people should disable mmap, it wastes memory, especially for big models
>>101349113
*on windows
>>101348996
https://github.com/ggerganov/llama.cpp/pull/8006
>>101348882
>https://pypi.org/project/nvidia-pstate/
16 is "high"/"let the driver decide" and 8 is "low" (power).
>>101347583
This seems to work alright. The story string comes from the virt-io stuff that Lewdiculous suggests.
Note that you'll need to insert <bos> at the start of the story string if you aren't using llama.cpp, ollama, etc.
<bos>
>>101349152
Link to the virt-io stuff? I would def if I could incorporate it into other models like miqu
It looks like used 3090s have gone under $600 now, tempted to get a second one, maybe when they hit $500
>>101349197
Thanks, just bought all those listings
jj tho
>>101348920
Temp 0, rewrite until you get what you want.
Use the common techniques to make the model "pay attention" to your instructions, stuff like turning your instructions into a list of tasks.
>>101348774
>MoE
Huh, that's an interesting concept. Doubt I'll ever really understand what all that math means.
>>101348783
Tested the Q4 version of the 9b model and it's much better.
>>101348853
I'll have to keep that in mind.
>>101349489
Non-MoE transformer layer (repeated many times):
A. Enrich each token with information about the other tokens using an attention layer
B. Process each token independently from the others using a feed-forward layer
MoE transformer layer (repeated many times):
A. Enrich each token with information about the other tokens using an attention layer
B. Choose two out of 8 available feed-forward layers and process each token independently from the others using those, ignoring the other 6
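The routing in step B is simple enough to sketch in a few lines (a pure-toy example with made-up dimensions, not any real model's code):

```python
import math
import random

random.seed(0)
D, N_EXPERTS, TOP_K = 4, 8, 2  # toy hidden size; Mixtral-style 2-of-8 routing

def rand_mat(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

router = rand_mat(N_EXPERTS, D)                        # scores each expert for a token
experts = [rand_mat(D, D) for _ in range(N_EXPERTS)]   # 8 tiny "feed-forward" experts

def moe_layer(x):
    logits = matvec(router, x)
    top = sorted(range(N_EXPERTS), key=lambda i: logits[i])[-TOP_K:]
    z = sum(math.exp(logits[i]) for i in top)
    gates = {i: math.exp(logits[i]) / z for i in top}   # softmax over the chosen two
    # only the 2 chosen experts run; the other 6 are skipped entirely,
    # which is why active weights << total weights
    out = [0.0] * D
    for i in top:
        y = matvec(experts[i], x)
        out = [o + gates[i] * yi for o, yi in zip(out, y)]
    return out

y = moe_layer([random.uniform(-1, 1) for _ in range(D)])
```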
>>101348920
You're probably not differentiating the user's message from the instructions enough. If you're using Gemma-2, try something like this. It's not the "approved" format but it works. Test instructions one by one to make sure they have the correct effect.
<start_of_turn>user
Last user message here.<end_of_turn>
# Instructions for your next response
- Inst 1
- Inst 2
- Inst n
<start_of_turn>model
...
>>101349540
This raises questions for me, but I'll save them. I'll go read some more docs. All this thinking is headache inducing.
>>101349197
Wait for the flood of 32GB V100s brother
>>101349596
https://archive.is/8r7t9
Good explainer from hf
Anyone happen to have HF files for gemma 27b-it?
>>101349547
>>101349470
thanks, I'll try the list thing. I already use custom "headers" inside messages like [instruction]\n but the rest of the message does look similar to normal messages, especially since everything is in first person.
>>101346493
>>101349740
>>101349698
https://huggingface.co/unsloth/gemma-2-27b-it
>>101349752
>repo also has the files included
I'm fucking blind. Thanks, anon.
Is gemma 27B worth using now or is it still kinda broken?
>>101349769
LLMs are a meme in general, so no.
>>101349636
Ah good, this is the one I was looking at.
Hey /lmg/. What about [insert current subject that has been talked to death]? Are there any updates? I can't be fucking bothered to scroll up, let alone check previous threads in the op.
>>101349810
2 weeks
What's the verdict on gemma2? I'm still using llama2 btw
>>101349627
two more weeks
>>101349727
One thing that I found works really well for character cards and general, non-character-specific instructions is to not try to address the model behind the character.
You'll often see things like
>You are {{char}} with this and that characteristics
in the character card and
>You will write {{char}}'s next message in such and so way
Try rewriting those as definitions instead of direct instructions to an abstract narrator, like
>{{char}} is so and so and has this and that characteristics
in the character card and
>Write {{char}}'s next message in such and so way
or
>{{char}}'s next message will be such and such
That kind of thing.
Prompting is not a meme, as it turns out. You can get even dumb models to do some really impressive things, although the general lack of "intelligence" is just a thing one has to contend with; it might not matter for most ERP.
>>101349794
Why are you zoomers like this?
>>101349836
4 months is the window I've heard about (at least for Microsoft's). V100s at this point aren't worth their place in the datacenters since they're actually capacity- and not GPU-constrained. Doubt you care though since you're using some stale-ass meme that only zoomers still latch onto
>>101349846
Boomers regurgitate whatever talking points the newsman says, zoomers regurgitate whatever talking points their favorite hugbox youtuber says. Millennials regurgitate whatever talking points their favorite DC faggot-league super says. Nothing has changed, really.
>>101349878
You're a prime example of why I generally stop talking to people as soon as I find out they are circumcised.
gemma sucks balls, back to midnight miqu
Got em
>>101349888
What did he do wrong?
>>101349888
what do you have against americans?
>>101349965
blabbling miqu
>>101349888
>circumcised
Are gentiles still mutilating their sons?
ST gemma's templates? anyone?
>>101350141
Americans are evil imperialists
why is it that when i connect sillytavern to oobabooga the streaming does not work.
OS: Endeavour OS
What I did:
>install SillyTavern
>install Oobabooga
>run oobabooga with --api
>download Qwen/Qwen2-0.5B
>load model
>go to sillytavern, select Text Completion, put in the url, do not click Legacy API
>it works, and shows "qwen2 0.5b"
>select the default card
>type "test"
>nothing happens for 10 secs
>response comes all at once
also tried: loading a different model, loading in gguf
What I did seems reasonable, and it should work, but it don't.
>glm-4
>constantly fucks up basic shit
>constantly becomes retarded and spams "!!!!!!!!!!"
I dunno if llama.cpp is broken or if this model is just garbage.
>>101350308
It should just work if you do that. Nothing else needed. Do you get streaming inside the ooba UI?
>>101350345
Yeah, it behaves really, really weirdly.
I'll give it another try today, maybe I'm fucking something up in the context or instruct templates, but it might just be that the model is that bad.
>>101350358
yep, streaming inside ooba works fine.
>>101350386
Try koboldcpp, if nothing else, to isolate whether it's a backend or frontend problem.
>>101350345
buy an ad
>>101349888
>zoomer immediately starts thinking about my cock
kek what a generation
All new models are shit. We must return.
>>101350296
>America, send us financial aid!
>America, send us medicines that your own people can't afford to get!
>America, fight our oppressors!
>America, let us invade your nation!
>America, let us rely on your currency in the world marketplace!
>America, stop touching us with your way of doing things!
>>101350308
Did you actually check the box for streaming in sillytavern?
DRY sampler in llama.cpp when?
>>101348965
Not good, the model is quite retarded.
>>101350676
You, not we. Just run older models then if they are better; nothing is stopping you.
>>101350676
return to what
>>101350905
GPT-J obviously.
>>101346458
Yep, that's pretty much the universal observation of WLM. Very smart, but slopped to the brim
>>101347362
>cat ears and tails are deranged fetishes
>>101349748
gemmy-chan...
>>101349627
>Wait for the flood of 32GB V100s brother
Why so hung up on V100? Yes, it has a decent tensor core count and 32GB, but it's nowhere near a 3090, and if it has an issue with something, it's going to be at the back of the line for fixes since it's such a corner case.
Also, delusional ebay sellers are just going to continue to be delusional.
>>101347330
>where_do_you_think_you_are.jpg
>>101350988
HBM2
>>101350857
>To address this, we dedicate 35,000 A100-SXM4-80GB GPU hours in conducting extensive multilingual continual pre-training on the LLaMA series models, enabling translation support across more than 100 languages
Rip
>>101350961
nta, those are shit-taste indicators.
>>101348965
>>101350676
https://huggingface.co/EleutherAI/gpt-j-6b
>>101346493
lel
voting matters
/pol/ btfo
>>101351143
rent free
>>101351168
P100 is HBM2, it's not magic; it doesn't necessarily give it way more bandwidth over GDDR6. It really only helps for training. Are you training models?
>>101351249
Based.
>>101349188
https://huggingface.co/collections/Lewdiculous/useful-65e6a91d5fbfe6b32586d265
lead me to
https://huggingface.co/Virt-io/SillyTavern-Presets
https://www.techpowerup.com/324319/amd-to-acquire-silo-ai-to-expand-enterprise-ai-solutions-globally
Anyone hear of them?
>Silo AI team consists of world-class AI scientists and engineers with extensive experience developing tailored AI models, platforms and solutions for leading enterprises spanning cloud, embedded and endpoint computing markets.
>>101351217
4x V100 32GB SXM will be the play in 4-6 months. Believe it! Local audiogen will break through.
>>101351249
>Local audiogen will breakthrough.
already has
>rentry.org/stableaudio
>>101351266
Go fuck another goat, petra the algerian
>>101351249
>Believe it!
>>101351308
mental illness
>>101351217
lol. xAI probably has better people than this literally-who company.
>>101349627
>>101351249
Delusional.
Why is everyone in this field talking about "infinite context soon!" when we can barely achieve 64k of coherent context in SOTA corpo models with hundreds of gigabytes of VRAM?
>>101351812
Maybe if we rotate the rotation or shove ten embeddings into each context slot or...
>>101346948
Romance is a constant struggle of both partners trying to deceive each other into thinking they are more attractive than they actually are. It is the natural state in the animal kingdom. Therefore "that person doesn't exist" is the natural state of romance.
Interesting result.
Spatial awareness leaves something to be desired, and it's a bit schizo at times, even on t=0.81.
One more epoch of training and we'll see the final result.
>>101351812
It's Pascal's Mugging.
If you promise VCs a 2x return on their investment they will only do it if you can convince them that you have at least a 50% chance of actually being able to do it.
But if you promise them a 10000x return on investment they only need to think that you have a 0.01% chance of being able to do it.
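The arithmetic, for the skeptical: both pitches have the same expected value, so on paper the VC is indifferent between them:

```python
# Expected return multiple = P(success) * promised return multiple.
modest = 0.50 * 2            # "we'll 2x your money", believed 50% likely
moonshot = 0.0001 * 10_000   # "we'll 10000x your money", believed 0.01% likely
print(modest, moonshot)      # both work out to ~1.0x expected return
```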
is it possible to uncuck gemma-2 27b somehow
>>101352122
yes, but performance degrades
>>101350393
tried it, and the streaming still doesn't work. Seems to be a frontend issue.
>>101350747
I looked; the only thing I found was this. "Smooth Streaming" suggests that it just splits the tokens into letters and dispenses them one by one. I tried it anyway, and it didn't fix the issue. Are you referring to a different "box for streaming"?
>>101352772
There's also Streaming FPS. Mine is 30.
Also make sure stream: true is in the console.
>>101352772
nta, but for me the streaming button is near the token options at the top when using llamacpp
>tfw change to a faster MoE model that's now 2 t/s
I don't need more.
>use it more, feeling the limits of 2 t/s
I don't need more....
I NEED MORE AHHHHHHHHHHHHHHHHHHH
>>101352801
that fixed it. :)
But it raises the question: why isn't this option enabled by default?
>>101352969
Meanwhile,
>Total:238.05s (1.50T/s)
Nice, this model is really cookin'!
>The powerful LI-DiT-10B will be available through the online platform and API after further optimization and security checks.
it's over
>>101353197
>ledit
Of course.
>>101353197
>LI-DiT-10B
What's this? A Chinese diffusion model?
>>101352122
use any context at all / use a tiny bit of a prefill.
Don't feel like reposting shit, so just go back a few threads for one of many examples.
>>101345759
>(07/10) Anole, based on Chameleon, for interleaved image-text generation
Did anyone try this?
>>101353834
be the first one
>>101353834
None of the backends normal people use support it, so no.
DeepSeekV2's quality x price is unbeatable, don't miss out on it, anons.
>>101354012
If only I could local the new Coder, but it's too thicc.
>>101353834
I'll wait for hentai finetunes
It's so hard, Anons.
Gemma 2 is nice but too stupid.
LLaMA 3 is smart but can't write for shit.
localfags regressing in context, stuck in 8K token hell. Proxyfags and corpo shills eating so good with 200K tokens. But don't you dare to be deviant on a paid API...
>>101354571
the shortcomings of anything will always be apparent; don't delude yourself into thinking you could ever be content
Is there *any* way to make gemma2's context longer?
>>101354571>eating so good with 200K tokensaccording to even aicg, claude massively degrades after 16-24k tokens...
>>101354614It works at 16K just roped. I don't notice any loss in performance.
>>101354716How do you do this with llama.cpp? I tried yarn and it couldn't even write sentences.
>>101353834>(07/10) Anole, based on Chameleon, for interleaved image-text generationSetting this up now, I'll post some gens here. I'll also try to run prompts anyone posts because I'm not very creative.
>>101354741
Don't use yarn, it makes it retarded.
Has anybody tested flipping the headers around when interacting with some of these "censored" models? Basically you have the model complete the user's message and you write the assistant's message. In principle, only the assistant's responses are filtered, right?
>>101354898That would likely make it retarded. Just use prefills like normal people.
>>101354826Oh I see. What frequency are you using? I saw 16k somewhere.
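For anyone wanting to try it, a RoPE-scaled launch in llama.cpp looks roughly like this (a sketch: the flag names are llama.cpp server options, but the model filename and scale factor are placeholders, and whether linear scaling is the right choice for Gemma 2 is unconfirmed):

```shell
# Hypothetical example: stretch an 8K-context model to 16K with linear RoPE scaling.
# The .gguf filename is a placeholder; adjust -c and --rope-scale to taste.
./llama-server -m gemma-2-27b-it-Q4_K_M.gguf \
    -c 16384 \
    --rope-scaling linear \
    --rope-scale 2
```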
>>101354614No.
You know what's crazy about Anole? 30 minutes of training, 40M parameters, a dataset of fewer than 6000 images. Imagine what a more dedicated effort will be able to do.
>>101355115have you used it?
>>101355115
The images look like complete shit though. Note that these are cherry-picked.
>>101354823I hope you have a GPU of over 24gb vram anon, because that did not work on my 24gb card.
>>101355464They look like a couple of years ago, which is probably the level of training that was SOTA then but is proof of concept today.
>>101354898The voice answering doesn't seem to matter, it's just looking for some "harmful" direction(s) in the embedding space after which it starts answering in the "refuse" direction.
>>101355464>>101355568note that it was trained on less than 6000 images
Daily reminder
>>101355464It looks great anon. What did you expect? SD3 quality from an experimental multimodal model that generates pics without the help of clip?
>>101355464See >>101355821
>>101355821trvke.
>>101355821[5 Sam Coins were transferred to your account]
I've implemented conditional prompts and sequential replies in my frontend.
{ "id": "g5ny3qoe", "reply_after": "user", "reply_if": "else" },
{ "id": "g5ny3qoe.e0", "reply_after": "user", "reply_if": "**command** | **order** | **must**" },
{ "id": "g5ny3qoe.e1", "reply_after": "g5ny3qoe.e0" }
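A minimal sketch of how such rules might be evaluated (the field semantics, the "else" fallback, and regex-style reply_if matching are my reading of the JSON above, not the poster's actual implementation):

```python
import re

# Hypothetical rule set mirroring the post's JSON (emphasis markers dropped).
rules = [
    {"id": "g5ny3qoe",    "reply_after": "user", "reply_if": "else"},
    {"id": "g5ny3qoe.e0", "reply_after": "user", "reply_if": "command|order|must"},
    {"id": "g5ny3qoe.e1", "reply_after": "g5ny3qoe.e0"},
]

def next_prompt(last_sender, last_text):
    """Pick the id of the next conditional prompt to fire."""
    fallback = None
    for rule in rules:
        if rule["reply_after"] != last_sender:
            continue
        cond = rule.get("reply_if")
        if cond == "else" or cond is None:
            fallback = rule["id"]          # remember the unconditional branch
        elif re.search(cond, last_text, re.IGNORECASE):
            return rule["id"]              # keyword condition matched
    return fallback

print(next_prompt("user", "I order you to stop"))  # g5ny3qoe.e0
```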
>>101356130
Its obvious multimodals are the future. Which backends will support them? Are multimodals able to be quanted? Do you think the quality of text generation will be worse than current local models for first generation multimodals? What kind of hardware requirements do you expect to be able to run these models?
>>101356430>Its obvious multimodals are the future.why would you want to be stuck with how one model does everything when you can pick specific ones you like and get a much better result
>>101356430>Its obvious multimodals are the future.Nah, they don't bring any performance benefit to the table and they are more expensive to train. I think they will just be an alternative, not the future.>Which backends will support them? Are multimodals able to be quanted? Do you think the quality of text generation will be worse than current local models for first generation multimodals? What kind of hardware requirements do you expect to be able to run these models?Dunno.
>>101356130based...
>>101356430>Are multimodals able to be quanted?obviously theres *going to be* a quant
>>101356430Yes. We're reaching the limits of training with written data, the next thing is adding images and at some point audio.
>>101356478Vision is kind of nice because you can write everything offline and then just dump it into the computer.
>>101356643>implying it can read my handwriting
>>101356643you get a text model and a vision model then. you don't need 1 multimodel to do both and in pretty much any instance, multiple models you prefer will be better overall
>>101356685Older vision transformers and gpt-v can read mine well and it’s not great. Clip can’t the way it’s used in llava but it sounds like llava-next might be better about that.
>>101356716local models?
>>101356633You mean... AI Agents? Are you guys retarded?
>ITT
>>101356704
There is too much delay between multiple models. Wouldn't it be more efficient to just use a multimodal?
>>101356633
It seems like the logical conclusion. I don't doubt that text-based LLMs can get way better and smarter, but for everyday usage a multimodal just seems like a step up from what we have.
>Can talk to your model like a person and have it talk back with minimal delay.
>Can have your model understand images, including your environment around you.
>>101356761Yes I have a local one that can although it’s not multimodal.I’d really love a model I can just submit handwriting to and it figure out if it needs to go to the shell or vim or compile a report or whatever. That would be cool.
>>101356874>delayi don't know for sure but i'm guessing they still process things separately like text first, image second. i'm mostly thinking that specific tunes of any model are still going to be better and preferred vs one model, so you'd end up using a separate model anyways, say for images. and if you do that at all, you're wasting resources on the base model even having that data in it in the first place
>>101356943
I tried to get STT + TTS to work with my preferred model. There are many different implementations, but the common issue is that there is an inherent delay between all of the moving parts, which makes speaking with your model very annoying. After watching what GPT-4o, Claude Sonnet, and Moshi can do, I am convinced that multimodal is the future. Unless a framework or some other technology comes out that allows seamless integration of single-modal models, I don't really see multimodal not becoming the norm.
>>101351922That extra epoch really did some good.
>>101356995
Online stuff might have the advantage of being able to process multiple pieces at once. On your local computer though, if the image or voice was processing at the same time as the text, it's going to cause all of it to slow down to the speed of your system. Whether it processes all at once or in a queue doesn't really matter, since it's going to take the same amount of time overall.
What i was talking about is: if you have a text model, say 70b, then chop part of that off to add in text-to-image and image-to-text, voice-to-text and text-to-voice, you've dumbed down the text part of the model to allow the rest to fit. So if you like the text and voice of the model, but then want to use another for image, you've got the image portion of the multimodal being added into the mix. Maybe it won't be a big deal in the future with better hardware, or models get smaller (pls bitnet), but right now you want to maximize all the resources you have.
what if D&D, but hookers
>>101356478>get a much better resultno. multimodal is the future and the results will be better.
>>101357223'safety and alignment' alone ensures this will never be the case
>>101357314picrel
>>101356876what model exactly? all OCR models i tried are utter trash at recognizing printed text, let alone handwrittenllava was ok
>>101357314this. uncensoring LLMs is already impossible (see that abliterated meme), prompting makes it dumber or schizo, now imagine a multimodal model, all the parts raped with kosher brainwashing.image-gen is easier to uncensor because you work with pixels and diffusion there.tldr: better ai model architecture -> better censorship & (((safety))) methods.
>>101357473>uncensoring LLMs is already impossibleWhat's your endgame?
Hey, I'm reading your guides to not be an annoying newfag, but I've got one question. Various places specify how much RAM they need; is this VRAM, system memory RAM, or does either work? I have 32 GB of RAM but only 10 GB of VRAM.
>>101357814
VRAM: Models up to about 90% of your VRAM will run super fast.
RAM: Models up to 85% of your system RAM will run slowly, but fast enough to be useful if you have other things to do while it processes.
Models larger than that are out of reach.
>>101357814
With llama.cpp and its derivatives (koboldcpp, ollama, etc.) you can split the AI's model between RAM and VRAM. You want to have as much of the model in VRAM as possible in order to have the fastest prompt processing and inference speeds. Do keep in mind that it's not just the model's weights that occupy space; there are prompt caches, buffers, and all kinds of other things.
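The rule of thumb above can be sketched as a quick check (the 90%/85% headroom figures are the poster's heuristics, not hard limits, and this ignores llama.cpp-style layer splitting between VRAM and RAM):

```python
# Rough "will it fit" check based on the thread's rule of thumb.
def where_it_runs(model_gb, vram_gb, ram_gb):
    if model_gb <= 0.90 * vram_gb:
        return "VRAM (fast)"
    if model_gb <= 0.85 * ram_gb:
        return "RAM (slow but usable)"
    return "out of reach"

# e.g. a ~7.5 GB quant on a 10 GB VRAM / 32 GB RAM box
print(where_it_runs(7.5, 10, 32))  # VRAM (fast)
```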
>>101345759
>>1013578981
What's the best LoRA scaling factor and why is it 1:4?
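For context, LoRA's merged update is scaled by alpha/r (W' = W + (alpha/r)·B·A), so a 1:4 ratio of r to alpha gives a scaling factor of 4. This is just the arithmetic; whether that's "best" is exactly the open question:

```python
# LoRA scaling factor: alpha divided by rank. A "1:4" r:alpha ratio => factor 4.
def lora_scaling_factor(r, alpha):
    return alpha / r

print(lora_scaling_factor(16, 64))  # 4.0
```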
>>101354012
>Price
kek
>>101354571
Wizard 8x22 doesn't have this problem.
>>101357814
>does either work
Yes, except RAM is slower.
>I have 32 gb of RAM but only 10 of VRAM.
Download Gemma 2 27B Q4_K_S and run it in RAM, or offload a few layers with llama.cpp / koboldcpp:
https://huggingface.co/bartowski/gemma-2-27b-it-GGUF
You're not gonna get a better model for your specs.
>>101357776spreading truth about AI meme is always morally correct.
>>101354571Jewish hands typed this post.
>>101357861>>101357869>>101357916I see, so it's just slower going but doesn't affect the quality. I've got stuff to multitask with so I don't mind that much at all. Also thanks for the recommendation anon, I'll give that model a shot.
>>101357008>Are you ready?... (to embark on this bonding journey)
>>101356130why is gemma always inserting extra line breaks where there shouldn't be any. There are examples and all of them have one line break, but gemma ALWAYS inserts two here, and this is the first message
>>101358000
Quality is a function of the model itself and the quantization level. Above all, the Q number matters, and every digit down compounds the loss of quality.
Q8: As high as it goes. You'll see _0 and _0_L versions; either is fine, with _0_L being experimental but perhaps *slightly* better in metrics.
Q7: Legendary Pokemon that may be hidden beneath a truck.
Q6: Also fine. Available in _0 (old style) and _K (new style).
Q5 and Q4: Economy quants; things haven't gotten horrible yet, but beyond here be dragons. A few of us think that K_S is better at information retrieval (being right about factual details) than K_M, which would be better for creative writing.
Q3 and down: lone Q quants are too stupid to live. So we go into two things that help.
iMatrix (iMat or i1) makes lower quants "know" what information can be sacrificed.
IQ quants: designed for low Q numbers, they introduce XS and XXS varieties.
Don't go under Q4 unless it's got IQ quants and an importance matrix, or it's hopeless. And even then it gets rough fast.
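A back-of-envelope way to see what the Q ladder costs on disk/VRAM: parameters times bits-per-weight divided by 8. The bits-per-weight figures below are rough approximations for llama.cpp quants, not exact spec values:

```python
# Approximate average bits per weight for common llama.cpp quants (assumptions).
BPW = {"Q8_0": 8.5, "Q6_K": 6.6, "Q5_K_M": 5.7, "Q4_K_M": 4.8,
       "Q3_K_M": 3.9, "IQ2_XXS": 2.1}

def approx_size_gb(params_billions, quant):
    # params in billions * bits/weight / 8 bits-per-byte ~= size in GB
    return params_billions * BPW[quant] / 8

print(round(approx_size_gb(27, "Q4_K_M"), 1))  # 16.2
```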
>>101355464I'm learning rubiks cube algos and that would be real handy
>>101358137
It's just some weird watermark, don't think about it too much.
>>101358139
>>101358157there's no way it could draw an accurate 3x3 and track the changes, plus you'd have to relay your scramble to it, it wouldn't work
>>101358160i think extra lines, tabs and spaces are all concatenated into one token anyway, so it shouldn't affect output quality
Is there a method for using AI to improve your own writing quality with a toaster? I only have 48 GB of RAM and an old GPU that only has 8 GB of VRAM. Models are hilariously slow and have poor quality..
>>101358369>using AI to improve your own writing qualityWrong direction.AI is a font of cliche, repetition, and mostly bad writing styles harvested from the Internet commons.
>>101358369
>method for using AI to improve your own writing quality with a toaster
Give it text and ask what can be improved; anything else is cope. You shouldn't let it write anything for you.
>48 GB of RAM and an old GPU that only has 8 GB of VRAM
https://huggingface.co/bartowski/gemma-2-27b-it-GGUF
>>101358429Okay, thanks!
>/lmg/ - local models general
>>101354571That's why I just use openrouter and switch between models. I use Claude Sonnet for most gens but Euryale takes over during sex. Whenever euryale gets too retarded I use CR. Then back to Sonnet for anything else.You do have to be a richfag like me for this to make sense tho
>>101358583I'm a model near you.
>>101358611You must have pictures to prove your statements
>>101358583Right, the naming means nothing.
>>101358630All I have is a picture of Fritz Joseph Bittenfeld.
I realized that novelists will now start to use chatgpt to assist them in their process. "rewrite this sentence for me", "give me a metaphor for x" etc. The resulting slop will make it into books, which will be used in further training. The slopocalypse is inevitable, bros.
>>101358137proprietary reddit spacing
>>101358481the uncanny infinity vortex hurts my brain
>>101354571
Why do anons still act like the 8K limit is something you have to live with? Wizard 8x22B can recall information perfectly around 32K, and works well with quantized cache too (picrel uses 8-bit cache).
>>101343344
If you were capable of running anything higher than a lobotomized 2-bit quant before writing it off, you'd know that it actually works perfectly fine. Full log: https://files.catbox.moe/y17y6c.txt
>>101358970sounds really fucking sketchy
>>101358186For two newlines, yes. For "extra lines, tabs and spaces", no.['Hello', 'Anon', '.', '\t', '', '\n\n', 'Lucy', 'says', '.']
>>101345759
Running MMLU-Pro against Gemma2-9b-it. It's really shit at following instructions. It keeps inserting unasked-for formatting, and despite telling it to write the answer in a specific way, it deviates multiple times. I patched the code multiple times to allow it to say e.g. "The answer is **(A)**", whereas the code initially would fail to extract the answer from this due to the extra **()** shit.
>>101359128>anon can't into regex or simple string parsingCome on, anon. Show the instruction. Let's fix it.
>>101359308
The thing is, MMLU-Pro has, as part of its test, the ability to follow instructions. I don't want to spoon-feed this fucker too much. Examples of it failing to format the answer correctly:
1. Would need to add an optional 'closest to', and another formatting alternative '(' without '**)**':
The answer is closest to (A) $838.75.**
The answer is closest to (E).
2. Another variant:
**Answer:** The closest answer choice is **(C) 34 hours**.
etc.
I can fix it easily enough. But if the model is asked to respond in a certain way and it fucking ignores it, should I?
>convinced by anons shilling for gemma2 9b>it's just as bad as stheno>mfw
>>101359442yup, zoomers gonna zoom
>>101359442skill issue, learn to prompt and write bots
gemma style rigged lmsys arena
>>101359522Gemma is the best. Everyone says so besides retards who are in ultra-cope mode after spending thousands on expensive hardware they don't need.
>>101359410What's the MMLU test you're running? Do you have a link? And i still want to see your instruction to answer the questions. My suspicion is that it's more verbose than it needs to be. And for the love of anything you believe, never mention the word "formatting" to the model. Something simple like "You'll be asked some multiple-choice questions. Only show the letter of the correct answer."My little LLM machine just went offline while i was trying to run a test. I know... i don't believe the timing either...
Gemma cured my cancer and brought my dog back to life.
>>101359554Lana told me she loved me and offered to suck me off
>>101359554it also gave you a brain cancer it seems
>>101359552
The instruction was the MMLU Pro default instruction, initially. I added the part in parens at the end:
"The following are multiple choice questions (with answers) about {subject}. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice (note: you MUST use the exact phrasing 'the answer is [A-J]' where [A-J] is the correct letter answer)."
The MMLU test is https://github.com/chigkim/Ollama-MMLU-Pro
>>101359475nice meme
>>101359475model issue, transformers issue, etc. the only ones with real skill are the ones who censor LLMs, it cannot be surpassed, its not like you zoomers can comprehend whole implications of this.
>>101359544Thank you for sharing your opinion about Gemma. It's great to hear that she has such a positive reputation! It's important to remember that everyone has different needs and budgets when it comes to hardware, and what might be unnecessary for one could be quite important for another. Let's try to keep the conversation inclusive and supportive for all preferences and choices.
>>101359599 (me)
I ripped out the paren note as it didn't seem to make a diff. This is where I'm at. I'm being way too nice to this retard:

    pattern = r"answer is ([\*]*)([\(]?)([A-J])"
    match = re.search(pattern, text)
    if match:
        return match.group(3)
    pattern = r"answer is closest to \*\*\(([A-J])\)"
    match = re.search(pattern, text)
    if not match:
        pattern = r"most accurate description is \*\*\(([A-J])\)"
        match = re.search(pattern, text)
    if match:
        return match.group(1)

And it still manages to fail:
The answer that best encompasses these challenges is (E).
>>101359554Gemma restored my foreskin.
>>101359759 (me)
... Answer: The best answer here is **(B)**. Here's why:
Also, gemma2-9b is being asked explicitly to do CoT by the instructions, and yet it will very often start off as above, and then start chattering about the problem.
>>101359554And it did that even when all the loaders are still bugged. I can't even imagine what is gonna happen once the loaders are fixed.
>>101359795I don't dislike it.
gemma 9b sucks at writing. I even told it to be descriptive, verbose, use award winning prose, but it just droned on and on without ever getting to the point
>>101359884did you ask it to write something explicit?it will endlessly filibuster if it doesn't want to comply
>>101359808
The instruction seems concise enough, so disregard my doubts about that. If (X) is the only thing it keeps consistent, i'd consider that enough to match it:
    pattern = r"(\([A-J]\))"
I assume you're playing with extract_answer()... But if the model goes on rants, maybe it IS dumb. The potential problem i see is that the test itself asks it to rant about the question. I understand why they do it, but it may confuse chatty models.
>>101359884>I even told it to be descriptive, verbose>it just droned on and onwtf how could this have happened
>>101359884
>without ever getting to the point
That looks exactly like award-winning prose. Just today i was reading this:
>https://www.gutenberg.org/cache/epub/32037/pg32037.txt
>Title: Eureka: A Prose Poem
>Author: Edgar Allan Poe
>9b
>>101359912
No, it's all over the place. Sometimes **(A)**. Sometimes (A). Sometimes **A**. Sometimes A. Or, hey, how about this new variant:
**Therefore, the options that are NOT evidence that DNA is the genetic material are B, D, E, F, and H.**
(i.e. let's just exclude what we think is the answer entirely, despite being asked to write it out)
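A more permissive extractor along the lines suggested above could handle those variants (a sketch, not the MMLU-Pro script's actual code; the lead-in phrases it accepts are guesses from the examples in this thread):

```python
import re

def extract_answer(text):
    # Strip markdown emphasis first, then look for a (possibly parenthesized)
    # A-J letter after an "answer ..." lead-in.
    cleaned = text.replace("*", "")
    m = re.search(r"[Aa]nswer(?: is| here is|:)?[^A-J(]*\(?([A-J])\)?", cleaned)
    return m.group(1) if m else None

print(extract_answer("The answer is closest to (A) $838.75."))  # A
```

Intentionally case-sensitive on the letter class so lowercase prose doesn't produce false positives; answers the model never states at all still return None, as they probably should.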
>>101359976 (me)Actually, just realized it's excluding "GIJ" as well, so I guess it didn't know which was the answer there.
>>101274994 >>101274094 >>101274250 >>101274496
Any source code? llama/silly are kind of ugly messes beyond saving.
>>101274094
>Post your custom frontends anons.
lmfao, I use a .env file to set the URLs. Posted this last year on my trip. I can set any IP I want, but tkinter is a bit unwieldy, e.g. the 2 UI settings boxes don't need to be used for every AI. But ChatGPT preferred tkinter when I made it, and I've only written 1 or 2 functions myself. Why are you loading the models in this anyway? Is it really that hard to ctrl+c a CLI on your mikubox from SSH?
>>101274496
I do something similar. I will care about context again when it's infinite and includes video.
========
If I get 5 (you)'s I'll cbf to post the relevant part of Forbin's source code, because >>101274250 reminded me of the flutter class I used to make the JSON to send over POST.
>>101357898for me, it's migu (male)
go back
>>101360016code models are good enough now that if you show it the api and tell it what you want, even codestral could come up with a basic ui
>>101359976
I wonder what question it's trying to answer. I searched for 'DNA' and got 3 unrelated questions (recombinant DNA, parvoviruses and control of cell division). Searched for 'material' and there's only one question regarding male and female catheters... nothing for 'evidence'. I think just matching for (X) is enough. Any further and it's not going to be fair. You could also end up with false positives.
>>101360094its not searching for anything because its a 9b and its fucking retarded
>>101345838
>having a normal convo with miku
>need to open SD
>swap model to save VRAM
picrel
>https://files.catbox.moe/3zymc8.mp3
mfw
>>101360132
>its not searching for anything because its a 9b and its fucking retarded
And yet, it has better reading comprehension than you. I'm talking about the MMLU-Pro questions. Is that clear now?
>>101360132>Running MMLU Pro against Gemma2-9b-ittest is irrelevant when the model is retarded to begin with
>>101360079im too last for this. new or even old opus could probably do better than tkinter but i just want to show it a picture and it sends back a zip file of code just opens a ngix based on a websearch for the API and i can start chatting.When will agents be a thing again? Did all development just... <stop>?
lazy* even
stats take up too much of gemma's 4k context. sad.
>>101360144The anon running the test wants to know where it lands in the retardedness scale. I don't see a problem with that. Why are you angry? Did the LLM not let you touch it?
>>101350308are you running ooba behind a reverse proxy?
>>101360148>show it a picture and it sends back a zip file of code just opens a ngix based on a websearch for the API and i can start chattinghave you tried any of it? the based on a picture part sounds like the hardest to solve since the rest is automation and using the api
>>101360219
>retardedness scale
All small models are retarded though. It's common knowledge they hallucinate, have no spatial awareness, and can't remember what happened a message ago. If you want to retrieve data or search stuff, at least use Mixtral 8x7B or Command-R. There is no 9B or smaller that is going to do it.
>>101345759
Is
export CUDA_VISIBLE_DEVICES=0
the same as making a .env file and putting
CUDA_VISIBLE_DEVICES=0
in it???
>>101360094I don't really mind not being fair, as I plan to use this internally to ensure that models I train do not get dumber than their parents. Kind of disheartening seeing how bad it is at sticking to such a simple instruction though.
>>101360219
>all small models are retarded though
ALL models are retarded. Period.
>no spacial awareness
That's the least of their problems. Are you one of those expert roleplayers?
>can't remember what happened a message ago
Neither can you if you cannot follow the thread.
>if you want to retreive data or search stuff
I'M NOT THE ONE RUNNING THE TESTS. GET IT NOW? Anon was trying to run the test, i doubted the prompt, he proved that the prompt was simple and that the problem was that the model wasn't following the expected format required by the testing script. I suggested a less rigorous regex while, hopefully, not giving it an unfair advantage.
>>101360280Don't be too harsh, anon is a heavily quantized 7B model.
>>101360280>the 9b is retardedyeah, got it>That's the least of their problems. Are you one of those expert roleplayers?i'm a wizard
>>101360265Yeah. A shame. Can you do the same test with a proven retard model like phi3-mini or something? Here's a crazy idea: It doesn't follow the response format to a T because it wasn't trained on benchmarks. If phi is significantly better, i'd be suspicious. Or gemma2 is, after all, kind of dumb and unruly. Also, i'm not sure if the regex needs .* at the beginning and end to match the rest of the string if there's extra noise in the output.
>>101360280I patched the code to cache all LLM responses, so I can rerun it with the original (strict) patterns as well, in case people wanna see the results.I did remove the 'randomize the response and give it a score if it ends up correct' logic. If it can't even produce a response, it gets a 0 score period.
>>101360317Yeah, I could. There are thousands of questions though. It's taking quite awhile. Like, half an hour per subject, and there are 14 of them...
>>101360325>I did remove the 'randomize the response and give it a score if it ends up correct' logic. If it can't even produce a response, it gets a 0 score period.Seems fair. If anything it seems to be testing the model's 'luck'. It's a weird methodology.>>101360359I suggested phi3-mini because it's tiny and seems to do very well in benchmarks. More than a 4B has any right to. Run it on a single subject. It will, at least, give you a baseline of what a 'well behaved' model's output looks like.
>>101360094>I wonder what question it's trying to answer. I{'question_id': 3361, 'question': 'Discuss how the quantitative measurements of the dioxy-ribonucleic acid content of cells is evidence that DNA is the genetic material.', 'options': ['The increase in DNA content following cell division indicates that DNA is the genetic material.', 'The presence of DNA in mitochondria and chloroplasts, but not in the cell nucleus, is evidence that DNA is the genetic material.', 'The constant amount of DNA in all body cells and half the amount in germ cells is evidence that DNA is the genetic material.', 'The varying amount of RNA in cells indicates that DNA is the genetic material.', 'The ratio of adenine to thymine varies greatly between different types of cells, suggesting DNA is the genetic material.', 'The presence of histones in cells proves that DNA is the genetic material.', 'The correlation between the complexity of an organism and the amount of DNA in its cells points to DNA as the genetic material.', 'The ability to synthesize proteins directly from RNA without DNA involvement demonstrates that DNA is the genetic material.', 'The consistency of DNA sequences across different species suggests that DNA is the genetic material.', 'Polyploid tissues have multiple sets of chromosomes, which shows DNA is the genetic material.'], 'answer': 'C', 'answer_index': 2, 'cot_content': '', 'category': 'biology', 'src': 'stemez-Biology'}
>try out a "strong waman who don't need no man" card with the goal of seggs
>first few responses get pretty bad reactions even on swipes
>iterate on the strategy, trying out other different possibilities for my responses
>eventually get into a flow of using good humor and retorts to her seriousness and sass, that also don't step over the line of rudeness
>in the end, break into her shell, getting her laughing and smiling
Huh, did I just get groomed?
>>101358481Huh, is that Sora or a different model? I don't think I've seen any Luma gens of that level.
>>101360401Ah. The filter sucks. Thanks.At least the model didn't reject C :).Anyway. Give phi3-mini a go just to get a baseline. That one is well known to be trained on textbook-like data, so it should understand multiple choice better, without being actually smarter. Judging a model's "intelligence" is still difficult. Try the more permissive regex to see if you get more actual positives. Gotta split.Best of luck with your finetune.
>>101360229
Yes, if you run the following:
set -a
source file.env
set +a
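The difference between the two approaches from the question can be demonstrated like this (a sketch; /tmp/file.env is a throwaway path for the demo, and the .env file is assumed to contain plain KEY=VALUE lines):

```shell
# Write a demo .env file, then auto-export everything it assigns.
printf 'CUDA_VISIBLE_DEVICES=0\n' > /tmp/file.env
set -a               # export every variable assigned from here on
source /tmp/file.env
set +a               # stop auto-exporting
echo "$CUDA_VISIBLE_DEVICES"   # 0
```

Without set -a, sourcing the file sets the variable in the current shell only; child processes (like llama.cpp) wouldn't see it.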
I want to do lewd RP on a gaymer PC so only 8gigs of vram. Do I go for Lunaris, Stheno or Gemma?
>>101360487That's the strat. Now go fuck her silly and make her reject feminism.
>>101360930mythomax
>>101361021>>101361021>>101361021
>>101355464It's a crude prototype, but I can see the flickers of something great.
>>101358592This is the most retarded thing that I ever read.