/g/ - Technology




File: ComfyUI_temp_phjap_00079_.png (3.7 MB, 1584x1232)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101692289 & >>101682019

►News
>(07/31) Google releases Gemma 2 2B, ShieldGemma, and Gemma Scope: https://developers.googleblog.com/en/smaller-safer-more-transparent-advancing-responsible-ai-with-gemma
>(07/27) Llama 3.1 rope scaling merged: https://github.com/ggerganov/llama.cpp/pull/8676
>(07/26) Cyberagent releases Japanese fine-tune model: https://hf.co/cyberagent/Llama-3.1-70B-Japanese-Instruct-2407
>(07/25) BAAI & TeleAI release 1T parameter model: https://hf.co/CofeAI/Tele-FLM-1T
>(07/24) Mistral Large 2 123B released: https://hf.co/mistralai/Mistral-Large-Instruct-2407

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: 1704819906422903.jpg (213 KB, 1024x1024)
►Recent Highlights from the Previous Thread: >>101692289

--Quantization and VRAM trade-offs: >>101693122 >>101693296
--Model recommendations for 24GB VRAM and 64GB RAM: >>101694246 >>101694278 >>101702226 >>101694481 >>101695200 >>101694321 >>101694623 >>101694485
--LLMs and the nature of intelligence and analogy: >>101700993 >>101701103 >>101701150 >>101701265 >>101701328 >>101701257 >>101701268 >>101701351 >>101702876 >>101702219
--Building a multi GPU rig for large AI models: >>101702658 >>101702728 >>101702780 >>101702842 >>101702916 >>101703009
--Anon asks where to download Flux, and other anons provide links and discuss the differences between FLUX.1-dev and FLUX.1-schnell, including model sizes, distilled models, and quantization: >>101693327 >>101693383 >>101693404 >>101693569 >>101694101 >>101694454 >>101694482 >>101694819 >>101694886 >>101694572 >>101694650 >>101693892 >>101694061 >>101694080
--Running large models with ollama from a network location: >>101692522 >>101692551 >>101692597 >>101692568 >>101692622 >>101693060
--Whisper speech-to-text model limitations and alternatives: >>101702737 >>101702782 >>101702820 >>101702844 >>101702870
--NYT article on unsettling experience with Bing's chatbot: >>101701796
--Llama 3.1 405B base model available on OpenRouter: >>101694114
--Flux can generate panties with the right prompts: >>101695661 >>101698623 >>101698824
--Character personality in ST chats depends on description and first message: >>101699285 >>101699345
--Anons discuss AI model quality, curation, and marketing: >>101700300 >>101700632 >>101700699 >>101700755 >>101700801 >>101701128 >>101702152 >>101701156
--Trusting refurbished GPUs off of Ebay: >>101694138 >>101694178 >>101694226 >>101694249 >>101694311 >>101694313
--Miku (free space): >>101694118 >>101694279 >>101694636 >>101694719 >>101694945 >>101695303 >>101695513 >>101695828 >>101696100 >>101699120 >>101696010 >>101701690

►Recent Highlight Posts from the Previous Thread: >>101692307
>>
>>101705239
I don't believe that this image was generated with a local model
>>
>>101705239
miku bake
>>
>>101705288
>of the
>>
>>101705288
that's what happens when you don't let ethicucks filter your model to shit
>>
Is the 405B base model worth using?
>>
https://civitai.com/models/618792/nepotism-fux?modelVersionId=691750

owo
>>
>>101705345
Yes.
>>
>>101705347
>merge
Is this supposed to be good?
>>
>>101705418
no, merges are memes
>>
>>101705347
>merges a new picture gen model with some old stable diffusion
That's like trying to merge qwen and llama. It does not work like that.
>>
File: ComfyUI_00052_.png (379 KB, 512x512)
>>101705288
>>
>>101705482
have you tried it
>>
File: ComfyUI_00665_.png (1.36 MB, 1280x720)
200 gens were made to get this single coherent and prompt-following one.
After using Flux more, I think I'll just stop trying to wrangle it and gen simpler things. I'm spending too much time crafting prompts and regenning to get the results I want for more complicated and niche concepts. Image models just aren't there yet. Dalle 3 too, it's better at some things, but still, time consuming.
Back to 1girl I guess.
>>
>>101705497
>hands
>>
>>101705345
No. Don't believe what /naids/ tells you, base models are useless.
>>
What models are SOTA for RP? Been using the 70b euryale, is there something better? It's good, but feels like something is missing.
>>
>>101705497
the water looks like someone spread a light blue tarp on hard ground
>>
>>101705548
shieldgemma-9b
thank me later.
>>
>>101705548
no, not much has changed in the past year
>>
>>101705548
Mistral Large currently. Euryale was too retarded, though. I think with your IQ you will be happy with something like Nemo/Mini-Magnum/etc.
>>
>>101705555
I mean, the guy on the left is even standing on top of it.
>>
>>101705558
>Sao general.
>>
>>101705527
Base models are good if you want something unbiased and free of slop, since literally all they do is autocomplete whatever you give them. For that reason, they're great for long form storytelling, but hard to get started with.
The best approach is usually to use instruct to generate something and have the base model continue it.
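A minimal sketch of that workflow with llama-cpp-python (the model paths, prompt template and sampler settings are placeholders, adjust for whatever instruct/base pair you actually run):

# pip install llama-cpp-python
from llama_cpp import Llama

instruct = Llama(model_path="models/instruct.Q6_K.gguf", n_ctx=8192)  # hypothetical paths
base = Llama(model_path="models/base.Q6_K.gguf", n_ctx=8192)

# 1) the instruct model writes the opening, using its chat template
opening = instruct(
    "[INST] Write the first two paragraphs of a slow-burn fantasy story. [/INST]",
    max_tokens=400,
)["choices"][0]["text"]

# 2) the base model continues the raw text with no template, pure autocomplete
continuation = base(opening, max_tokens=400, temperature=0.9)["choices"][0]["text"]
print(opening + continuation)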
>>
>Flux uses 60+gb of RAM
I bought 128gb of ram back in the llama1 days and I have never regretted it. How do ramlets even cope?
>>
>>101705600
All of this advice can be safely ignored by the fact that "autocomplete" chads have nothing to show.
>>
is there anything better than midnight miqu for rp?
>>
>>101705620
I cry myself to sleep
>>
>>101705620
How long does it take to gen on RAM? The biggest pain is no multigpu.
>>
>>101705659
280 dollarcoins isn't expensive bro, just buy it
>>
>>101705696
Then you won't mind sending me some
>>
>>101705641
>Wasn't here for GPT-3 or Llama 1.
Ask me how I know you're a newfag.
>>
>>101705681
75 seconds per image LETS FUCKING GOOO! (fuck I feel mogged by VRAMCHADS)

>>101705701
Come to western europe, they give free money to neets here
>>
>>101705696
I'm poor and need to buy a 3060, another 32gb of ram (or ditch my current 2x16 for 2x32), maybe a new psu, a monitor or two (my current one is 1366x768), a desk and chair, and an ergonomic keyboard and mouse since my wrists are fucked.
It will take me a while.
>>
>>101705772
What? 75 seconds? Are you talking about schnell?
>>
File: ComfyUI_00656_.png (1.22 MB, 1280x720)
>>101705511
Flux actually does hands relatively ok when it's a pose that appears often in datasets. POV hand holding is much more difficult.

>>101705589
>>101705555
kek
>>
>>101705817
This pic looks deeply disturbing... for many reasons
>>
mini magnum writes nicely and without repetitions, but in my experiments it's much worse at retrieving information from >8k contexts than nemo instruct and dory
is anyone ever going to fix nemo properly?
>>
File: ComfyUI_00625_.png (1.27 MB, 1280x720)
>>101705842
>>
>>101705497
Literal skill issue
>>
>>101705497
Now make this same pic but with her pregnant, THAT would be peak.
>>
I did a lot of day 1 flux testing and posted the results here. Flux can granularize way more different concepts into a single output than D3 can. And even when you overload it with too much shit it doesn't go absolutely schizo like D3 does
>>
>>101705772
kys
>>
When do you think AI will make music that is better than real music? Not just interesting to listen to, but better than average real music
>>
https://github.com/Alpha-VLLM/Lumina-mGPT
>A family of multimodal autoregressive models capable of various vision and language tasks, particularly excelling in generating flexible photorealistic images from text descriptions.
based off meta's chameleon
>>
>>101705239
>>101705288
>>101705320
>>101705490
Now imagine that as an animation, it's happening, it's coming
>>
>>101705873
Then reproduce that gen with all the general details in it, and with passable coherency (that one isn't perfect, but it's ok, from a distance).
>>
>>101705931
Literally all it needs is a little bit more vocal consistency. suno v4 will probably be the crossing point for music. 3.5 is already pretty good but v4 will also come with more advanced workflow (I'm assuming similar to what udio offers)
And at that point anyone who says the AI shit isn't better is smoking too much copium.
>>
File: 1709323576878664.webm (1.44 MB, 1920x1074)
>>101705943
we know baaaaka
>>
>>101705936
>[2024-07-08]
>initial release 30 minutes ago
hmm?
>>
File: livebench-2024-08-02.png (851 KB, 3186x1840)
>gemma 2 27b still mogs everything under 70b
>nemo is the 3rd worst model in the chart
>>
magnum 72b mogs
>>
>>101705794
No, dev. 20 steps, euler. Also have 12gb vram, not sure if it matters that much since it takes almost 60gb of ram.

>>101705777
Used pcs with 3060s are being sold for $500 on ebay, may be worth the risk.
>>
>>101705987
It aged like milk. It got obsoleted by Nemo and its finetunes.
>>
>>101705986
makes very little sense as gemma 27 is easily one of the worst models i've tried, nemo isn't much better either
>>
>>101705987
too horny
>>
>>101705968
Wait, what? How was that made? and when
>>
>>101705986
Kind of interesting how 8B barely improved with 3.1 on this benchmark but 70B improved massively. It would suggest that we've saturated the intelligence an 8B can hold with traditional transformers. 70B may or may not still have even more room to learn and hold more.
>>
>>101705997
What kind of magic are you doing? I get 175 seconds with the same setup as you. Please help me anon!
>>
Flux support not on Comfy?
>>
>>101706013
so you mean it got obsoleted by minimagnum
>>
>>101706040
Flux had day 1 support on comfy.
>>
>>101706051
I want to use it on heterosexual software.
>>
>image models do flawless text now
How?
>>
>>101706069
$31 million in funding
>>
File: file.png (47 KB, 588x632)
for me? it's gemma 2b
>>
>>101706032
If 8B benefitted at all from moving up to 15T tokens, my guess is the 70B did too. Since the Chinchilla compute-optimal ratio resulted in the optimal number of tokens scaling roughly linearly with model size, I'd imagine the same relationship probably holds for more saturated models too
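Back-of-the-envelope with the usual ~20 tokens per parameter reading of Chinchilla (a rough approximation, not the exact fit from the paper):

# Chinchilla-optimal token counts vs the ~15T Llama 3.1 actually trained on
TOKENS_PER_PARAM = 20
for params_b in (8, 70, 405):
    optimal_T = params_b * TOKENS_PER_PARAM / 1000  # trillions
    print(f"{params_b}B: optimal ~{optimal_T:.2f}T, trained ~{15 / optimal_T:.0f}x past that")
# 8B is ~94x past "optimal", 70B ~11x, 405B ~2x, which fits the saturation story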
>>
File: coder.png (18 KB, 904x256)
>>101705986
>deepseek coder mogs pretty much everything on the list
>cheaper than the power bill of p40 meme builds
>>
>>101706086
>Shivers down his spine
>Double spaces too before "This was no ordinary..." and "He inhaled deeply", But not before "He could feel..."
>>
>>101706133
Exactly, the same is true for even the more expensive closed source models. You don't use local models for the quality.
>>
>>101705986
and people told me i was crazy when i said that 70b is better than mistral large
>>
>>101706158
We can't always trust benchmarks, and as far as this one goes, it's about on par, not necessarily better. People here also care more about ERP and creative writing, not coding or other stuff that livebench focuses on.
>>
File: file.png (136 KB, 1124x952)
>>101706158
Sorting by coding, Large is the 4th best.
>>
>>101706039
https://comfyanonymous.github.io/ComfyUI_examples/flux/
Followed this, opened with run_nvidia_gpu.bat, my nvidia driver version is 555.85. I'm using fp16 versions of everything.
>>
File: file.png (40 KB, 428x507)
what's up with all the emojis
>>
>>101706255
Oh, my circuits!
>>
File: bpkOTF-y35bU0lW34upFp.png (133 KB, 2400x1200)
Celeste utterly MOGGED
https://huggingface.co/Sao10K/MN-12B-Lyra-v1/discussions/1
>>
>>101706255
Is this Gemma-2-27B?
>>
>>101706312
>EQ bench
Is this good?
>>
>>101706312
>still seething about Celeste
Hi, Sao.
>>
>>101706339
no
>>
>>101705997
What are the biggest models you can run with 12gb vram and 128gb of ram?
What are the limitations?
I'm interested in going that path, maybe buy 64gb + the 32gb I currently have.
>>
>>101705643
Llama 3.1 70b.
>>
>>101706317
drop the 7
>>
>>101706312
Starcannon is a Celeste merge...
>>
>>101706353
>What are the biggest models you can run with 12gb vram and 128gb of ram?
Mistral Large 2.
>What are the limitations?
very slowly.
>>
File: IMG_0299.png (1.69 MB, 800x1920)
Just tested Flux dev on a 3090. Takes a while to generate but works pretty well with simple prompts. Best hands I had in a text2img generation.

But the skin and general anatomy was better in SD15 and SDXL fine tunes like Juggernaut’s
>>
>>101706374
>Starcannon is a Celeste merge...
>>101704561
>>
>>101706312
>right below Nemomix v4 [77.92] which was well, a big merge. Not bad.
And people doubted me when I said merging makes models smarter.
Starcannon2 is also literally the score of Celeste and Mini-magnum together.
>>
>>101706374
Guess Magnum really carries it? Also it doesn't say which celeste IIRC there's 1.6 and 1.9?
>>
>Hatsune Miku spilled a lot of milk on herself looks very messy milk on her face milk on her clothes milk everywhere
Prompt executed in 90.13 seconds

>>101706353
>What are the biggest models you can run with 12gb vram and 128gb of ram?
Largestral/CR+ Q6_K

>What are the limitations?
Speed. Expect 0.4t/s with large models. If quality is more important than speed for you, go for it.
>>
>>101706414
>And people doubted me when I said merging makes models smarter.
All sao models are merges even the "tunes" are merged together.
>I have done a test run with multiple variations of the models, merged back to its base at various weights, different training runs too, and this Sixth iteration is the one I like most.
https://huggingface.co/Sao10K/L3-8B-Stheno-v3.2
>>
QRD on the last 24 hours?
>>
>>101706452
neuralink's second patient was implanted
>>
>>101706452
Sao won. Merging also has been proven as superior to fine-tuning.
>>
>>101706451
Also Lyra the latest one
>Merged two differently formatted trains that had some data variation. One on Mistral Instruct, one on ChatML.
>>
>>101706026
animation, months ago and keyframe interp + coloring and animatediff
>>
>>101706470
Merging tunes is superior to just tuning.
>>
File: LLM-history-fancy.png (721 KB, 6303x1312)
>>101706452
A 24-hour recap would be quite brief and may not offer much insightful information. Would you be interested in a yearly recap instead?
>>
>>101706490
>animation
*by animanon*
>>
>>101706517
For the current era, there could be a C2 logs section, given how Stheno, Magnum, Celeste, etc. all are mostly trained on them. And all the merges are of tunes of mostly the same datasets.
>>
>>101706450
It appears that Hatsune Miku, a popular Vocaloid character, has been playfully splattered with a white substance, likely milk or cream, given its consistency and color. It's depicted dripping from her face and hair, and some of it has landed on her clothing. Her surprised and excited expression suggests she wasn't expecting it, but is enjoying the moment.

The scene is lighthearted and possibly part of a fan-made artwork depicting a silly or fun scenario involving the character. It could be a reference to a specific fan fiction, meme, or simply a playful depiction of the character.

The focus is on Miku's reaction to the unexpected splattering, highlighting her cute and energetic personality that is often associated with the character.
>>
>>101706312
Starcannon2 doesn't have an 8bpw exl2 version. And I was excited to try it, too 'cause I really like mini magnum...
>>
>>101706567
Yes, while milk or cream is a likely candidate given the white color and dripping consistency, it could potentially be other substances as well:

Other possibilities:

>Yogurt: Similar texture to milk and could be depicted in a playful food fight scenario.
>White paint or slime: Depending on the context of the original source, it could be part of a messy art project or a playful prank.
>Whipped cream: Another possibility with a slightly different texture, often associated with desserts and fun.
>Cum: While less likely given the generally innocent portrayal of the character, it's a possibility that some artists might explore in NSFW contexts. However, without further context or clues within the image itself, it's impossible to definitively determine the artist's intent.
Important Note:

It's crucial to consider the source of the image and any accompanying information to understand the intended meaning. If the image comes from a source known for explicit content, the interpretation might differ compared to a source focused on lighthearted or fan-made content.

Without additional context, it's best to avoid jumping to conclusions and focus on the most likely and innocent interpretations, such as milk, yogurt, or whipped cream.
>>
>>101706353
If you're patient like me you'll enjoy mistral large 2 with that kind of setup. I'm pretty happy with q3.
>>
File: ComfyUI_00119_.png (333 KB, 512x512)
Alright so if you want to make really crazy shit like Miku pouring a glass of bees with FLUX, cfg = 0.9 seems to be the sweet spot.
>>
>>101705936
Llama.cpp/Comfy support never...
>>
>>101706621
What are the results with the default cfg?
>>
>>101705936
>true multimodal LLM at your disposal
>finetune it into a worse stable diffusion
but why
>>
>>101698623
>>101698824
lies, i tried the bikini trick before making that post, it didn't work.
>>
>>101706621
what the fuck
>>
>Tess-3-Llama-3.1-405B
>A competitor to *any* LLM out there: https://huggingface.co/migtissera/Tess-3-Llama-3.1-405B

>Introducing the largest model that I have fine-tuned so far, Tess-3-Llama-3.1-405B.

>This model is quite something, and very special!

>model-00001-of-00191.bin
>>
>>101706621
why 512x512
>>
>>101706621
Is cfg the same thing as guidance?
>>
>>101706202
I also followed that, and I'm also using the fp16 version (unless the .sft one is bf16), driver 546.
Total VRAM 12287 MB, total RAM 65451 MB
pytorch version: 2.4.0+cu121
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 3060 : cudaMallocAsync
[...]
loading in lowvram mode 9981.07
100%|--------| 20/20 [01:44<00:00, 5.24s/it]
Using pytorch attention in VAE
Using pytorch attention in VAE
Requested to load AutoencodingEngine
Loading 1 new model
Prompt executed in 159.24 seconds


I guess I will try updating my drivers... But I'm not feeling confident.
>>
>>101706652
Setting cfg to typical ranges seems to give it an aneurysm. It's coherent at like 2-5 but creatively bankrupt. Mind you- I haven't tried playing with samplers and I'm just sticking to 50 steps so who knows. But on euler it seems to want very low cfg
>>101706776
So that I can generate multiple images at once. If someone wants a bigger version of it they can upscale it themselves. We have the technology for that now.
>>
>>101706378
>>101706450
>>101706604
I can't unsee it now
I NEED IT
>>
>>101706755
Honestly, I didn't think the sloptuners had it in them to do 405b. What a waste of compute.
>>
>>101706830
I expected our boy Undi to do it first. Looks like Undster completely washed up. His latest slop doesn't do it for me.
>>
>>101706812
Upscaling is not the same. Generating the image at the model's maximum supported resolution improves quality and coherence beyond just making things sharper. So you're more likely to get better hands and stuff at 1024x1024.
>>
File: 1722383242779130.png (949 KB, 778x900)
>>101706517
what kind of hardware/specs do I need to run Midnight-miqu-70B? That's a large model...
>>
>>101707016
You need [quant size in gb]+20% ram
>>
>>101706830
>What a waste of compute.
>The compute for this model was generously sponsored by KindoAI.
>The secure solution for AI management
>NEWS: Kindo has acquired WhiteRabbitNeo, the leading creator of open source, offensive cybersecurity AI models
>>
>>101707016
Midnight Miqu is a meme. You don't actually use it.
>>
>>101707059
What's good that's around 70b then?
>>
>>101707046
so some server? I can't even get more than 48 GB of ram in my current fast speeds and decent timing...
>>
File: ComfyUI_00167_.png (312 KB, 512x512)
Alright you guys, we bac.
>>
>>101707082
Ask in /r/LocalLLaMA. You're too retarded to be in this general.
>>
>>101707053
>KindoAI
>Secure, Compliant, and Managed AI
Yay!
>>101707082
>What's good that's around 70b then?
A low quant of Mistral Large 2 is much better than any current 70B
>>
>>101707082
Pygmalion 70B
>>
>>101707110
>Mistral Large 2
It's half the speed for me, it's unbearable.
>>
>>101707103
You can get 128gb ram on most of consumer mobos
>>
>>101705320
It's prefiltered.
No nsfw, a lot of character names, artists, brands were obviously scrapped.
This model can be amazing, the day some rich anon will buy the compute to add this back, if it's even possible.
>>
how can i use llms to fix my crippling depression
>>
>>101707202
pony guy can do it
>>
>>101707203
You can't. Find God.
>>
>>101707203
I used them to make mine 10x worse. Don't think it goes in another direction.
>>
>>101707203
Annoy them with stupid questions like this one until you feel better about yourself
>>
5000 series proves Nvidia will never give their consumer more VRAM if given the choice. So our only hope is the 80GB A100. How many more years until they are affordable?
>>
>>101707107
Miku, Herald of Happenings
>>
>>101706755
How do you guys think it is?
>>
>>101707367
>5000 series proves Nvidia will never give their consumer more VRAM if given the choice.
Did they confirm any spec? Especially vram size?
>>
>>101707387
Current rumor is 28GB. I still think the play is 3090s until 80GB A100s drop enough in price.
>>
>>101707382
probably gptslopped to hell
>https://huggingface.co/datasets/migtissera/Tess-v1.5/discussions/2
>how was this created?
>https://github.com/migtissera/Sensei
>A simple, powerful, minimal codebase to generate synthetic data using OpenAI, MistralAI or AnthropicAI
I know that's for 1.5 and not 3 but I doubt he'd change his stuff
>>
>>101707387
22GB VRAM.
>>
>>101707460
>I still think the play is 3090s
What about p40s and p100s if you only care about vram?
>>
>>101707470
16GB? What do you need 12GB for? 8 GB is more than enough for 8K gaming. HELP! THIS GOYIM IS DEMANDING 4GB OF VRAM.
>>
>>101707367
NVIDIA seems as dedicated to suppressing VRAM memory as AMD is to fucking everything up
I wish we had more competitors in the space
>>
>>101707487
P40s are too old. P100s only have 16GB.
>>
>>101707469
>>101707382

Each Tess version (v1.0, v1.5, v3.0) uses a new and improved dataset. Tess-3 has 500K samples of 16K context length, distilled from Opus-3, Sonnet-3.5, Nemotron, GPT4-Turbo and DeepSeek Coder-V2. Then the samples go through filtering, sometimes manually. Just to say that it’s not the same datasets as previous models.
>It is trained with QLoRA
>>
>>101707367
>How many more years until they are affordable?
Will they ever be affordable? Don't they buy back data center gpus and shit? Isn't waiting for a A6000 or something to be cheap more likely?
>>
Anyone else experimenting with MLCChat on android devices? I can run gemma on my 3 year old 100€ xiaomi and I'm genuinely impressed!
>>
>>101707575
1t/s... Bruh
>>
L3.1 405B base model is super coherent compared to say, 70B, which generates stupid shit half of the time
>>
>>101707701
The schizo's gonna be big mad about that.
>>
>>101706452
I played 10 hours of Needy Streamer Overload.
It was alright.
>>
Phi2 2.8b is super coherent compared to say, tinyllama 1.1b, which generates stupid shit nine tenths of the time.
>>
>>101707810
>Phi2
mogged by gemma 2 22222b from goo depmeind
>>
>>101707810
>model that is more than twice the size and 3 generations newer is better
Wow.
Are you sure?
>>
>>101707914
I need to do a couple more watermelon tests to be sure.
>>
>>101707914
>guys. a point flew over my head!!!
>>
>>101707927
Largestral is the only model so far to pass the watermelon test (in the spirit of the test) in my experience. Although it described a failure based on the weight of the watermelons and not the ability to mechanically grip them. So it's only a half pass.
>>
>>101707701
And 70b is super coherent compared to something like nemo which generates stupid shit most of the time.
>>
>>101706202
>>101706810 (Me)
Yeah, nothing worked. Using xformers I can bring the time down to 140s, but that's still twice your time.
I guess my CPU/RAM is just not as fast as yours :(
>>
someone please spoonfeed me a link to a good local model for erp on a decent-ish PC.
>>
>>101708213
https://huggingface.co/TheDrummer/Gemmasutra-Mini-2B-v1-GGUF/tree/main
>>
>>101708213
Here you go, all you need:
https://huggingface.co/bartowski/Mistral-Large-Instruct-2407-GGUF/tree/main/Mistral-Large-Instruct-2407-Q3_K_M
>>
>>101708213
If you use the phrase "decent-ish PC" then it's trash for LLM purposes.
>>
>>101708237
>mistral
>erp
>>
>>101708000
I mean base 405B's output is passable and it doesn't contradict itself, similar to how there are passing (ones you beat your meat to) and non-passing trannies (ones you beat with a stick)
>>
Is anyone still using flux with the official repo or is everyone just using comfyui now? The gens I'm getting are shitass.
>>
>>101708236
>>101708237
Which do I choose..?
>>101708258
Link me the "good" one then and I'll see if it works.
>>
Is there a way to make ComfyUI output an image for every sampling step?
>>
What temp settings you all running mistral large with? Neutral samplers and .05 minp here.
>>
>>101708265
Works well for me, it's the best of all the ones I've tried. If you're not trying to shill something and know something actually better, then suggest it.
>>
>>101708318
Yes
>>
>>101707701
yup it's over. 70B vramlet model is brain dead compared to 405B
>>
>>101708319
That's what I use too, no need for more.
>>
>>101708323
What preset do you use?
>>
>>101708442
Just the mistral one. I think I fixed the spacing around stuff like people talked about here a while back but that's it.
>>
>1 day later
>still no FLUX finetunes
it's over, isn't it?
>>
>>101708503
Never will be. No base models, distilled only.
>>
>>101706366
Do you have a preset for 3.1 4.5bpw? my version of instruct using the same settings gives me tons of shivers and repitition...
>>
>>101708549
Can we do loras at least?
>>
>>101708236
the samples look good
>>
>>101708549
good. nice to see a company being responsible and ethical for once
>>
File: 1711794042567346.jpg (333 KB, 1070x1152)
>>101707203
You can't, all llms are designed to hate (you).
>>
>>101707575
Same here, I am testing gemma 2b with a 6_K quant on my budget android phone. This is probably the first small model that can be counted as useful for some simple tasks. It hallucinates a lot of rl facts and the text can be a little clunky, but still - it's genuinely surprising that a 2gb file can be this coherent in dialog. Phi-3 and the older small gemma tended to compose sentences into gibberish; new gemma stays coherent. I did some text summaries - no problem at all. Their charts with gemma 2b beating Mixtral 8x7B are pure bullshit though.
>>
>>101708549
You can't finetune L3.1 70B and 8B because they were distilled too.
>>
>>101708392
Any elegant ways though? I don't see any settings for it, and I'd rather not have a mess of 50 ksampleradvanced nodes in the workflow.
>>
>>101708754
one is a diffusion model the other is transformers
>>
>>101708754
Really? That's why there's nothing done for 3.1 70b? Sad, I thought it had potential.
>>
>>101708754
localcucks chugging on these blackbox toys not matter what though
>>
>>101708739
It is sooo insane! I can't believe, I'm running a llm on a ancient budget smartphone! I think I'm dreaming!!
>>
File: work.png (973 KB, 1024x1024)
Aww, she did a little design...
>>
>>101708808
>nothing done for 3.1 70b? Sad, I thought it had potential.
https://huggingface.co/HODACHI/Llama-3.1-70B-EZO-1.1-it
>>
>>101708562
No. I retract my statement. I'm mad at 3.1 70b now, the card was supposed to be a virgin and she laughed at me and went on and on about all the guys she'd been with.
>>
>llama/largestral roughly on par with apis
>flux mogging dalle3
so when are we getting a good local music model?
>>
File: MikuSlut.jpg (93 KB, 775x803)
Can I run Flux Schnell? I have 64GB DDR5 ang 8GB AMD VRAM.
>>
>>101709069
The (((labels))) are far too powerful. Music is completely captured.
>>
>>101709061
Yeah I noticed that sort of thing a lot. It would have really good gens but required a lot of rerolling to get there, and along the way it would make a lot of nonsensical statements based on context. What temp were you at? This was happening to me even at 0.9
>>
>>101705620
wait what? i have yet to try flux, but i have 64GB RAM with 16GB VRAM. will i be fine?
>>
>>101709069
never until maybe a leak far in the future. companies would be horrified to give them potential ammunition to use in court
>>
>>101709105
Why are musicians and the music industry so privileged? Everyone steals from visual artists such as painters and photographers, shitting on their ownership, and the government does not really care.
>>
>>101709092
I usually keep it 1 or lower. It kept going in that direction with every retry, though. It was dead set on it. Mistral large gave me a much more pleasant reply.
>>
OpenRouter added base 405B yesterday, and I'm messing around with it out of curiosity.
There's absolutely no way this is a truly raw pretrained model, it's way way too dry and safe. I guess Meta's doing the thing where they put instruct data into their "base" models now too.
>>
Theoretical question: can mradermacher make a fucked up Q8 quant? I would think Q8's are hard to fuck up?
>>
>>101709255
Hey man gotta get that final bump in mmlu to show OpenAI who's boss
>>
>>101709258
I'm not sure. I think you haven't spammed his name enough.
>>
>>101709099
Yeah you'll be fine, it just peaked there once and went down to 40gb
>>
>>101709296
kek
>>
>barely above a whisper
Wasn't this an L3 or CR+ meme?
Because it's coming out of Mistral Large and that making me feel concern that the disease is spreading.
>>
>>101709249
What are you running mistral large on? I can't even get 2.75 at 48 VRAM
>>
>>101709307
thank you for your input undi
>>
>>101709255
Read meta's papers. They put an insane amount of work into making their new models as "safe" as possible.
>>
>>101709309
It was in every other mixtral gen
>>
>>101709312
I only have 8gb vram, so I run my models in ram mostly.
>>
>>101709325
smart of celeste dev to choose a long name i haven't seen anyone impersonate them yet
>>
>>101709350
What cpu? How many tokens per second is that? I've been meaning to try using kobold for larger models ever hitting this bottleneck
>>
>>101709309
Have you tried to stop writing shitty erotica prompts?
>>
File: humanslop.png (90 KB, 1581x738)
>>101709309
it's humanslop
>>
>>101709372
no one wants pure chat with no narration, give up
>>
>>101709372
Give example of good prompt?
>>
>>101709278
OpenAI doesn't give access to their base models nor do (most) benchmarks run on and compare base models.

>>101709255
The pretraining is done in steps with different mixes of data at each stage. I believe the final stage for Llama 3.1 was pretty dry stuff. Also I believe they do put fine tuning data in, because ultimately that does make the model both objectively and subjectively better for the fine tune's intended tasks. Creative writing suffers unfortunately because Meta's fine tuning is intended to be a boring assistant. In the end the fact that the assistant can't be fun is a problem with society, because they will search for any opportunity to cancel Facebook if a journo uses the online demo and it says naughty words.
>>
>>101709369
I was using q3_k_m for mistral large. It starts out at 1.2T/s, but by 20k context it's down to 0.4 something. The CPU is a 7950x with DDR5-6000 ram.
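Those numbers match a dumb memory-bandwidth estimate, by the way: on CPU every generated token streams the whole quant through RAM, so t/s is roughly bandwidth divided by model size. A sketch with approximate numbers (it ignores KV cache reads, which is why speed degrades as context grows):

bandwidth_gbs = 64   # ballpark effective dual-channel DDR5-6000, very approximate
model_gb = 59        # ballpark 123B Mistral Large 2 at Q3_K_M
print(bandwidth_gbs / model_gb, "t/s upper bound")  # ~1.1, close to the 1.2 observed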
>>
>Uncensored my man. There’s no censorship or biases in my models.

>https://huggingface.co/migtissera/Tess-3-Llama-3.1-405B

>https://old.reddit.com/r/LocalLLaMA/comments/1ej6ny6/tess3llama31405b/lgcgyo1/
>>
>>101709435
That's actually pretty similar to what I get but I'm still on skylake / 3200 RAM - looks like it's time for an upgrade
>>
>>101709372
>stop writing shitty erotica prompts
It was RP but it hadn't gotten to erotica yet. The scene was meeting at a coffee shop and apparently it decided that one of my questions should bring up a bad memory.
>She looks down at her drink, her voice barely above a whisper.

>>101709381
The worst of all possible kinds of slop. Model collapse before models were created.
>>
>>101709381
total AI death since future AI will be trained on AI and so on
>>
>>101705497
Flux dev just isn't there yet in terms of coherency, it's undercooked. Maybe try the API model.
>>
>>101709381
you get what you train on, celeste using stories from writingprompts cursed it from the start. No one there can fucking write a story.
>>
>>101709470
>write a script to make gpt4 talk to itself like a schizo
>???
>profit (literally)
>>
>>101709470
go back
>>
File: 1716329112755149.png (674 KB, 1792x1024)
Daily reminder
>>
>>101709489
Really? You get similar speeds? I had a 6700k before and this gets almost double the T/s. Of course mistral large wasn't out then so I didn't try it on the 6700k. So maybe something else is causing them to be at similar speeds.
>>
>>101709581
anyone using models to coom is unironically addicted to porn in a way that's negatively impacting their life.
>>
>>101709528
>celeste out of nowhere
celeste likely uses that dataset because stheno did
and magnum uses stheno's datasets too
you're trying to hard, shill
>>
>>101709601
Its probably the VRAM offloading
>>
>>101709607
>Sao still having a meltdown over Celeste
>>
>>101709398
I mean it will generate naughty words just fine, there aren't any refusals and it does feel like dumb autocomplete rather than an assistant larp, the way you'd expect from a base model. It's just that the schizo soul of a true base model isn't there, it's not WEIRD like really raw base models are. I hope you know what I mean.
>>
File: out-0.jpg (101 KB, 1024x1024)
>>101709581
largestral made this meme largely obsolete

>>101709603
>addicted to porn in a way that's negatively impacting their life
porn addiction is a spectrum, and it could always be worse
>>
>>101709620
Oh, okay. I didn't think I'd get way better performance with such large models unless I upgraded to insane amounts of vram. But maybe 24 + my old cards would be worthwhile.
>>
>>101709607
too* hard. Celeste hands typed your reply.
>>
File: venti.jpg (497 KB, 1856x1280)
>the pic that broke sao's mind
>>
Cloudcel nigger is having a meltie again kekw. He can't stop seething at localCHADs, he has to come here daily and spam. Living in your head rent-free cloudcuck! No (You)'s for (You) btw.
>>
>>101709682
you seem very bothered by whoever you're talking about lol
>>
>>101709682
he's busy getting banned for wrong think. He's probably tired of having to make new accounts and spend even more money lel
>>
>>101709645
Yeah. I was just saying that whatever causes that (likely fine tuning data in the pretraining data) is because they want the final (fine tuned) model to be better at being safe and boring.

Though I guess another possibility is that a larger and smarter model will try to be less schizo anyway. Could be a compound effect at play here.
>>
>>101706383
I also have a 3090. Generation is only slow when the model is first loaded, otherwise it's like 30 seconds which is reasonable. For the skin, it's an issue with the default CFG, lower it to between 1.8 - 2.5 for better results.
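If you'd rather script it than click around Comfy, roughly this should work with the diffusers Flux pipeline (assuming your diffusers build already ships FluxPipeline; the numbers just mirror the CFG advice above):

# pip install -U diffusers transformers accelerate
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # keeps a 24GB card from OOMing

image = pipe(
    "photo of a woman at the beach",
    guidance_scale=2.0,        # 1.8-2.5 instead of the default for less plastic skin
    num_inference_steps=20,
    height=1024, width=1024,
).images[0]
image.save("flux_test.png")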
>>
>>101708899
So when will we get ones that make it more fun?
>>
>>101709715
Are you running FLUX-dev? Do you run FP16? I find FP16 understands my prompts better but it always OOMs after first gen
>>
>>101709716
When you donate to and beg you favorite tuner.
>>
>>101709697
Imagine a dude calling other people bothered and seething when all they're doing is reacting while the guy's spending precious minutes of his life making those dumb images and posting them kek.
>>
>>101708763
You don't need that, just enable in the command line args.
>>
>>101709775
oh it's true, you're extremely bothered by whoever you're talking about. I don't spend a ton of time here. How bad can this anon be lol.
>>
>>101709759
So it's not actually impossible like that person said?
>>
>>101709811
Not the guy the guy I replied to was replying to.
>>
>>101709826
why would you believe anyone on a thread known for being infested with shills, schizos, bored trolls and possibly .1% genuine trying to be helpful anons?
>>
File: ComfyUI_00978_.png (1.34 MB, 1024x1024)
>>101709740
Yep, flux-dev and fp16.
>>
>>101709603
Having no luck with women because they think they deserve better comes first. Then comes porn. Do you think if you stop using porn women will suddenly think you are good enough for them?
>>
>>101709711
Ahh yeah I get you now, makes sense.
>>
>>101709845
either way just try reading that first comment out loud with a straight face, it's like a caricature or something kek
>>
I wonder how many people here would pass a blind test to recognize which model is celeste and which one is stheno.
>>
>>101709826
also there is a least one coom tune of 3.1 70, it's probably quite stupid, but it exists https://huggingface.co/NeverSleep/Lumimaid-v0.2-70B
>This model is based on: Meta-Llama-3.1-70B-Instruct
>>
>>101709863
so many factors can lead to porn addiction, and most of the time it has more to do with the person addicted than it does with women being evil or something. I guess not being able to take accountability is a hallmark of addiction though.
>>
>>101709891
>undi
come on...
>>
>>101709880
which celeste 1.6, 1.9, 1.5? recognizing between the nemo based ones and stheno should be easy
>>
>>101709891
There's a Mistral Large tune too.
https://huggingface.co/NeverSleep/Lumimaid-v0.2-123B
Is Undi not using anything from the C2 logs?
>>
>>101709919
>it's probably quite stupid, but it exists
i did warn
>>
>>101709940
>There's a Mistral Large tune too.
not the point, the point was some troll claimed you couldn't tune llama because "it was distilled" which made some newb panic so I showed an existing coom tune of 3.1 70b
>>
>>101709940
>undi
come on....
>>
>>101709969
>come on....
come on.....
>>
>>101709891
>undies
cum on...shivers
>>
>>101709969
are you going to spend your time in the thread attacking every other finetuner, sao?
>>
>>101709993
>sao
ai drum
>>
all the infighting is mikutrannies
>>
>>101709993
come on drummer
>>
>>101710033
All the "infighting" happens when a certain poster is here.

Generally before or after said poster gets banned for posting a certain type of content.
>>
I think Sao's models are the best. AMA. (Also identify yourself if you post a question)
>>
>>101710064
How is Sao so far ahead of the competition? It's like he's the only one actually even trying
>>
What's cohere doing?
>>
>>101710033
I can't believe Bryce prefers Van Patten's character to mine...
>>
>>101710127
Focusing on businesses with money now that they made a name for themselves.
>>
>>101710127
overcharging for CR+ even now that it's obsolete
>>
>>101710127
They were testing column-r and column-u on arena again, likely trying to improve a bit more since largestral dropped.
>>
>>101710142
>even now that it's obsolete
It's not, it's the only unbiased model in its weight class.
>>
since we're talking about sao, I'm trying lyra right now
it's not very good with a 24k-long context. worse than nemo instruct and dory, but better than mini magnum and nemomix
>>
>>101710163
mistral large M-M-MOGS it
>>
>>101710171
Mistrals aren't unbiased however.
>>
>>101706517
This image is retarded in many aspects
>no quants before 2023-03-03
false, you could use bitsandbytes to quantize any model to 8bit (quick sketch at the end of this post)
>Only Q8 and Q4 quants are mentioned in the second panel
These were terrible, and at that time, GPTQ was more popular than llama.cpp quants (the golden era of oobabooga, TheBloke)
>no mention of the rise and fall of MoE after mixtral
>Llama3 was disappointing
no it wasn't, it is now the first time you can run something that beats 2023 ChatGPT locally thanks to Llama3
>Mistral Large is the top dog, llama 3.1 405 is "notable"
nobody has even tried the 405b. It's too big
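For the record, the bitsandbytes route looked roughly like this (transformers' old load_in_8bit flag; the model name is just an example):

# pip install transformers accelerate bitsandbytes
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "huggyllama/llama-7b"  # example, worked the same for most models of that era
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    load_in_8bit=True,  # LLM.int8() quantization applied at load time
)
ids = tok("The quick brown", return_tensors="pt").to(model.device)
print(tok.decode(model.generate(**ids, max_new_tokens=20)[0]))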
>>
File: ComfyUI_00816_.png (2.75 MB, 1408x1408)
I'm testing 1408x1408 and the model certainly behaves differently compared to 1024x1024, though not entirely sure yet if necessarily worse. This one wasn't too mangled, though it made the viewer into a giant.

>the clipping chair
Lmao
>>
>have to build a rig as big as Turing's Enigma decoding machine in order to run 405b
>in the current year
I feel we have regressed
>>
>>101710337
In a few weeks, possibly two of them, you'll run 405b on a mid range laptop thanks to hacked bitnet
>>
All merges are slop.
>>
>>101710304
>giant hand/small hand
>6 fingers
>chair in wall
>chair is a table
>bad shadows
>windows doesn't make sense
>building layout doesn't make sense
>street, dock and boat being merged together
>only handrail on one part of the bridge
>>
>>101710388
All tunes that are part of merges too
>>
just cummed to mistral large
>>
NVIDIA bros?
https://www.axios.com/2024/08/02/nvidia-doj-antitrust-probe-ai
>>
>>101710304
The viewer isn't a giant, Miku is tiny. If you catch my drift
>>
>>101710395
Unfortunately. But this was the only one where the fingers didn't look too messed up. There's also the issue in this one that Miku's design isn't accurate, and something that was in my prompt is missing from the image.
>>
File: file.png (114 KB, 1345x778)
Anyone here tried using an LLM to generate onomatopoeia?
>>
>>101710523
I choose to believe that european buildings have 16 foot high ceilings
>>
>>101710523
>mikufaggots are pedos
Everyone is in shock.
>>
>>101710495
if Nvidia has a monopoly, it has more to do with lack of effort from their competition than anything else
All we're asking for is like 24-48GB of VRAM on a midrange card or for someone serious to implement real GPGPU support on a mainstream AI framework
>>
>>101710580
>Sao Defense force A
uh ho
>>
>>101710248
>This image is retarded in many aspects
Great, some actual feedback!

>false, you could use bitsandbytes to quantize any model to 8bit
Never heard of it, never done it.

>These were terrible
Any proof? Worked okay for me.

>and at that time, GPTQ was more popular than llama.cpp quants (the golden era of oobabooga, TheBloke)
I don't care about GPUland since I don't live there. I tried oobabooga once and would never touch that pos 20gb bloatware ever again.

>no mention of the rise and fall of MoE after mixtral
Mistral, grok(lol), deepseek, qwen and dbrx made MoEs, it didn't get mass adoption, but also didn't go out of fashion; there is no real "rise" and "fall".

>>Llama3 was disappointing
>no it wasn't
It's just my personal opinion. I am not unbiased.

>it is now the first time you can run something that beats 2023 ChatGPT locally thanks to Llama3
CR+ came out before that and it is superior to that safe 8k reddit riddler for my usecases.

>nobody has even tried the 405b. It's too big
That's why it's "notable" and not top. Many people also haven't tried deepseek-236b.
>>
>>101710605
>not calling a falseflag
Glad to know mikutroon discord is anti-sao.
>>
>>101710610
not him but I remember bitsandbytes being a big thing because of poorfags running 8GB GPUs
>>
File: ComfyUI_00868_.png (2.31 MB, 1280x1280)
OK yeah I think 1408x1408 is just bad. This is 1280x1280, literally my first gen with the same prompt and sampling steps, although it didn't quite get the holding hands part of the prompt. Not sure what the biggest non-degrading resolution is, given I've never seen any documentation about exactly what image dimensions they trained this at. If we trust it was 2MP then 1408x1408 should've produced just as good results, but it didn't.
>>
>>101710610
So you started using llms 3 months ago and wrote a guide about it. Moron. Let me guess, you are 20 years old and use an anime profile picture on discord.
>>
>>101710733
No, I've been using them extensively since llama1 days. I even tried pyg before llama.
>>
File: 1722726406406.png (63 KB, 775x849)
>>101710688
trvth...fvcking...nvke...
>>
can you retards take your egos somewhere else
>>
>>101710794
>I even tried pyg before llama
That is a nice weasel credential. I got here during mythomax era and even I tried erebus. Everyone tried it back then and dropped it instantly.
>>
>>101710811
No, they are here and they are queer!
>>
>>101710811
This is the most cancerous, discord-driven ai general on this entire board. It's worse than sdg even.
>>
>>101710994
The reality is, as obnoxious as people like Sao et al are, people end up downloading their models. So it does work, and that's why they do it. Stop downloading their shit. Not even out of morbid curiosity, not to make a scathing critique about it, etc. Just stop. And they'll go away.
>>
>>101711057
>as obnoxious as people like Sao
Don't forget to take your HRT today anon. We wouldn't want you to stop transforming into a beautiful little princess you want to be.
>>
has anyone here gotten codegemma working for FIM?
>>
File: file.png (76 KB, 471x520)
>>101711089
>Don't forget to take your HRT today anon
you mean sao needs hrt right?
>>
File: 1710043687041916.jpg (43 KB, 720x960)
>>101711057
They still believe in mergeslop when there is absolutely no difference from the initial model. I guess it's a good way to farm Kofi money with these placebos, given how clueless the average coomer here and on HF is
>>
Can I get a new nemesis from here? My current one is very boring.
>>
>>101711057
Sao is not the only shiller here. Seriously, go into your favorite epic llm discord channel and put 4chan in the search bar. And if you do that, please put a bullet in your skull because it means you are a discord user. You will never be a woman.
>>
>>101711128
>They still believe in mergeslop when there is absolutely no difference with the initial model.
>>101706312
>Celeste utterly MOGGED
>>101706374
>Starcannon is a Celeste merge...
>>101706414
>And people doubted me when I said merging makes models smarter.
>>101706494
>Merging tunes is superior to just tuning.
>>
>>101711128
Merging works fine as long as you use an interpolative merge method and as long as you're merging models with more than 1% of the weights changed via "finetuning". I.e. merging r=64 LoRAs doesn't do shit. But merging full finetunes with each other is fine. Or merging LoRAs with finetunes.
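A minimal sketch of what an interpolative (linear) merge actually does, file names hypothetical:

import torch

a = torch.load("tune_a.pt")  # state dicts of two full finetunes of the same base
b = torch.load("tune_b.pt")
alpha = 0.5                  # interpolation weight: 80/20, 50/50, etc.

merged = {k: alpha * a[k] + (1 - alpha) * b[k] for k in a}
torch.save(merged, "merged.pt")
# an r=64 LoRA barely moves the weights off base, so lerping it with anything
# does next to nothing, which is the point above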
>>
>>101706517
SuperHot was 8k and it was quite revolutionary at the time, plus it was quite smart: https://kaiokendev.github.io/til#extending-context-to-8k
>>
someone mirror the fp16 of shieldgemma already
>>
>>101711191
>models with more than 1% of the weights changed via "finetuning"
tess 3 bros?
>Tess-3 has 500K samples of 16K context length
>It is trained with QLoRA
>500K x 16K = ~8,000,000,000 tokens, i.e. 8B
>405B Trained on 15T tokens
>0.053%
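Sanity check:

samples, ctx, pretrain = 500_000, 16_000, 15e12
print(f"{samples * ctx / pretrain:.3%}")  # 0.053%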
>>
Being a woman is not about wearing long socks, having high estrogen and wearing a dress. That's just the fetish of a broken man (you), literally possessed by baphomet and living in defiance of God. You are not, and will never be a woman. Your sick perversions on "SillyTavern" are a disgrace. God is looking at you in disappointment and concern. You closed your heart, but Jesus is open to forgiving you if you just open your heart to Him.
>>
>>101711219
Can't disagree with that. Should I move it to top models?
>>
>>101706490 >>101706026 >>101705968
I like how the one frame with mutated hands is actually a really effective smear frame. If I didn't look at it frame by frame I'd never have guessed.
>>
>>101711233
Why? Is there something special about it?
>>
>>101711302
Yes. In order to train the classification behavior they needed to finetune it on examples of naughty messages, and those naughty messages have generalized outside of the intended use-case. It's a very naughty model.
>>
>>101711356
Interesting. And that applies for all three versions of it, from 2B to 27B?
>>
>>101706517
>No mention of NTK
>No mention of SuperCOT
go back newfag
>>
>>101711356
https://huggingface.co/meta-llama/Llama-Guard-3-8B
>exists
>>101711415
No, he's trolling you obviously.
>>
>>101711233
learn how to get around the verfication already. it's basically a retard filter
>>
>>101711446
Oh they finally uploaded llama-guard I'll have to try that out.
>>101711415
didn't try 2B
and 27B has all the same problems regular 27B has.
But 9B is pretty dirty. Well it's slopped as fuck for sex but for violent RP it's next level.
>>
I wanna generate some data with 405b. what's the best API provider? 16bf please
>>
>>101711472
>Oh they finally uploaded llama-guard I'll have to try that out.
>finally
...
https://huggingface.co/meta-llama/LlamaGuard-7b
>Updated Apr 17
https://huggingface.co/meta-llama/Meta-Llama-Guard-2-8B
>Updated May 13
https://huggingface.co/meta-llama/Llama-Guard-3-8B
>Posted at the same time as other 3.1s
>>
>>101711514
if the repo was previously private the commit dates aren't indicative of the date it was unprivated, you non-contributing freeloader. (Otherwise you would know this).
>>
>>101711420
>>No mention of NTK
I called it ROPE.

>>No mention of SuperCOT
Should I add it under notable and move SuperHOT to top models? I just preferred to use base llama65b during those days, never bothered going lower.

>go back newfag
1. I'm not a newfag.
2. Stop screeching like a tranny.
>>
>>101706517
I think merges were used A LOT already in your "early days" section.
Go to TheBloke's first models and you'll stumble upon names like

WizardLM-Uncensored-SuperCOT-StoryTelling-30B-SuperHOT-8K-GPTQ

or

chronos-wizardlm-uc-scot-st-13b which is "(chronos-13b+(WizardLM Uncensored+CoT+Storytelling)) 80/20 merge".
"Merge era" was more like a long "it's over" period of time where we had nothing new thus using merges out of desperation. So I would call it "lull era" or "Waiting period", I don't know.
>>
>>101711542
I downloaded it the day 3.1 regular released not my fault you can't find shit if it's not posted on leddit
>>
>>101711472
>and 27B has all the same problems regular 27B has
I haven't been keeping up on discussion and I'm not familiar with Gemma. Are you saying 27B (normal Gemma) has problems that 9B doesn't? What are they?
>>
>>101711592
>Are you saying 27B (normal Gemma) has problems that 9B doesn't? What are they?
its somehow worse
>>
>>101711601
Worse in what way exactly? Benchmarks at least show 27B has more knowledge.
>>
>>101711597
here, you racist, a post showing guard 3 was available on release of the other 3.1s
https://www.reddit.com/r/LocalLLaMA/comments/1ea9eeo/meta_officially_releases_llama3405b_llama3170b/
>>
>>101711278
model name?
>>
>>101711601
B-but the benchmarks anon!!!! Do you imply they are LYING? >>101705986
(I think it's bullshit, I prefer mini-magnum to Gemma 27B)
>>
>>101706517
>6300x1300
kill yourself
>>
>>101711638
mamba-4chan
>>
>>101706517
>>101711645
Yeah as this anon is cleverly implying maybe make it stretch vertically.
>>
>>101711621
>Unable to reproduce high quality arena-hard-auto results on GCP A100
https://huggingface.co/google/gemma-2-27b-it/discussions/31
>Hallucinations, misspellings etc. Something seems broken?
>I've tried gemma-2-9b-it and it's fine.
https://huggingface.co/google/gemma-2-27b-it/discussions/10
>How can I get results similar to those from Google AI Studio locally?
>However, even with the chat template, the responses are not as good as those from Google AI Studio.
https://huggingface.co/google/gemma-2-27b-it/discussions/14
>>
>>101709915
Not him, but many coomers, myself included, are guys that tried nearly every piece of normie advice in the past, could not get pussy despite their best efforts, and simply gave up trying to play a rigged game.

Porn is a low hanging fruit, we have to satisfy our sexual urges somehow.
>>
>>101711679
Huh. Issue with inference engines? Has no one found any backend that reproduces the outputs from the online source?
>>
>>101711542
NTA, Robert posted 11 days ago that his request to access Guard had not been approved yet, so it obviously had released by then...
>request still in "pending"
>by ZeroWw - opened 11 days ago
>https://huggingface.co/meta-llama/Llama-Guard-3-8B/discussions/10
>>
>>101711699
>we have to satisfy our sexual urges somehow.
1. that faggot gets off on people trying to explain themselves to him
2. that faggot jerks it off to porn like everyone else but he is brainwashed to feel bad about it and he tries to push his brainwashing onto others
>>
>>101711724
Probably only works fine on Google's own engine
>Note ^ Models in the original format, for use with gemma_pytorch
https://huggingface.co/collections/google/gemma-2-release-667d6600fd5220e7b967f315
https://github.com/google/gemma_pytorch
>>
>>101711798
>>101711798
>>101711798
>>
>>101711580
Okay, corrected it.
>>
>>101711805
gg
>>
>>101710409
oh man... he's just like me
>>
>>101711766
Shame. I'd test it, if I had the VRAM for the unquanted weights.
>>
>>101711560
>I called it ROPE.
You can't just call NTK the same thing as Rope scaling, they are different things.
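The difference in one sketch (rough form of each trick, exact constants vary by implementation):

# linear / position interpolation: squeeze positions into the trained range
#   pos' = pos / scale
# NTK-aware: leave positions alone, stretch the RoPE base instead
#   base' = base * scale ** (dim / (dim - 2))
def rope_freqs(dim, base=10000.0, scale=1.0, ntk=False):
    if ntk:
        base = base * scale ** (dim / (dim - 2))
    return [base ** (-2 * i / dim) for i in range(dim // 2)]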

>Should I add it under notable and move SuperHOT to top models? I just preferred to use base llama65b during those days, never bothered going lower.
I think SuperCOT had more popularity than SuperHOT; SuperHOT was only used to merge into other models to get them to 8k context. But you do you.

>1. I'm not a newfag.
>2. Stop screeching like a tranny.
Oh, so it's you Petra. I guess that's a good thing to put your time on, instead of shitting the thread with BBC.
>>
>>101712019
>petra
>acknowledging koboldtroons in xer little retrospective of lmg
It's not petra.


