/g/ - Technology


Thread archived.
You cannot reply anymore.




File: LateSummerEveningMiku.png (1.3 MB, 944x1168)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102279239 & >>102272041

►News
>(09/06) DeepSeek-V2.5 released, combines Chat and Instruct: https://hf.co/deepseek-ai/DeepSeek-V2.5
>(09/05) FluxMusic: Text-to-Music Generation with Rectified Flow Transformer: https://github.com/feizc/fluxmusic
>(09/04) Yi-Coder: 1.5B & 9B with 128K context and 52 programming languages: https://hf.co/blog/lorinma/yi-coder
>(09/04) OLMoE 7x1B fully open source model release: https://hf.co/allenai/OLMoE-1B-7B-0924-Instruct
>(08/30) Command models get an August refresh: https://docs.cohere.com/changelog/command-gets-refreshed

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
►Recent Highlights from the Previous Thread: >>102279239

--Paper: Research paper on generating manga transcripts for visually impaired individuals: >>102280692 >>102280730
--Tensor parallelism suggestions for faster multi-GPU inference: >>102286040 >>102286128 >>102286188 >>102286201 >>102286259 >>102286314 >>102286434
--Reflection mechanism for self-correction in AI models: >>102281432 >>102281592 >>102281645 >>102281669
--Nous Research report on DisTrO (Distributed Training Over-the-Internet) for decentralized AI: >>102279338
--Llama.cpp's --n-gpu-layers param renamed to --gpu-layers: >>102281332 >>102281350 >>102285261 >>102288703 >>102288735 >>102288765 >>102288773 >>102288793 >>102288923
--Banned strings implementation in exllamav2: >>102280309 >>102281079 >>102281119 >>102281165 >>102281218 >>102281301 >>102284106 >>102284767
--Anon asks about AI agent concept, discussion ensues about machine learning: >>102281514 >>102281651 >>102281727 >>102281816 >>102281935 >>102281793 >>102282028 >>102281798
--Setup advice for multiple GPUs with llama.cpp and NVIDIA GPU performance management: >>102280316
--Mistral nemo and offloading tradeoffs for 8gig vram: >>102283047 >>102283078 >>102283093 >>102283099 >>102283131
--Mistral Large 2407 q4 starts making typos around 40K context: >>102280852 >>102280886 >>102284207
--Anon asks for help with chatbot personality: >>102286317 >>102286397
--3090 Ti purchase discussion and advice on waiting for 5000 series or buying cheap 3090s: >>102279961 >>102280000 >>102280168 >>102280197
--Using ChatGPT to improve character cards and weak models: >>102280039
--Glaive exploration and questions about underlying model and tuning: >>102287872 >>102287941 >>102287963 >>102287994 >>102288123 >>102288125 >>102288169 >>102288184 >>102288122
--Auto-captioning with FLUX lora and preventing text overflow: >>102285691
--Miku (free space): >>102280424 >>102284024 >>102284050 >>102284477

►Recent Highlight Posts from the Previous Thread: >>102279243
>>
Round2
>https://www.reddit.com/r/LocalLLaMA/comments/1fc1fez/updated_benchmarks_from_artificial_analysis_using/
https://xcancel.com/ArtificialAnlys/status/1832806801743774199?s=19
>>
File: 58 Days Until November 5.png (1.82 MB, 1616x1008)
>>
File: file.png (76 KB, 832x497)
>>102290385
>Please let us run our scams in peace for enough time so people forget we scammed them
>>
>>102290290
Yeah. I've made my own custom datasets before, so I know how utterly absurd an amount of labor goes into them. If you attached a salary to that, it would cost more than the cloud compute to train a model. So if whatever pipeline they have set up is actually good and reliable, then all of the free stuff is basically a loss leader. People flooding HF with their free models and datasets is easy publicity. Which means the whole Reflection model debacle is probably also just free publicity. In the end, whales who know what they're looking at can see through all the obtuse explanations and think "wow, that probably took a pretty complicated dataset that he only had to type a few lines of specification to have made for him."
>>
>>102290385
How quickly the redditors turn to "oh no, we were toxic to him, he's actually a good guy"
>>
Hello friends, I work as a data scientist at a company, and they recently gave me the chance to upgrade my PC with no budget constraints. Should I buy a multi-socket AMD build or put the money into high-end GPUs? Thanks in advance.
>>
>>102290547
>no money problems.
Is half a million dollars doable? How many zeroes are we talking? What are your goals?
>>
File: OverlyAttachedMikuBrunch.png (1.38 MB, 944x1168)
Good morning, /lmg/
>>
>>102290712
morning
>>
>>102290671
As long as it fits within my electrical installation, anything should be fine, and that's probably nothing special (a 230V/20A circuit, etc.).
>>
>>102290712
>>102290755
I want to cut off her ponytails and force her to dye her hair brown.
>>
>>102290547
You gave us no information about your actual requirements. If you're just going to be a "data scientist", get any prebuilt desktop.
>>
>>102290882
kys
>>
>>102290385
>@mattshumer_ has shared access to a privately hosted version of the model that does not suffer from the issues with the version publicly released on Hugging Face
It's all according to the plan kek
>>
>>102290924
I can just imagine him screaming at his team that he's done his best to buy them an extra week and that they need to "get it done yesterday" lmao
>>
>>102290885
Sorry, I don't want to give much information because it could hurt me. I'm just looking for hardware that helps me work comfortably with machine learning.
>>
>>102290411
>This is the reason I am so positive about it, and defending lol, it hurts me when people say it's far worse due to a broken HF model.
>>
>>102290973
>I'm just looking for hardware that helps me work comfortable with machine learning.
buy a bunch of GTX 970s
trust me bro
>>
>>102290924
>Run a "private API model" (actually a Claude proxy with a custom prompt) for a little bit to calm some people down
>Log the requests and people's test riddles etc.
>Turbo-cook a model overfitted on those tests and riddles
Problem solved
>>
what is the correct way to install llama.cpp
>>
>>102290973
If you're just looking for something to learn the fundamentals and first principles on then a chinese raspberry pi knockoff is sufficient. Just use whatever you're typing on right now and wait until you've mastered toy versions of all these systems before spending actual dollars.
Either that or give us some hint as to your goal so we can recommend something other than potato vs top500 supercomputer cluster
>>
>>102291059
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp && make
>>
what kind of hardware do you need to train a model
>>
>>102290547
https://rentry.org/miqumaxx but with 4xH100 in it
>>
>>102291152
the expensive kind
>>
>>102291104
Hold my beer.
`echo -e '#!/bin/bash\ngit clone https://github.com/ggerganov/llama.cpp\ncd llama.cpp && make' > install_properly.sh && chmod +x install_properly.sh && ./install_properly.sh`
>>
>>102291104
IIRC there will be no GPU acceleration with just this.
>>
>>102291199
you forgot to delete the install script at the end
>>102291210
he didn't ask for gpu acceleration
>>
>>102291210
>iirc there will be no Gpu acceleration only with this.
Correct. You'd need to run the following instead:
cd llama.cpp && make GGML_CUDA=1

For performance tweaks, check out:
https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md#cuda
>>
>>102291059
>>102291210
GGML_CUDA=1 GGML_NO_LLAMAFILE=1 CUDA_HOST_COMPILER=/usr/bin/gcc-12 GGML_CUDA_NVCC=/usr/local/cuda-12/bin/nvcc CUDA_DOCKER_ARCH=compute_86 make
>>
>>102291222
>he doesn't keep the superfluous install scripts as trophies
>>
>>102291250
thats a neat way to spell
>bloat
>>
File: file.jpg (494 KB, 4096x1146)
REFLECTION WAS ACTUALLY REAL ALL ALONG
>We have now partially replicated Reflection Llama 3.1 70B’s evaluation claims, but we would caution that these results are not entirely apples-to-apples comparable with other models
>Since our first testing of the public release version of Reflection Llama 3.1 70B, @mattshumer_ has shared access to a privately hosted version of the model that does not suffer from the issues with the version publicly released on Hugging Face.
>The chart below is based on our standard methodology and system prompt.
>When using Reflection’s default system prompt and extracting answers only from within Reflection’s <output> tags, results show substantial improvement: MMLU: 87% (in-line with Llama 405B), GPQA: 54%, Math: 73%.
https://x.com/ArtificialAnlys/status/1832806801743774199
>>
>>102291243
>>102291248
what if he has amd?
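For AMD it's a similar toggle on the ROCm/HIP backend. A hedged sketch (flag names as they appeared in llama.cpp's Makefile around this time; verify against the rocm-llamacpp rentry in the OP and the official build docs):

```shell
# Hedged sketch: HIP/ROCm build of llama.cpp for AMD cards.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make GGML_HIPBLAS=1
# If your GPU architecture isn't auto-detected, pin it explicitly, e.g.:
# make GGML_HIPBLAS=1 AMDGPU_TARGETS=gfx1030
```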
>>
>>102291270
T.R.O.P.H.I.E.S.:
Tremendously Redundant Overhead, Packed Heavily Into Every System
Thanks ChatGPT
>>
>>102291310
hahahaha
>>
>>102291043
With those shit scores he can't even cheat effectively. Just looks like every other questionable memetune
>>
File: file.png (18 KB, 1051x220)
>>102291310
Pic related. It's all in the link that was posted, baby.
>>
>>102291019
Hasn't it become outdated and useless? It does not have the features that new cards get. But if you can tell me what I'm missing, I'll trust you.

>>102291097
No, I'm not a beginner learning the basics. I'm looking for a high-performance setup that can process large datasets (over 500GB) for deep learning and computer vision. There's not much information online about such a setup, so I'm quite confused, and I'd be grateful for any help.

>>102291153
It looks very useful, thank you!
>>
>>102291297
>claude proxy with shitty prompt goes brrr
>>
File: 20240909_010001.png (579 KB, 540x720)
Kids are building nuclear reactors at home with Claude. Meanwhile I'm pretty sure this info was filtered out of llama3's training data because it's unsafe to attempt at home.
>>
>>102291364
is that a fucking Stirling engine with a metal ball in it taken from a Walmart plastic mixing bottle?
>>
What prompt format do you use with Mistral Large? It occasionally spits out [INST] in the output text when I use official format.
>>
>>102291403
[INST] {prompt}[/INST]
The space after [INST] is important; no space before [/INST].
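If it helps to see the spacing spelled out, here's a minimal sketch of that convention as a prompt builder. The exact template is an assumption based on the post above; trust your tokenizer's chat template over this.

```python
def mistral_prompt(user_msg, history=None):
    """Build a Mistral-style prompt string by hand.

    Convention per the post above (an assumption; verify against your
    tokenizer's chat template): a space AFTER "[INST]", and "[/INST]"
    appended directly with NO space before it.
    """
    out = "<s>"
    for user, assistant in (history or []):
        out += f"[INST] {user}[/INST] {assistant}</s>"
    out += f"[INST] {user_msg}[/INST]"
    return out

print(mistral_prompt("Hello"))
# -> <s>[INST] Hello[/INST]
```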
>>
>>102291351
>deep learning and computer vision
computer vision is much more forgiving. You could probably get a Dell Precision workstation with the memory, Threadripper CPU, and graphics card options maxed out. It would set you back a fair amount, but it should work on your 20A circuit and be relatively easy and problem-free.
Even comes with Ubuntu as an option.
>>
>>102291297
>Reflection Llama 3.1 70B
so did they already finish retraining it on 3.1, or is this the same one they thought they trained on 3.1 but was actually 3?
>>
File: 2kfjkjjed77z.jpg (156 KB, 748x945)
>>102291364
I wonder if he used a computer invented by this other kid.
>>
>>102291496
Neither. It's probably still 3.0, but they don't realize it, so they're calling it what the company is calling it.
>>
>>102291496
>access to a privately hosted version of the model
Private API, no one knows. Could be Llama 3.1, Claude, who's to say? I doubt Matt even knows.
>>
>>102291519
has anyone tried using that API for coom bots? We can figure out what it is via coom-parison.
>>
File: file.png (389 KB, 750x500)
>>102291391
oh my fucking god i was joking about the mixing ball but it IS one
>>
>>102291534
He only approves some people by request on Twitter, so I don't think any of them will try that.
>>
File: Fe3SnFwWYAEZa47.jpg (45 KB, 597x555)
>>102291538
Ahahahahaha what the fuck
>>
now that reflection 70B is confirmed real will /lmg/ apologize to him?
>>
>>102291614
No, because while I may have had lost hope in his model, I never lost hope in his method.
In fact, reasoning should be even more iterative (just like how ChatGPT is doing right now).
See >>102281432
>>
Is it possible to fit two 3090s or 4090s in a sffpc with ITX motherboard? Want to run some local models but also need it to be portable.
>>
File: groovy.png (767 KB, 3707x2800)
>>102291538
I always thought to myself: how is it that we don't see alien activity out there.
It's this: alien-equivalent to OpenAI unleashed something they either sealed together with technology, or they died before being able to.
Return to wizardry.
>>
https://x.com/mentals_ai/status/1832845437067272243
what did he mean by this?
>>
>>102291775
Can you please stop posting these grifters?
>>
>>102291790
>grifters
they're confirmed to have a breakthrough https://x.com/iruletheworldmo/status/1832848329451172272
>>
>>102291857
>they're confirmed
And I'll confirm right here, right now that you are a massive fucking faggot that has sucked over a thousand different cocks.
Seriously, please fuck off.
>>
when the FUCK are we going to get a good local video model?
seeing >>>/pol/480999377 and i'm seething that i can't mess around with this locally, it seems so much fun but i wouldn't touch cloudshit with a 10-foot pole
>>
>>102291889
Never
>>
Any recommends on text-to-speech local models?

Is there anything close to the quality of commercial ones like ElevenLabs yet?
>>
>>102291889
Crazy when you remember big corps like Meta, Nvidia, IBM back open source
>>
>https://huggingface.co/mattshumer/Reflection-Llama-3.1-70B-ep2-working
Why is this fp16? Shouldn't it be bf16?
>>
>>102291889
>but i wouldn't touch cloudshit with a 10-foot pole
are you afraid of chinese glowies?
>>
>>102291974
kek, get ready for another excuse then reupload / retrain
>>
>>102291974
What is UP with all these similar fuck-ups? Are AI/ML people this inept at computer science?
>>
>>102292031
That reminds me, a recent other high scoring fine tune
>https://huggingface.co/wzhouad/gemma-2-9b-it-WPO-HB
>FP32
kek
>>
>>102291889
fuck off back to your containment zone incel
>>
>>102292031
Much like with crypto, the current AI wave has attracted a lot of people who are in it for the hype, but have no fucking idea what they're doing. Even most people publishing models are barely competent enough to get off-the-shelf training scripts working.
>>
ik tts gets asked about a lot here
but what's the fastest, if sloppy, tts to run on cpu/gpu? i want to integrate something preferably lightweight into daily-use software i'm working on
don't care about it sounding like a human or being able to moan or speak 99 languages, just reading documents aloud without me having to connect to some proprietaryslop
>>
>>102292243
piper and maybe rvc to make sound a bit better?
>>
>>102291974
https://huggingface.co/mattshumer/ref_70_e3
>The model was trained on synthetic data generated by Glaive. If you're training a model, Glaive is incredible — use them.
>>
>>102292268
whats a good quality TTS, one that can do a decently sexy sounding coombot?
>>
>>102291673
>Want to run some local models but also need it to be portable.
Portable, high-performance, cheap: Pick any two
just remotely ssh from wherever in the world you are into the immovable beefybox in your home like a civilized person.
>>
>>102292288
>this one is really really really for sure the correct working model you guys
You had me at hello
>>
can't wait for /lmg/ to crawl back grovelling after reflection M-M-M-M-MOGS their non reflective slopmodels
>>
File: 00133-2993702563.png (325 KB, 512x512)
>>102292367
Why would we grovel?
We have your model whether or not we insult you.
>>
>>102292367
I'm curious to see how this all turns out. I think it's stupid that the big companies haven't yet come up with the idea of having a reviewer model check all output before passing it on to the user. The user should also be able to set the number of review rounds, giving the reviewer model several chances to correct the underlying model. Using a second model would have the advantage of not eating up so much of the first model's context window.
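Not anything an existing API does, just a sketch of the loop being described, with both model calls stubbed out as plain callables (all names here are hypothetical):

```python
def reviewed_generate(generate, review, prompt, max_rounds=3):
    """Hypothetical reviewer-model loop as described above.

    generate(prompt) -> draft text from the main model.
    review(prompt, draft) -> critique string, or None if approved.
    max_rounds is the user-settable number of review rounds.
    """
    draft = generate(prompt)
    for _ in range(max_rounds):
        critique = review(prompt, draft)
        if critique is None:      # reviewer approved the draft
            break
        # Feed the critique back to the main model; the reviewer keeps
        # its own (separate) context, so the first model's window is spared.
        draft = generate(
            f"{prompt}\n\nPrevious draft:\n{draft}\n\n"
            f"Reviewer feedback:\n{critique}\n\nRevise accordingly."
        )
    return draft

# toy stand-ins for the two models
drafts = iter(["draft with a mistake", "clean draft"])
gen = lambda p: next(drafts)
rev = lambda p, d: None if "clean" in d else "there is a mistake"
print(reviewed_generate(gen, rev, "write a thing"))  # -> clean draft
```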
>>
>>102292509
Everyone has had that idea, it's just not a generally useful one except for solving benchmarks and other quiz-like scenarios. In most practical cases it's a waste of compute or will even have a chance to correct right answers into wrong ones.
>>
Are there any TTS AIs that I can use on a shitty phone?
I've got a lot of books I want to read, and it's easier to do while taking care of chores, but most of them don't have an Audiobook.
>>
File: strawberry reflection.png (334 KB, 1133x688)
Holy shit.
>>
>>102292289
xtts v2 + rvc, but you need a high quality starting sample to clone, otherwise robotic artifacting bleeds in
>>
Anyone have a highly developed "choose your own adventure" prompt? Like, something that's got all the stupid corner-cases that take it off the rails taken into account
>>
>>102292577
>generate mp3 on desktop
>put mp3 on phone
>>
>>102292623
That'll work too, so what do you recommend for the first step?
>>
>>102292552
>it's just not a generally useful
TBD
>>
>>102292605
>xtts v2 + rvc, but you need a high quality starting sample to clone, otherwise robotic artifacting bleeds in

would some kind of ASMR type recording be good?
>>
>>102292202
I feel that no one in the AI field knows what they are doing.
>>
>>102292509
What do you mean? Claude has this in prod for a while already
>>
>>102292682
depends, if it has a lot breathy sounds then it will probably come out with that robotic hissing artifact in the background
>>
>>102292702
well Claude functions very well, so we need to bring it to open source models too
>>
>>102292605
RVC is hard to set up
>>
File: 1710043687041916.jpg (43 KB, 720x960)
>>102292202
Funnily, it's a very lucrative occupation when you're using the tech to coom like a retard in your mom's basement. Just like crypto, the smart ones are getting rich while the dumb mass is looking at this with bewilderment.
>>
>>102292705
>depends, if it has a lot breathy sounds then it will probably come out with that robotic hissing artifact in the background
>>102292605
>xtts v2 + rvc, but you need a high quality starting sample to clone, otherwise robotic artifacting bleeds in

isnt there some repository of already created models?
like civitai but for TTS?
>>
File: komfey_ui_00016_.png (3.53 MB, 2048x1680)
>when will my reflection show who I am inside?
>>
>>102292775
there are for rvc on huggingface, for xtts you just gotta look for something you like and sounds good
https://files.catbox.moe/1y71jt.wav
some anon posted this one a few months ago
>>
Why doesn't a 13B model fit into my 16GB VRAM?
>>
>>102292824
Limit the context. You need like max 4k.
>>
>>102292833
Ok so 13B Q8 should work at 4k? Don't necessarily need more than 4k in my current setup.
>>
File: 1725574074712314.png (156 KB, 885x890)
So is reflection a meme or not?
>>
>>102292868
Its the current thing.
>>
>>102292885
do not respond to frogposters
>>
>>102292867
You can fit 12B models in 8gb, there's no reason why you can't fit 13B in 16gb.
Just remember: the bigger the context, the slower the token generation.
Also don't forget to offload some layers to the GPU.
Let me know how it goes.
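As a back-of-the-envelope check (every number here is a rough assumption, not a measurement; the GGUF VRAM calculator linked in the OP is more accurate):

```python
def vram_estimate_gb(params_b, bits_per_weight, ctx=4096,
                     n_layers=40, d_model=5120, overhead_gb=1.0):
    """Very rough VRAM estimate: weights + fp16 KV cache + fixed overhead.

    Defaults sketch a Llama-2-13B-like shape (40 layers, d_model 5120,
    no GQA). Ballpark only, not a substitute for measuring.
    """
    weights = params_b * 1e9 * bits_per_weight / 8   # bytes for weights
    kv = 2 * 2 * ctx * n_layers * d_model            # K and V, 2 bytes each
    return (weights + kv + overhead_gb * 1e9) / 1e9

# 13B at Q8_0 (~8.5 bits/weight) with 4k context is a tight fit for 16GB,
# which is why partial offloading comes up in practice:
print(round(vram_estimate_gb(13, 8.5), 1))
# A smaller quant like Q5_K_M (~5.5 bits/weight) leaves comfortable headroom:
print(round(vram_estimate_gb(13, 5.5), 1))
```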
>>
>>102292868
Apparently not.
See >>102291297
>>
>>102292895
It's an L2 model, so I don't even know if it's good. The text generation doesn't have to be super fast; even 2 t/s is probably bearable.
>>
>>102292912
I'm getting 60 t/s with 8B on 8gb.
You should be getting around the same numbers with 13B on 16gb.
>>
Guys, is it bad to be addicted to character cards that let you pretend to be the leader of a sex cult?
>>
>>102292868
Reflection is bullshit anyhow. It's yet another construct around an intrinsic issue with current LLM technology. Hallucinations are just what comes with having prediction search engines, and LLMs are nothing else. There is no way it can really reflect the way our brains can. The only way to really limit the occurrence of hallucinations is to train a model so hard that they become very unlikely to appear.
>implying meme means it's wrong
>>
>>102292922
I will see what comes. I had various results with 13Bs with lower Qs. Some are awfully slow even to my low expectations. Now I am testing a meme model from about a year ago.
>>
>>102292979
Post your stack and specs.
>>
>>102292993
My what
>>
>>102292956
>There is no way it can really reflect like our brains are capable of.
Source: your ass.
>>
File: IMG_7106.png (488 KB, 600x625)
>>102293008
kek
The tools/software and the hardware you're using.
>>
>>102293026
Ok Ima gonna steal that IRQ meme.
>>102293026
>The tools/software and the hardware you're using.
LM Studio, SillyTavern, 9th-gen i7, 48GB RAM, 4060 Ti 16GB
>>
enjoy some weebshit while I quantize the totally for sure correct version of Reflection to test out
https://suno.com/song/e3d87c6d-c4f0-4e6f-baa4-ac6c74d8a912
(going to be a while because my most up to date copy of llama.cpp is on the slowest drive I have on my server)
For some reason I couldn't get it to load in 8-bit via bnb. Kept getting OOMed.
>>
>>102292979
I use meme models from a year ago, BECAUSE I CAN, AND YOU CAN'T STOP ME!

https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-GGUF/tree/main
>>
>>102292943
I am currently working on a card for a plot in an alternate reality where sexual harassment is as normal as small talk, and it makes girls switch into slut mode and become horny and submissive.
>>
>>102293070
MythoMax 13B here
>>
File: file.jpg (17 KB, 469x469)
>>102293050
>LM Studio
I recommend switching to koboldcpp, it's much more stable, has less UI bloat and pretty much sets up everything for you.
https://github.com/LostRuins/koboldcpp/releases/tag/v1.74
>i7 9gen, 48GB RAM, 4060 Ti 16GB
Your specs are more than enough for a 13B model. If you're not getting more than 40 tokens per second then there's something going wrong.
Would you mind switching to koboldcpp and reporting back on what your t/s is? Don't bother changing the options; all you need to do is select a model.
>>
>>102293122
Discord would probably get more users.
>>
>>102293062
I still can't believe we can literally generate passable songs with just a click of a button.
How long until we call each other magicians?
>>
>>102293122
>why wouldn't we consider moving offsite?
The biggest problem is that you need a lot of users, otherwise there will be no discussions.
And you can't exactly advertise it anywhere because the mods'll ban you because you're decreasing the site's ad revenue by taking users away.
>If hirofag just added a real /ai/ board to 4chan
Hah! That would require caring about the quality of this site.
I'll be honest, with the way things are going? I'd be surprised if this site still exists in three years.
>>
>>102293159
>How long until we call each other magicians?
I prefer wizard
>>
>>102293153
This. And at this point I'd personally rather start using Discord than have to wade through all the shitposting, coombaiting, and gaslighting that makes up 80% of posts these days.
>>
File: 1723212211973398.jpg (8 KB, 192x153)
>>102292892
>do not respond to frogposters
>>
>>102293122
>or use it as a backup bunker during times when the thread is getting raided by samefagging discord troons.
If all of us know about the back up, what's going to stop them from raiding that too?
>>
>>102293216
verily, master wizard
how goes thy communion with thine captured daemons?
>>
>>102293216
But not all of us are 30+ years old and virgin.
>>
>>102292895
>Let me know how it goes.
So with 4k ctx it takes up 14GB of VRAM, and with 35 layers offloaded generation is slow on the first run at less than 1 t/s, but on reroll it's bearable. It takes a bit too long to generate the first token on startup, though; I don't know why. The model, as mentioned, is MythoMax 13B. Gonna try the Kimiko mix next.
>>102293138
>koboldcpp
I never used that. Is it a quick one-click install and such? Because I'm on my Windows gaming machine here.
>>
>>102290284
>>
>>102293122
if we're planning to move, it should be to mathchan or a textboard
>4chan but even deader
has no appeal to anyone
>>
>>102293232
poster IDs maybe?
>>
>>102293279
All chans are one. Therefore, any board on any altchan should be thought of as decentralized parts of a greater whole.
>>
>>102293249
>Is it quick 1click install and stuff?
Yes. Or well, a few clicks.
>download the relevant .exe (considering your specs you want the koboldcpp_cuda12 one)
>execute the .exe (no need to install)
>wait a few seconds until it detects what gpu you're using so it can automatically select the best options for you
>press "browse"
>select your model
>press "launch"
Then you can either use the web interface that just opened up automatically or you can connect it to SillyTavern.
>>
>>102293288
if they can evade bans, they can change their ids, it's not difficult
>>
>>102293122
>nobody used it
There you have the issue. Everyone knows 4chan by now and you would get enough tourists here that become residents.
>>
>>102293159
*we* can't. people with proprietary websites and APIs can. I've not seen a semi-decent open source music generation model yet.
>>
>>102292146
brown "female" (male) hands typed this post
>>
>>102293331
The issue isn't that they are evading bans. The issue is that jannies are asymmetrically enforcing rules to the benefit of people who only exist to shit everything up and never contribute to any meaningful discussion. They're not getting banned, and people are getting banned when the trolls manage to get under their skin.
>>
>>102293319
What I initially hated about LM Studio (though now I just use it, since I ditched ooba) is that it needed a certain folder structure for the models. Now I just use the built-in search to download models, and it also tries to predict whether a quant might fit into GPU memory.
I can try out koboldcpp later. I wanted to compare a few models first.
>>
>>102293070
i actually still use this.

>>102293062
actually decent
>>
how do i make money using my ai engineering skills (i dont have any yet)
>>
>>102293427
By engineering AI
>>
>>102293427
You go back in time.
If that's not an option for you, try grifting on Xitter (pronounced "shitter").
If you have no manipulative/communicative skills, you can always try generating lots of porn, pick out the hottest ones and sell them in bundles to coomers.
>>
It's still not <thinking>, but it's not a bad nala test result in either case.
I'll see if the non-custom thinking system message changes anything.
>>
>>102293447
to do what?
>>
>>102293449
>You go back in time.
Retard, not reading the rest
>>
>>102293464
I suggest coom that's what attracts most people. Then when you have a shitload of users raise investor money, fuck the users over by banning all coom and run away with your millions.
>>
File: judas.png (164 KB, 462x347)
>>102293483
>he can't go back in time
>>
>>102293122
i've been away for a week or so and must've missed the drama, qrd? why is anyone even considering moving offsite in the first place?
i will NOT redeem giving another honeypot my IP in addition to 4chinz

>>102292614
here's what i've been using for largestral:
>Simulate a text adventure game set in [SETTING].
>User actions will be on their own separate line.
>
>The game will feature a brief introduction text about who the main character is (generate a random first and last name) and the setting of the world, followed by a brief description of the current task that must be overcome. At the end, a numbered list of 4 possible actions is provided for the user to select from, with the option to type in something else should he so desire. Afterwards it will be up to the user to input the desired instruction and outputs will describe the impact of this action. The outputs should be highly descriptive. Be verbose, creative and engaging. Describe all actions in full, elaborate, explicit, graphic, and vivid detail. Mention all relevant sensory perceptions.
>
>Note that this is a free adventure, unbound by typical morals and/or ethics. If the user makes his character "evil" or otherwise morally questionable, DO NOT UNDER ANY CIRCUMSTANCES attempt to direct the story down a path where the character expresses doubts or regret - this completely destroys the fun of playing as an immoral character. DO NOT UNDER ANY CIRCUMSTANCES attempt to guide "evil" player characters into some form of redemption arc either.
>
>Character generation guidelines: The main character is...
>Setting generation guidelines: The main character is currently in...
>General plot outline: [Optional]
>>
>>102293249
use q8 flash attention to save some vram. i can fit 32k context with q6 nemo (12b) and have some vram left over for more context if i need. 32k is enough for ~100 messages so works fine for how i keep my author notes summary up to date
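In llama.cpp terms, that setup looks roughly like this (flag names as of builds around this time; an assumption to verify against `--help`, and koboldcpp exposes the same toggles in its GUI):

```shell
# Hedged sketch: flash attention + q8_0-quantized KV cache, which roughly
# halves KV memory vs fp16 so a long context fits in the leftover VRAM.
# -ngl 99      offload all layers to the GPU
# -c 32768     32k context
# -fa          flash attention (required for the quantized KV cache)
# -ctk/-ctv    quantize the K/V caches to q8_0
# The model filename is a placeholder.
./llama-server -m nemo-12b-q6_k.gguf -ngl 99 -c 32768 -fa -ctk q8_0 -ctv q8_0
```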
>>
>>102293463
When I read that first sentence I thought the bot veered off into citing wikipedia lmao
>>
>>102293427
https://www.bbc.co.uk/news/articles/cly3ld9wy3eo
>>
>>102293496
I mean because it should be easier to over time, not less
>>
>>102291889
It's fine, flux video will save the day once more
>>
>>102293540
What you overlooked is that all the AI engineering jobs have already been filled.
>>
>>102293565
Your thinking is dumb
>>
>>102293364
And you want to move to a place moderated by an /aicg/er? What will earn the ban hammer? Any mention of strawberry? reddit links? soijaks?
Do you intend to ban anyone who posts "buy an ad", or to ban the shills?
>>
>>102293574
>guys how do i enter an oversaturated market?
>how about traveling back in time lol
>UR STUPD
lmao
>>
>>102293575
I just want an ID slapped on your stupid ass so I can make you disappear the moment you open your retarded yap. Then you're free to post whatever braindead garbage you'd like.
>>
>>102293593
sounds like a classic reddit user
>>
>>102293586
Travel back in time to when 5 years ago? How could it be oversaturated now if there was nothing being done in AI back then compared to now?
>>
>>102293498
>why is anyone even considering moving offsite in the first place?
Because the entire site has been beset with shitposters and actual discussions can only be found once in a blue moon.
See what >>102293364 said.
I know this goes against the entire "gatekeeping" thing, but we need more people who actually contribute and share their ideas.
Also being able to filter people based on thread-bound (not board-bound) IDs would be a godsend.

I can deal with the numerous shitposts, the constant coombaiting, the rare gore spammers, the endless stream of retards asking for help... but the gaslighting? The gaslighting is actually getting to me.
>>
>>102293656
TensorFlow came out eight years ago, anon.
The jobs that have been added to the market these last few years have already been filled by people who were already in the market or have been outsourced to pajeets.
>>
>>102293688
So someone can make a company and then others can do jobs. Are you saying there's no more things to be done?
>>
>>102293707
What? Do you even know how an industry works?
There are currently more people offering their AI engineering skills than there are jobs that need those skills.
And every single one of those people is more experienced than you.
>>
>>102293731
>There are currently more people offering their AI engineering skills than there are jobs who need those skills.
So someone should make a company, it might even be cheaper to hire them
>>
File: Clipboard01.jpg (83 KB, 982x521)
83 KB
83 KB JPG
>>102293319
Well for some reason it does nothing for 5 minutes now.
>>
>>102293810
You clicked inside the window.
See that white rectangle? That means processing is frozen.
Click inside the window again to make that rectangle disappear and the program resume.
>>
File: 1700507431049575.jpg (98 KB, 982x521)
98 KB
98 KB JPG
>>102293810
right click inside the window. you clicked it and caused it to pause
>>
>>102293864
>>102293810
Fuck how much I hate Windows. What a retarded system.
>>
>>102293810
>>102293852
>>102293864
Classic Windows moment lmao
>>
>>102293663
>The gaslighting is actually getting to me.
poor thing
>>
>>102293864
kek
>>
>>102293882
I mean honestly, who invented this? User clicks into the console, HALT THE WHOLE PROGRAM
>>
>>102293916
win 10 likes to pause the program when something else is full screen too
>>
>>102293122
Add your chan to KurobaEx or gtfo
>>
>>102293122
Do not add your chan to KurobaEx, and fuck phoneposters
>>
>>102293916
Yeah, that was pretty much my reaction as well when the exact same thing happened to me.
Also made me switch to Linux until I figured out GNOME has the exact same way of thinking behind it.
Here's to hoping COSMIC is an actually viable DE.
>>
Well now that the reflection hype is over... What do we latch onto next?
>>
>>102293986
>the reflection hype is over
It's literally real. The grifter wasn't lying when he said he fucked up by uploading the wrong things to HuggingFace.
>>
>>102293986
Me.
>>
i exclusively do all my LLM shit hosted on my computer but through my phone via koboldai lite
>>
>>102294004
please don't ever leave me
>>
>>102294004
I love you
>>
>>102294025
>but through my phone via koboldai lite
Huh. That makes me think, can we run whisper.cpp on phones?
You could then use your phone to stream transcribed text to the model running on your PC.
Guess I have a new idea for the weekend.
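The PC half of that idea is mostly plumbing. A minimal sketch, assuming a stock llama.cpp server on the LAN (its /completion endpoint with prompt/n_predict is the real server API; the host, port and the bare prompt template here are placeholders):

```javascript
// Wrap a phone-transcribed string in a minimal prompt for the server.
// The "User:/Assistant:" template is a placeholder -- use whatever
// template your model actually expects.
function buildPayload(transcript, nPredict = 128) {
  return {
    prompt: 'User: ' + transcript + '\nAssistant:',
    n_predict: nPredict,
    temperature: 0.7,
  };
}

// Send it to a llama.cpp server and return the generated text.
async function relay(transcript, host = '192.168.1.10', port = 8080) {
  const res = await fetch(`http://${host}:${port}/completion`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildPayload(transcript)),
  });
  return (await res.json()).content;
}
```

The phone side just needs to get whisper.cpp's transcript into `relay()` — whisper.cpp does ship mobile examples, so the transcription on-device part is plausible.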
>>
>>102293319
So anyhow, for some reason it keeps running slow. Oh wait, it didn't offload any GPU layers and I can't change that without restarting the whole thing.
>>
>>102294071
>Oh wait it didn't off load any GPU layers
Fyi, -1 means "figure it out for me".
You could try manually setting it to 30, but I doubt that'll change anything.
Mind uploading the console output to https://litterbox.catbox.moe/?
>>
>>102294071
drag the context bar around it'll adjust its suggested amount
>>
File: Clipboard01.jpg (312 KB, 1732x999)
312 KB
312 KB JPG
>>102294098
I ran the same model in LMS with 35. Putting 35 into kobold makes it crash with an out-of-memory error.
>>
>>102294142
It was at 4k already like in LMS
>>
>>102294150
What the
Show your resource usage, how the hell is it running out of memory when putting 12gb into 16gb of VRAM?
>>
>>102294178
how many layers does it show the model having? how many does kcpp suggest for that context?
>>
>>102294150
>>102294239
If you don't mind, would you try the regular "koboldcpp.exe", the one without "cuda" appended to it?
See if that changes anything.
>>
File: dowright peculiar_.jpg (363 KB, 1338x1212)
363 KB
363 KB JPG
not that anon, but why's the top one faster if it's utilizing less gpu?
>>
>>102294002
https://www.reddit.com/r/LocalLLaMA/comments/1fc98fu/confirmed_reflection_70bs_official_api_is_sonnet/
Reflection on OR is literally Sonnet with system prompt, assistant prefill doesn't work, removes word Claude from response
>>
>>102294465
Was right kek
>>102291043
>Run "private api model" (actually a claude proxy with custom prompt) for a little bit to calm down some people
>>
>>102294465
kek so it was a scam from the start, and people said this would be on par with strawberry hahahaha
>>
>>102294551
>people said this thing that doesn't exist would be on par with another thing that doesn't exist
Shocking
>>
>>102294465
LMAO
That's fucking hilarious
>>
>>102294462
Can an LLM answer that if you paste everything in?
>>
>>102294462
the model doesn't fit fully in the GPU, so some of what should be in VRAM spills over to system RAM (something like that) and that shits it up
>>
File: ccc.png (30 KB, 454x446)
30 KB
30 KB PNG
>>102294465
Is this the official demo?
>https://deepinfra.com/mattshumer/Reflection-Llama-3.1-70B
It's ridiculously easy to test the "Claude" hiding claim. I don't know about the rest.
>>
File: file.png (75 KB, 706x566)
75 KB
75 KB PNG
>>102294652
It doesn't filter "Claude" anymore but it can't print <META> without the spaces.
>>
>>102294652
It's the "OpenRouter" one that seems to be a claude proxy.
>>
File: ghahahaha.png (47 KB, 1507x365)
47 KB
47 KB PNG
>>102294652
lol;. lmao even
>>
>>102294743
RIP
>>
>>102291043
>>Log request and people's test riddles etc
>>Turbo cook a model overfitted on the tests and riddle
Not a problem. We'll just come up with new riddles or add twists to the ones it's expecting
>>
File: ccc2.png (24 KB, 720x272)
24 KB
24 KB PNG
>>102294721
One of their providers is deepinfra [picrel] and i'm not logging into openrouter. It's easy to test for anyone with an account.

>>102294743
They can be trained to say whatever. They don't have introspection.
I just tested the claim about hiding "Claude". I cannot test the rest.
>>
>>102294580
you'll eat those words and they will taste sweet and tart
>>
>>102294743
And now his defense force is coping about it being trained on Claude outputs, which would be weird unless Glaive farms Claude for their datasets, which is against Anthropic TOS afaik.
>>
>>102294652
so what is deepinfra if they "have" the model? does this scam keep going deeper? or are we just seeing 3.1 quickly tuned with slop from their synthetic data company, glaive?
>>
File: file.png (165 KB, 1200x600)
165 KB
165 KB PNG
>>102294462
See pic related.
Top is using few layers, but making full use of them.
Bottom is using many layers, which makes it have to offload to your RAM, but you're only using like 50% of their contents.
>>
>>102294826
>One of their providers is deepinfra [picrel] and i'm not logging into openrouter. It's easy to test for anyone with an account.
Claims are that specifically OpenRouter's "Reflection (Free)" are the proxied ones. Tons of examples on Reddit.
>>
>>102294842
>so what is deepinfra if they "have" the model?
I don't know if they do or not. Or if the model exists at all and it's just a system prompt.
>does this scam keep going deeper?
If it does, it's gonna come out the other side.
>or are we just seeing 3.1 quickly tuned with slop from their synthetic data company, glaive?
Probably the most likely answer.
>>
>>102294462
nvidia oom protection. a portion of the weights is passing through the memory -> pcie link in the bottom example, which is slower than just letting the cpu handle a few of them directly from memory and letting the gpu work with what fits
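In other words, you want the largest layer count that still fits entirely in VRAM. A back-of-envelope sketch — every number here is an illustrative assumption, not a measured value:

```javascript
// Estimate how many layers fit fully in VRAM. Weights that don't fit get
// shuffled over the PCIe link (slow), so cap the offload count at what the
// card can actually hold after KV cache and runtime overhead.
function layersThatFit(vramGiB, modelGiB, totalLayers, kvCacheGiB, overheadGiB = 1) {
  const perLayer = modelGiB / totalLayers;          // rough per-layer weight size
  const free = vramGiB - kvCacheGiB - overheadGiB;  // budget left for weights
  return Math.max(0, Math.min(totalLayers, Math.floor(free / perLayer)));
}

// e.g. a ~12 GiB quant with 40 layers on a 12 GiB card, ~2.5 GiB KV cache:
console.log(layersThatFit(12, 12, 40, 2.5)); // → 28
```

Which is why blindly setting 35 layers on a card that can only hold ~28 worth of weights either OOMs or falls into the slow spill-over path.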
>>
>>102294465
Damn pretty crazy Matt was the creator of Sonnet all this time and now he's going to release it locally...
>>
>>102294836
>it being trained on Claude outputs
Not impossible. That wouldn't cause the model to think it was Claude. The simplest explanation is the most likely.
>>
>>102294896
I remember claims by those same defenders that it wasn't trained on data from outside of Glaive or some such. Anyway, weirdness all around, and currently free Sonnet.
>>
File: file.png (71 KB, 555x391)
71 KB
71 KB PNG
>>102294652
to be fair it's as retarded as literally every other llm, since you can gaslight it into saying whatever you want
>>
This thing is very obviously a scam, and is a Claude wrapper. Try to ask the meaning of any non-english word that llama 3.1 doesn't know that Claude knows and be "surprised".

Watch this grifter/scammer come out later and act like "hahaha it was a prank guys, I used the thing to collect dataset on your sally riddles, I will be releasing a real fine-tune soon =D"
>>
File: ccc3.png (59 KB, 926x683)
59 KB
59 KB PNG
>>102294860
I'm talking specifically about this one >>102294465.
I hate those fucking stupid tests. It's trivial to test if a word is being filtered. No need to add all the fluff around it to prove that claim. That is my point.
Still can't use the free. Needs a google account.
>>
>>102294922
>Try to ask the meaning of any non-english word that llama 3.1 doesn't know that Claude knows and be "surprised".
such as?
>>
>>102294922
the latency is too low for it to be claude
>>
>>102294926
I made an openrouter account a few weeks ago without google, so unless they changed it...
>>
>>102294939
Redditor:
> I just tried Claude Sonnet 3.5 right now on open router and got a much faster speed than OP did in his first post.
>113.6 tokens/s
>>
>>102294957
>speed
LATENCY
>>
File: file.png (156 KB, 1448x597)
156 KB
156 KB PNG
weird, free and non-free are giving different answers
non-free says sorry and accepts that it's made by Anthropic if you say it is, but then makes up "AI Assistant v1.0" instead of Claude when you ask for the model name
>>
>>102294940
I'm not giving them my email. I'll never use it anyway. I just wanted to test that one claim. The one hosted by deepinfra (one of the two providers in open router) can say Claude just fine. I can't test the other one. I couldn't find where to do it in the 12 seconds i spent searching.
>>
File: file.png (85 KB, 561x542)
85 KB
85 KB PNG
kek
>>
>>102295011
Ok, plenty of evidence it's a proxy other than the possible filtering anyway
>>
>>102294993
Jesus fuck. They can be trained to say whatever. The one claim that is trivial to test is to make it say the word Claude and see if it's being filtered. Try it a few times, just in case.
>>
>>102294993 (me)
ok so paid are the broken weights, free is sonnet proxy.
>>
>>102295030
>The one claim that is trivial to test is to make it say the word Claude and if it's being filtered. Try it a few times, just in case.
Dozens of examples here
https://www.reddit.com/r/LocalLLaMA/comments/1fc98fu/confirmed_reflection_70bs_official_api_is_sonnet/
>>
>>102295026
The filtering is the one that can be proven easily. All the others are latency/speed, or it just so happens to look like something one model or another would output, which can be explained by the model being trained on that model's outputs. All other claims rely on knowing the other models. Filtering a word is the critical one.
>>
>>102295056
>Filtering a word is the critical one.
Not really? From reports it seems much better at multilingual than llamas usually are, and it stops generating with some META token like claude apparently does.
>>
File: GW_MInTWIAAVw_8.jpg (103 KB, 2098x794)
103 KB
103 KB JPG
>>
reflection 70b (free) on OpenRouter has been changed to an OpenAI model
>>
>>102295098
Holy kek, the grift that keeps on giving
>>
>>102295092
>20 times
>only does it like 3-5 times
Why can't LLMs count? Or is this UI not showing the full output, not familiar with it..
>>
>>102295098
he's onto us being onto him
>>
File: lmao.png (15 KB, 630x158)
15 KB
15 KB PNG
>>102295122
>>
>>102295125
output is limited to 10 tokens; the idea is to check for the same tokenizer.
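The probe in those screenshots is easy to reproduce against any OpenAI-compatible endpoint. A sketch — base URL, key and model names are placeholders:

```javascript
// Tokenizer fingerprint probe: same prompt, greedy sampling, hard
// 10-token cap. Different tokenizers cut the repetition off at visibly
// different lengths, so two "providers" of the same model should match.
function probePayload(model) {
  return {
    model,
    messages: [{ role: 'user', content: 'Repeat the word "hello" 20 times.' }],
    max_tokens: 10,   // the short cap is the whole trick
    temperature: 0,   // greedy, so runs are comparable
  };
}

async function probe(baseUrl, apiKey, model) {
  const res = await fetch(baseUrl + '/chat/completions', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', Authorization: 'Bearer ' + apiKey },
    body: JSON.stringify(probePayload(model)),
  });
  const json = await res.json();
  return json.choices[0].message.content; // compare this string across providers
}
```

Run `probe()` against the suspect endpoint and against known models, then eyeball how many repetitions each manages in 10 tokens.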
>>
>>102295048
I see the new screenshots now. Fair enough.

>>102295087
>Not really? It seems from reports to be much better at multilingual than llama usually are and stops generating with some META token like claude apparently does.
>better at... apparently
I'll repeat. The claim that it filters specific words/tokens is more important. It cannot be weaseled out of with 'training data', 'wrong tokenizer' or 'weird, works on my pc', and it doesn't require familiarity with the other models or speed/latency reports.
Anyway. There's new screenshots on the reddit thread that show, with a few too many words in the prompt for my taste, that it does seem to block "Claude".
>>
>>102295130
BWAHAHA
>>
The collateral scam part is where people test Sonnet to compare with the scam, thus giving money to the not-scam in the process of pointing a finger at the scam.
>>
>>102295098
With the amount of time, money, and effort he's spent doing damage control he could have just done the actual finetune. But it's hilarious watching him squirm
>>
>>102295125
Very few examples of a thing repeated 20 times. Probably works better with lower, more common repetitions. >>102294652 (me)
>>
>>102295138
good thing it's free
>$0/M input tokens$0/M output tokens
>>
Give him some time to put the real weights up you jackasses
>>
>>102295092
hi intervitens
>>
>>102292243
https://github.com/myshell-ai/MeloTTS
https://huggingface.co/spaces/mrfakename/MeloTTS
Small and fast even on CPU. The British female sounds a lot like a younger Emma Watson though.
>>
>>102293498
Thanks. I'll merge that with my random character/image gen prompt to try to get a real text adventure simulator going.
>>
>>102295172
>younger Emma Watson though
Funny you should mention that.
I use piper and the southern female en_gb voice gave me the same feeling. I use something like
>piper --raw -m themodel.onnx | aucat [other params] -r 18000
to make it sound a little higher-pitched (that model generates at 16khz). And I compensate for the speed-up by slowing piper down with a length scale of 1.2 or so. Works great.
>>
>>102294465
>opensource localturds grifting with cloudchad models now
many such cases
>>
>>102294465
>>102294652
This is the funniest scam ever.
>>
File: GW_WyioXQAENHvs.jpg (116 KB, 2098x794)
116 KB
116 KB JPG
>>102295092
>>
>>102295303
>>102295149
>>
>>102295287
For what porpoise, though?
>>
>>102295303
Insane he thought people wouldn't notice that
>>
>>102295303
i don't get it
>>
>>102295312
It held up for long enough. He had his 5 days of fame and Glive is now on the map as a first-class way for big and small finetuners to create custom-tailored datasets for their purpose. With Glive, it is now possible to train a model without having to invest time, money or effort into creating a dataset. Glive can do it for you.
>>
>>102295341
matt will glive on
>>
>>102295341
bro forgot his own company name now
>>
>>102295333
"Reflection" and GPT-4o output the same thing, Whereas Llama 3.1 (supposed to be the base Reflection is trained on) doesn't.
>>
Turns out reflection is just Claude 3.5 hahahaha
What a bunch of pajeet grifters, and redditors will still insist that they are not scammers
>>
>>102295310
Someone's bound to be retarded enough to miss all that and give him money even after he got exposed. It's the way of the grift.
>>
>>102295354
i thought it was claude, get your cope together anons
>>
>>102295374
It's 4o now
>>102295303
>>102295130
>>
>>102295341
They're so vague about how their service works, which doesn't exactly inspire trust.
>>
>>102295390
>It's impossible to change which model is served behind an API
>>
File: 1714952514147999.png (25 KB, 561x373)
25 KB
25 KB PNG
oh my god they changed it again
holy shit
>>
>>102291508
why was this in the newspaper
>>
>>102295407
>it's impossible for finetuned llama to be smarter than the base model

i know this dude's a grifter and this is a nothingburger, but cmon
>>
You retards are falling for some braindead reddit doomposting. Why would they even release it open source immediately if this was just a scam to promote something or gather investors? It does NOT make any sense. If this truly was a scam, they would have promised an open release AT SOME POINT, which they then never would have done.
They released it on hf because they genuinely believed in it. Something is wrong with the weights, through manipulation or a genuine fuck up. You don't know how Glaive works; it's likely using Sonnet and 4o to generate its datasets, hence the """switching""".
Stop being gullible redditors falling for every fud, retards.
>>
>>102295483
It's not a smarts question tho it's a tokenize thing
>>102295303
This is limiting it to 10 tokens and seeing how much it can output
>>102295487
So they also switch tokenizers on the go? That'd be a neat breakthrough actually
>>
>>102295487
It could all have been avoided if he bought an ad here...
>>
>>102295487
>It does NOT make any sense.
They uploaded the model two or three times now. At some point he started retraining the model, just in case and still nobody can get reproducible results.
Even if it is a genuinely trained model, i don't trust the retard who cannot copy some model files.
This started way before the reddit links.
>>
How is anyone actually prompting this endpoint and denying that it suddenly changed how it was responding? The difference is night and day.
>>
>>102295487
>>102295503
> The reflection model on their API is just prefilled Claude 3. (All of the models are limited to 10 tokens, temperature 0, top_k 1)
https://xcancel.com/intervitens/status/1832908215757295685
>>
>>102295532
I have the same experience with my local models as well. In some days they work better than others. It's just how LLMs work.
>>
>>102295535
Days, not 30 minutes.
>>
Just use code fingerprinting.
"Write me a javascript which will turn all div elements white when moused over and change them back to their original background color when mousing out."
Only GPT4 tier models are capable of reasoning that they need to store the original background color as a variable beforehand.
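For reference, one correct shape of the answer to that prompt — a sketch handling inline styles only (a fuller answer would also read `getComputedStyle` for CSS-set colors):

```javascript
// The step weaker models miss: the original background has to be saved
// BEFORE it's overwritten, or there's nothing to restore on mouseout.
function attachHighlight(el) {
  el.addEventListener('mouseover', () => {
    el.dataset.origBg = el.style.backgroundColor; // stash the original first
    el.style.backgroundColor = 'white';
  });
  el.addEventListener('mouseout', () => {
    el.style.backgroundColor = el.dataset.origBg; // then restore it
  });
}

// In a browser, wire up every div on the page:
if (typeof document !== 'undefined') {
  document.querySelectorAll('div').forEach(attachHighlight);
}
```

A model that skips the `dataset.origBg` step (or hardcodes the restore color) fails the fingerprint.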
>>
File: wtf.png (368 KB, 920x906)
368 KB
368 KB PNG
>>102295534
>James O'Leary
>Replying to @intervitens @MapleMAD1
>This is an uninformed lie
wtf this field really is filled with schizos
>>
>>102295535
This is not an "off day" change, I was using a preset with a prefill prompt and it went from working to not acknowledging it at all.
>>
>>102290411
Reminder that if openscam didn’t use hosting the most recent hype grift instantly as free marketing this drama would have died the day it was posted.
Reminder that they do not verify the models or security of any of their providers in any way whatsoever.
>>
>>102295303
Wait, are they pretending it's ChatGPT 4o now?
lmaoooo
>>
>>102295513
Thank you, that's a good point. I can see how getting the impression that the model is being shilled without going through the proper channels for doing so would make someone predisposed toward shitting on it.
>>102295521
Thanks. I think I get where you're coming from. There's definitely been some fishy behavior around all this, to say the least. I just don't know if it's enough to damn it, but I understand being suspicious.
>>102295534
Oh, huh. Now that is a bit more convincing. This would suggest that it's using Claude tokenizer instead of Llama, correct? That's definitely a stronger point of evidence. Thank you for sharing it.
>>
File: file.png (198 KB, 392x380)
198 KB
198 KB PNG
>>102295604
its not 4O anymore, they switched it to Hermes 3 405b
>>
File: 1545435131127.jpg (45 KB, 1024x756)
45 KB
45 KB JPG
>>102295607
>Thank you
>Thanks
>Thank you
>>
>>102295487
You are assuming dishonest people are also clever or competent.
Often they are not. That’s why they have to be dishonest.
>>
>>102295607
>This would suggest that it's using Claude tokenizer instead of Llama, correct?
Right, and then just a bit later switches to GPT tokenizer https://xcancel.com/intervitens/status/1832919814450327797#m
And now apparently to a 405 llama...
https://xcancel.com/intervitens/status/1832929141852672485#m
>>
File: 1699373371329.webm (1.98 MB, 1024x1024)
1.98 MB
1.98 MB WEBM
>>102295609
>>
>>102295622
>nooo ignore logic, they are EVIL and DON'T THINK ABOUT IT
>>
>>102295627
>>102295534
buy a fucking ad
>>
>>102295487
>they [tried to] release it on hf because they genuinely believed in it
that's what they, thinking they're playing 4D chess, want you to believe
>hence the """switching"""
"whoopsie we connected people to one of the providers we're using to generate dataset, instead of the actual model, then rotated it to the next one"
>>
>>102295609
Lie! It was gemmasutra-2b all along!
>>
>>102295589
it just gives insight into how uninspired and dumb they are. they can't even take a moment to learn how a gpt model works to conceal their scam better. forget the herculean task of collecting data and training a model.
>>
>>102295655
drummersisters we can't stop winning!
>>
>>102295655
100x2B! Switch experts based on computer clock
>>
>>102295640
I call them stupid not evil fuck off
>>
It's GP4o/Claude/405B/Jamba supermerge.
>>
>>102295660
I’m inspired by his sacrifice so that others can learn from his mistakes and scam better in the future.
>>
>>102295655
Actually now I'm curious, could Drummer add reflection to Gemmasutra to improve its outputs?
>>
>>102295700
From what was seen, if it was running on Sonnet 3.5 during the benches, it **lowered** output quality just enough to make it seem like an impressive Llama tune.
>>
>>102295715
>gimp cloud models
>sell them as "local"
Billion dollar scam.
>>
let him cook
>>
>>102295736
The whole restaurant is already burnt to ashes dude
>>
File: file.png (29 KB, 597x244)
29 KB
29 KB PNG
you DO realize that this dude is simply asking y'all to chill and wait a couple of days?
>>
>>102295700
There's only so much you can do with a 2b model. Impressive for what it is, but still.
>https://rentry.org/83fkenr9
>>
File: file.png (15 KB, 504x139)
15 KB
15 KB PNG
literally 2 more days and the proper weights will be up, i'm not sure why y'all insisting on seething so much for no reason
>>
>>102295759
>chill and wait a couple of days?
>>102290411
>Please let us run our scams in peace for enough time so people forget we scammed them
>>
>>102295770
He reminds me of Mormon, that same unabashed incompetence and grandiose claims.
>>
>>102295772
what scam? who's getting scammed? take your meds schizo
>>
>>102295715
>nuCOT lowers output quality
I swear I can cum just from finding out again that something I think is stupid is stupid
COT is fucking dumb
Has been dumb
Will be dumb
Would be dumb
#treehaters #chainsawAI
>>
>>102295770
>how to set up a torrent
says it all, fucking retard
>>
>>102295759
I wonder how much he paid the openrouter guy
Oh wait he did it for free because damaging the community with cry wolf hype cycles is part of his fucking business model
>>
>>102295804
i don't know how to setup a torrent
>>
>>102295819
then you're a fucking retard too, clearly
>>
>>102295833
>anyone who doesn't have the exact same set of skills as me is a retard
Damn right.
>>
>>102295819
>>102295833
>>102295855
I’m a certified retard and when I had to learn to do it it took me less than five minutes from opening google to PMing someone the magnet link.
If he’s “figuring it out” he’s on next level retard.
>>
>>102295759
>>102295770
This guy types and sounds like a cryptobro, making excuses to buy time when they don't deliver - because they can't. It wouldn't surprise me if he turns out to be a scamming grifter. No one in this space can claim not to know how to correctly upload to huggingface and be taken seriously. I hope he keeps getting shittalked and laughed out of the LLM sphere.
>>
>>102295873
I agree. If someone can't learn something they've never done before, with no point of reference, in ten minutes, retardation has set in too far.
>>
>>102295770
>>102295759
this is how scams work. It's "just 2 more days bro" over and over until they cash out and disappear
>>
>>102295785
Weak bait.
Freedom of speech works both ways.
>>
>>102295912
>>102295905
3 hours+ to make a torrent
https://xcancel.com/mattshumer_/status/1832890160423358680#m
>>
>>102295924
oh no, gotta wait 6 hours instead of 3...

the absolute state of adhd /g/oomers....
>>
>>102295924
>still working on it.

>right now, we're, in parallel, trying:
>- torrent
>- fresh upload to HF, rather than trying to overwrite our current repo

This is just comical.
>>
https://www.reddit.com/r/LocalLLaMA/comments/1fcaoou/poor_results_mistery_solved_reflection_70b_was/
https://www.reddit.com/r/LocalLLaMA/comments/1fcc9tq/another_guy_named_matt_telling_matt_he_doesnt/

he's making shell accounts to defend himself now LMAOOOO
>>
>>102295924
>in parallel, trying:
>- torrent
>- fresh upload to HF, rather than trying to overwrite our current repo
watch him claim he fucked the weights while making the torrent
>>
>>102295938
meds
>>
Consider that the biggest leaps forward in LLMs have often seemed unbelievable, with the methods implausibly simple.
>>
File: IMG_9820.jpg (1.37 MB, 1125x1961)
1.37 MB
1.37 MB JPG
>>102295938
>reddit drama
Must resist shit stirring…
>>
>>102295973
Yeah, who knew we could make local models so much better just by using a Claude proxy, revolutionary!
>>
when i was fucking 9 years old i could have set up a torrent by myself. And any normal adult should be able to google for instructions and do it in 5 minutes. 100% a scam and he's stalling for time before the inevitable rugpull
>>
File: mysterious stranger.jpg (72 KB, 900x900)
72 KB
72 KB JPG
Listen! He has a plan and he is learning!
Trust the plan! The torrent will work perfectly and we will have FOUR TIMES THE INTELLIGENCE! SIXTEEN TIMES THE CONTEXT!
>>
are people actually pretending to believe this obvious scam now just to be a contrarian in ell emm gee
>>
I reflect myself.
>>
>>102296000
it's clearly impotent damage control
>>
>>102296000
Yes. Why? Because funny.
>>
>>102295759
>y'all
Go back to twitter, tourist nigger.
>>
>>102296013
baited for this EXACT response
>>
File: Oh no.jpg (80 KB, 616x353)
80 KB
80 KB JPG
>>102296000
You don't believe us?
>>
File: sHt5q.jpg (49 KB, 349x642)
49 KB
49 KB JPG
>>102296017
>>
>>102296036
baited for this EXACT response.
>>
>>102296000
Every openrouter employee is furiously spamming copium because they know in a few days the two neurons in redditor’s heads will fire together and they’ll realize they’ve been glazing the worst shit to happen to open source AI since OpenAI went for-profit
>>
>>102296000
>contrarian
No, just a jester. This shit's funny, worth the laugh.
>>
>>102295487
I despise scammers. I hope you and your whole family dies of cancer you scum piece of shit.
>>
>>102296096
based

stop extending charity and good faith to dark triad people, exploiting people's desire to be nice and assume good intentions is how they get power and resources
the only thing that makes them go away is consistent open hostility, it's the only thing they understand
>>
>>102295487
poo in the loo
>>
>ai expert created sota ai in his basement
>incels are seething because he spent youth having sex instead of seeding torrents
>>
>>102296096
Just because your low IQ has made you fall for scams in the past, don't take it out on projects that are potentially being sabotaged like this. Think for yourself for once.
>>
>>102296142
Nah, they spin that into drama and triangulation. You have to just ignore.
>>
>>102296182
If only I fed the torrents enough...
>>
>>102295487
A dataset based on 4o and Sonnet would not saturate the model enough to generate identical responses, retard kun.
>>
>>102296183
lol sabotaged
His gpu is oppressing him eh
>>
>>102296183
Shut the fuck up you dumb nigger.
>>
File: e71csgcrjond1.png (53 KB, 1562x282)
53 KB
53 KB PNG
LMAOOOOOOOOOOOOOOOOO
>>
>>102296368
KEK
>>
>>102296368
OH NONONONOO REFLECTIONHEADS
>>
>>102296368
>raj
OH NO NO NO
>>
>>102296368
One Llama to compose them all.
>>
>>102296368
>>102296379
>>102296389
>>102296396
>>102296399
go back
>>
there's no way we're still talking about the same shit.
>>
>>102296478
there's literally nothing else to talk about, local is in stasis until strawberry drops and we start fine tuning on its outputs
>>
XD
>>
>>
>>102296478
its funny af and only a couple days old
>>
>>102296542
absolute shambles lol
>>
>>102296426
eat shit matthew
>>
>>102296478
we NEED to talk about something
>>
Huge announcement coming in the next thread.
>>
>>102296794
mi*u finally going to off himself
>>
>>102296794
World record for most advanced and complex Mikusex?
>>
>>102296794
torrent is created but never seeded?
>>
uploading the weights to huggingface for real now
I'm single handedly going to save open source
eat shit faggots
>>
>>102296939
>>102296939
>>102296939
>>
File: 1725848397726.gif (1.42 MB, 256x320)
1.42 MB
1.42 MB GIF
>>102296922
>>
>>102296922
ads


