No LLM talk. Actual straight up machine learning. I want a real person to talk to about this, share ideas. How can I find people IRL, or here, or maybe Discord or IRC?
I am struggling hard right now with hyperparameters, how did you cheaply find yours? I am so upset right now. What better GPU can I get, my 3090 is eating shit: each update takes 1 second, and 0.6 of that second is CPU encode. Is it worth moving the encode to the GPU, or is this idea retarded?
Worst of all is the CPU bottleneck: rollout takes like 100 seconds with 12 agents. I have a 5900, and I wish someone had told me logical cores are a meme when it comes to real work. What's very important is per-core performance, and this CPU eats shit, so much so that I am beginning to think about a hardware upgrade, but for now I need to get by with the shit I have.
Would be cool to hear people's ideas, especially around communication. Maybe there is like a support club where we can sit in a circle and vent. I am in NYC if anything. Early shalom shabbat
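Whether moving the encode to the GPU is worth it can be sanity-checked with a back-of-the-envelope Amdahl's-law estimate before writing any code. A minimal sketch; the 10x speedup and the transfer cost are made-up placeholders, not measured numbers:

```python
def update_time_after_offload(total_s, encode_s, speedup, transfer_s=0.0):
    """Estimated wall time of one update if the CPU encode is offloaded.

    total_s: current wall time of one update
    encode_s: portion of that spent in CPU encode
    speedup: assumed factor by which the offloaded encode is faster
    transfer_s: assumed extra host-to-device copy cost per update
    """
    return (total_s - encode_s) + encode_s / speedup + transfer_s

# OP's numbers: 1 s per update, 0.6 s of it CPU encode.
# With a hypothetical 10x GPU encode and 50 ms of extra transfer,
# the update still shrinks to roughly half:
t = update_time_after_offload(1.0, 0.6, 10.0, transfer_s=0.05)
```

The point of the estimate: even an infinitely fast encode can only cut the 1 s update to 0.4 s, so the rollout (100 s) dominates either way and deserves attention first.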
One of the most retarded and painfully underage threads in recent memory. Off yourself, kid
>>108775508
don't bully me retard, if your IQ is not ML-tier you should be on /pol/
>>108775482
Assuming you mean preprocessing by "encoding", then how about you use your brain and fix your pipeline? You already said it takes 60% of the time, which it shouldn't. Ever.
Also, I highly doubt core count matters for this at all. Logical cores, aka a partially duplicated pipeline, are there to better utilize the physical core, and their usefulness is of course highly application-dependent.
But that really doesn't matter: you're going to be massively bottlenecked by your memory subsystem anyway. If you're unaware of this simple fact of modern architectures, you might have skipped a couple of steps in your overeager attempt to play around with ML tools. It's a bit of a meme, but you need better fundamentals if you want to properly utilize the hardware you've got. If you want to skip that, then you need sufficient funds to offset your deficiency in that regard.
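One standard way to stop an inline CPU encode from eating 60% of the step is to overlap it with the update rather than run it serially. A framework-free sketch using a background thread and a bounded queue; the `preprocess` callable stands in for whatever the actual encode is:

```python
import queue
import threading


def prefetch(batches, preprocess, depth=2):
    """Yield preprocessed batches, running `preprocess` on a background
    thread so the next batch is being encoded while the caller is busy
    (e.g. doing the GPU update). `depth` bounds how far ahead we run."""
    q = queue.Queue(maxsize=depth)
    SENTINEL = object()

    def worker():
        for b in batches:
            q.put(preprocess(b))
        q.put(SENTINEL)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = q.get()
        if item is SENTINEL:
            return
        yield item


# toy demo: the "encode" just doubles each value
out = list(prefetch(range(5), lambda x: x * 2))
```

In a real PyTorch pipeline the equivalent knobs are `DataLoader(num_workers=..., pin_memory=True, prefetch_factor=...)`; the sketch just shows why overlapping hides the encode cost instead of paying it on the critical path.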
>>108775574
yeah, my fundamentals are bad, and while that way makes sense for most people, I cannot operate like that; I can only learn by jumping in neck deep and figuring my way out... 60% may be bad, but maybe I can give some context as to why that is: with my weights loaded and a relatively modest batch of ~350, I am at 24GB VRAM (not spilling into system RAM, there's like a 300MB buffer). So 25 updates per iteration, for example, take around 25-27 seconds, half of it encoding, but the whole iteration takes an additional ~100 seconds just to sample data. I tried running it asynchronously, but the GPU is still bottlenecked and the final wall-clock time is the same.
For some reason talking to AI does not help with any of this. I've got everything stable except that I cannot figure out a valid environment for hyperparameter search. I want to do it cheaply, but I can't. I am trying to validate several different configs for evaluation, but they suck: I tried a frozen buffer to avoid rollouts, but the results are garbage; an online buffer appears to be stronger.
>>108775482
You might want to provide more details on what kind of model you are training. I work in computer vision, so I'm not sure why you are hitting CPU bottlenecks, but typically I train/test on a very small subset of the data to optimize hyperparameters.
Also, how do you people even continue to post with such a horrible captcha system? It's even worse than it used to be; you guys must be some serious no-life losers.
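The small-subset approach above amounts to a two-stage filter: score every config cheaply, then rerun only the survivors in full. A minimal sketch; `train_eval`, `subset_frac`, and the toy scoring lambda are placeholders, not anyone's actual training loop:

```python
def cheap_search(configs, train_eval, subset_frac=0.1, top_k=3):
    """Stage 1 of a two-stage hyperparameter search: rank every config
    by a cheap evaluation on a small data fraction, keep the best few.
    Stage 2 (not shown) reruns only those on the full dataset.

    train_eval(config, frac) -> validation score, higher is better.
    """
    ranked = sorted(configs,
                    key=lambda c: train_eval(c, subset_frac),
                    reverse=True)
    return ranked[:top_k]


# toy demo: configs are learning rates and the "best" one is 1e-2
lrs = [1e-1, 1e-2, 1e-3, 1e-4]
best = cheap_search(lrs, lambda lr, frac: -abs(lr - 1e-2), top_k=1)
```

The concern OP raises later (overfitting hyperparameters to the subset) is exactly why the full-data rerun on the survivors is part of the method, not optional.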
>>108775482
>I am struggling hard right now with hyperparameters, how did you cheaply find yours
Shouldn't you be able to put your hyperparameters in autograd like your normal parameters and let it find them for you :-)
>>108775539
you can read a few books from the past (before the mid-2000s) which have not been read by these current 'ai' wizards, nor by you
hence he's got a point
>>108775482
>Actual straight up machine learning.
>I want a real person to talk to about this, share ideas, how can I find people IRL or here or maybe discord or IRC.
You don't have to lurk for 2 years anymore. Literally talk to any GPT and let it teach you the absolute basics until you are at the level where people bother to listen to you.
get to reading
>>108776979
this is old as shit. why would you recommend something so dated? you clearly don't know enough about the subject. God, if you're honestly not trolling, you're probably some pretentious wannabe academic fuck that got a masters but was too pussy to get a PhD, and even then you probably can't do shit in PyTorch. Holy fucking shit, the deep learning section is like reading walls in ancient ruins.
OP, you might find some O'Reilly books useful just for general fundamentals; they skip a lot of academic jargon and are usually written by professionals in the field. They're also relatively easy to find on websites like libgen:
>AI and ML for Coders in PyTorch
>Deep Learning with PyTorch, Second Edition
if you're looking to make yourself employable and prep for interviews where they want models in production, then I would strongly suggest Chip Huyen's ML book and AI Engineering book
>>108777168
Not him, but there was a re-release and update recently-ish, in 2022-2023.
https://probml.github.io/pml-book/book1.html
https://probml.github.io/pml-book/book2.html
>>108776053
my captchas are easy, they are hard when you are a bad poster
my model is not vision, it's weather related. aren't you worried about overfitting your hyperparameters with a small dataset? my dataset is like a 3GB npy file with some metadata files, fp16. i do 12 rollouts of 2000 steps. the size of the dataset is to accommodate several years, but the obs window is one year; if i lower it i would be overfitting hard, not sure that's a good idea..
>>108776060
>hyperparameters in autograd
i don't think i can do that, i asked gpt, it needs like fixed-point assumptions or some bs which is incompatible here, also my outputs are discrete
>>108776077
i only read AI outputs man
>>108776420
lol I think i am close, i think this is the last frontier for me. next challenge will be machine vision, i do not really get it right now, I know a lot about TCN dilations etc but nothing about CNNs.
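Side note on the 3GB `.npy` file: if each rollout worker loads or copies it, that alone can cost CPU time and RAM. `np.load` with `mmap_mode='r'` maps the file read-only so workers share one copy and only the observation windows they slice actually get paged in. A toy sketch with a tiny stand-in array and a made-up temp path:

```python
import os
import tempfile

import numpy as np

# tiny stand-in for the real ~3GB fp16 observation file
path = os.path.join(tempfile.mkdtemp(), "obs.npy")
np.save(path, np.arange(12, dtype=np.float16).reshape(3, 4))

# mmap_mode='r' maps the file instead of reading it all into RAM;
# multiple rollout workers can share the mapping for free
data = np.load(path, mmap_mode="r")

# materialize just one observation window; only these rows are read
window = np.asarray(data[1:3])
```

This is a sketch of the general technique, not a claim about OP's pipeline; whether it helps depends on whether the workers currently duplicate the array.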
>>108777168
not looking to be employed, that would distract me from my projects. consult, maybe. I was a PM, i got tired of being diplomatic. I am using pytorch, i'll look at what you are suggesting. i just really cannot operate this way, i go to the goal, what i want, and work my way backwards, ultra-agile. thanks for the suggestions. i have some research papers saved that i've been meaning to read, i feel like they have better information. this one is open in the adjacent tab:
>Evaluating hyperparameter optimization on the generalization of deep reinforcement learning algorithms
I guess my issue is the experimental nature of hunting for a cheap setup that can do an effective search. I used the Sobol method: 256 configs, 6 iterations each, frozen replay buffer to avoid expensive rollouts. That blew. So right now I'm cross-validating those results against a live replay buffer: I took the 6 best configs and 6 rejected configs and am running them head to head, this time with 60 updates and a live buffer, to see if my Sobol run was set up correctly. If all 12 come out the way the Sobol ranking predicted, then I can keep using the cheap method; if they fail, then I have to switch things up: maybe run 64 configs instead of 256, use an online buffer (which will make my iterations 170s long), and only do like 20 iterations max. I think with that setup I could do each exploration in under 24 hours.
I have no idea how you guys don't have issues with CPU bottlenecks. My GPU is basically idle for 100 seconds, then pegged at 100% for 60 seconds, and like I said, I tried to run asynchronously, the wall clock didn't change, so I went back to sequential. I am as optimized as I can be, and I optimized for max GPU utilization. My ratio is just under 1:1; if you are not bottlenecking, you might be overtraining on stale data, at 2:1 etc.
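For reference, the Sobol draw itself is a few lines with `scipy.stats.qmc`. A sketch assuming three hyperparameters with made-up ranges; the learning rate is sampled in log-space (i.e. the exponent is the Sobol dimension), which is the usual trick for scale-spanning parameters:

```python
from scipy.stats import qmc

# 2**6 = 64 Sobol points in the unit cube, one column per hyperparameter
sampler = qmc.Sobol(d=3, scramble=True, seed=0)
unit = sampler.random_base2(m=6)

# scale to (assumed) ranges for log10(lr), gamma, tau
lo = [-5.0, 0.95, 0.001]
hi = [-2.0, 0.999, 0.05]
configs = qmc.scale(unit, lo, hi)

# back out of log-space for the learning rate column
lrs = 10.0 ** configs[:, 0]
```

`random_base2` is used instead of `random` because Sobol sequences only keep their balance properties at power-of-two sample counts, which fits the 256-to-64 downsizing being considered.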
When you guys do your hyperparameter convergence, do you experience that your alpha converges first, then your lr and tau, and target entropy with gamma converge last, or is it all at once for you?
Damn, this thread is a mess. just read 10 books before posting. You remind me of a cargo-culting indian on meth. Or a Markov chain bot.
t. ML postdoc
Machine learning is a field so wide, with so much ugly, inefficient, obscure python code, that nobody will be able to give you tips and tricks unless you are asking some really basic noob questions. Unless you have somebody actually working with you, you will have to figure everything out on your own.
I know you said no LLMs, but I am vibecoding a project somewhat adjacent to ML: finetuning LLMs when the whole model doesn't fit on a GPU by streaming layers in and out of the GPU.
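The layer-streaming idea boils down to a loop like the one below. `load`, `unload`, and `apply` are placeholders for whatever the framework provides (in torch that would be moving a module between CPU and GPU and calling it), not the poster's actual code:

```python
def streamed_forward(x, layer_specs, load, apply, unload):
    """Run a forward pass through layers too large to hold on the
    device at once: bring each layer in, apply it, then free it
    before loading the next one. Peak residency is one layer."""
    for spec in layer_specs:
        layer = load(spec)      # e.g. move weights host -> device
        x = apply(layer, x)     # run this layer on the activations
        unload(layer)           # e.g. move weights device -> host
    return x


# toy demo: "layers" are just scalars that multiply the activation
out = streamed_forward(1.0, [2, 3, 4],
                       load=lambda s: s,
                       apply=lambda layer, x: x * layer,
                       unload=lambda layer: None)
```

The obvious refinement, for what it's worth, is prefetching layer N+1 while layer N computes, so the transfer cost overlaps with the math instead of adding to it.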
Start with the operating system.
https://github.com/EmptyMonad/bootnn
>>108777892
i'm way beyond that
>>108777870
kek
>>108777883
this is how i feel, extremely alone. this reminds me of when i did research in grad school: the first few months my PhD advisor was guiding me and telling me what to do, often helping me out with common sense, until it clicked and i was able to stand on my own feet. and right now I have to do it alone