A thread for most things statistical, be it finance, engineering, business data or economics, or whatever weird random or pseudo-random process you have that needs explaining. Many complain about this general and say it's shit, but the amount of discussion and thought in it has been very good. Of particular note is the engineer who works in sonar. He is a goat.
What should be in a coding portfolio to show you actually know your shit in statistics?
previous thread: https://warosu.org/sci/thread/16287506
I want to model a process that generates a fixed number [math]n[/math] of events within a fixed (continuous) interval, so my problem is essentially to find a sampling distribution for [math](t_1, \cdots,t_n)[/math] with support [math]0 < t_1 < \cdots < t_n < 1[/math]. The catch is that I want to parametrize this distribution by a real-valued "temperature" [math]\sigma \geq 0[/math] such that [math]\sigma=0[/math] gives the degenerate distribution with uniform spacing, [math]\mathrm{P}\left(t_i= \frac{i}{n+1} \forall i \right) = 1[/math]. I have no opinion on whether [math]\sigma[/math] should be bounded above, and am OK with approaching [math]\sigma\to 0[/math] as a limiting case. Any suggestions for good sampling distributions? An analytic PDF (or MGF) would be ideal. In principle I could always resort to running a thermodynamic simulation to generate a sample each time, but I'm hoping for something more efficient.
>>16320976 Hi, what is the meaning of “support with 0 < t_1 < … < t_n < 1”?
>>16320326 Can we seriously stop this finance and economics shit? I get that people like money, but we have at least two other threads explicitly about finance or economics up already, neither of these is science, and both are really only tangentially related to mathematics. There are only a handful of good threads on this entire board at any given time as it is. The last thing this general needs is to be filled with people who don't actually give a shit about the subject for its own sake.
>>16321079 You are very free to poast about interesting statistical analysis. If you know how to use the ATUS, I would be happy to know.
>>16320976 how come you standardize the timesteps like that? I can understand this for some toy models but are you using this outside of your homework?
>>16320976 Midwit here. How do I get at least this good at modeling?
>>16323072 My original problem was to model a security patrol at a facility with evenly spaced checkpoints (where timings are logged); the standardization is just a way of abstracting away irrelevant factors. Anyway, I've come up with the simple idea of using a mixture distribution to interpolate between the degenerate distribution and the uniform distribution on (0,1), where the mixture weight can be something like [math]e^{- \sigma}[/math]. But I have no justification for using this other than mathematical convenience.
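A minimal sketch of that mixture idea, assuming "the uniform distribution" means n i.i.d. Uniform(0,1) draws sorted into order, and taking the weight on the evenly spaced grid to be e^{-sigma}. The function name `sample_mixture` is made up for illustration. One thing the sketch makes visible: every draw is either exactly the grid or fully uniform, with nothing in between, which may or may not be the behaviour wanted at intermediate sigma.

```python
import math
import random

def sample_mixture(n, sigma, rng=random):
    """Draw (t_1, ..., t_n) from a two-component mixture (illustrative only)."""
    if sigma < 0:
        raise ValueError("sigma must be non-negative")
    # Assumed weight on the degenerate (evenly spaced) component.
    weight_grid = math.exp(-sigma)
    if rng.random() < weight_grid:
        # Degenerate component: t_i = i / (n + 1).
        return [i / (n + 1) for i in range(1, n + 1)]
    # Uniform component: sorted i.i.d. uniforms, i.e. uniform order statistics.
    return sorted(rng.random() for _ in range(n))
```

With sigma = 0 the weight is 1 and the grid is always returned; as sigma grows the grid component vanishes.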
>>16323180 If you are going to model a real problem, it would be better to fit the data in order to get a model that you can finetune. Instead of getting a model first, go data first. Because then you will find anomalies which could make your model break.
>>16323186 I fully agree with going data first if the problem is real, which it was originally. But it no longer is, since I've left the facility and lost access to the data, and the only interest I've retained in the problem is mathematical. That's why I pitched it as a math problem ITT.
Alright I've got a big one but I'm too stupid to figure it out.
Let's imagine all humans have to fight each other to find out the strongest.
Assume the following:
>gender ratio of 1:1
>fights are randomized
>strong always wins against weak
>strength is a normal distribution
>female strength distribution mean is shifted two standard deviations to the left of the male strength distribution mean so that 5% of males are weaker than the average female
Now I want to calculate the expected number of rounds the average female (and average male) survives.
Now for round 1 it's easy:
P(average female survives) = P(opponent is female) * P(female opponent is weaker) + P(opponent is male) * P(male opponent is weaker)
= 0.5 * 0.5 (since she's the mean, so by symmetry of the normal curve half are weaker than her) + 0.5 * 0.05 (male curve from -inf to -2SD) = 0.25 + 0.025 = 27.5% to survive the first round
but after that it gets hard
I'm not sure what to do
I should somehow calculate how many females below her have been eliminated before I can calculate her probability of surviving the next round, but that entails some sort of integral or summation
any advice?
>>16324096
>all humans
Do you mean the current finite world population, or the theoretically infinite set of all humans for all time? If the latter, then a finite number of rounds isn't going to change the initial gender ratio, and you can just use the geometric distribution with your round 1 probability. If the former, the actual population size matters (since it upper-bounds the number of rounds) and you're better off just running a Monte Carlo simulation or something.
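A rough sketch of the infinite-population case, under assumptions not stated in the thread: single elimination, each round's opponent drawn at random from the previous round's winners, unit standard deviation, female mean 0 and male mean 2. Under that reading the opponent in round k has already won k-1 fights, so their strength behaves like the maximum of 2^(k-1) independent draws from the starting population, and the chance of winning round k is F(s)^(2^(k-1)), where F is the mixed starting CDF and s is the fighter's strength. That would make the per-round probability shrink quickly instead of staying constant as in a geometric model. Also worth checking: the normal CDF at -2 SD is about 0.023, not 0.05, so the round 1 figure comes out near 26% instead of 27.5%. The function name `expected_wins` is hypothetical.

```python
from statistics import NormalDist

def expected_wins(strength, rounds=20, shift=2.0):
    """Round-1 win probability and expected number of fights won (sketch)."""
    female = NormalDist(mu=0.0, sigma=1.0)
    male = NormalDist(mu=shift, sigma=1.0)
    # CDF of a random member of the starting 1:1 population at `strength`.
    f0 = 0.5 * female.cdf(strength) + 0.5 * male.cdf(strength)
    p_reach_and_win = 1.0
    total = 0.0
    for k in range(1, rounds + 1):
        # Opponent in round k is the max of 2**(k-1) starting draws.
        p_win_round = f0 ** (2 ** (k - 1))
        p_reach_and_win *= p_win_round
        total += p_reach_and_win  # adds P(wins at least k fights)
    return f0, total

avg_female = expected_wins(0.0)
avg_male = expected_wins(2.0)
```

For a finite population the same logic breaks down (the fighter's own bracket matters), which is where a Monte Carlo run would be the safer route.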
>>16325705How is this a vanity thread?
>>16324096markov chain you niggerrrr
>>16326135jolly black gentleman if you may
>>16326135>>16327706Roodypoos, please.
>>16327801Roodywhat?
I think the most interesting statistical problems are either in very rarely occurring events or finance.
>>16329581 Statistics of non-stationary processes can be quite involved as well. These show up all the time in engineering and state estimation.
>>16327801kek
Can I get a quick rundown on auto insurance pricing and why time spent driving isn't a factor? Maybe there are companies that do track your time spent driving and adjust your rates off that, but they aren't mainstream.
>>16331653 If I had to guess, a lot of it comes down to not trusting the customer's reported metric. Any metric like that you could use would be gamed before it could be properly useful.
Is this the right place to talk about machine learning?
>>16332145 It was a few weeks ago. What do you want to know about machine learning?
>>16332145 Honestly yeah.
>>16329581 That's because we know very little about heavy-tailed distributions.
>>16334284 Extreme value statistics, or EVT (extreme value theory), are actually very interesting. Applicable to flooding, other natural disasters and financial crises.
>>16334307 I don't know what the word "actually" means here. Is it an implication that we actually do know a lot about heavy tails? Because we have very little more than basic foundational results. If it's a word which is used for emphasis in agreement + adding an additional comment then I agree that they are very interesting, but in part this is because we know very little about them despite their properties.
>>16320326How does it feel studying a field equal to astrology? Hume has debunked your entire field.
I have a web poll that asked people to rank 11 categories. The order of the categories was initially randomised, and respondents had to slide them around until they were satisfied. 1st place is worth 10 points and 11th is worth 0 points. It received 160 responses, meaning the total points scored are 8800. Looking at the results, some of the categories have quite clear differences, while a few others are very close to their neighbours, with the closest pair having only about a 1.8% difference in points. I want to determine the minimum difference between two categories that would be statistically significant rather than plausibly due to chance, but I don't know the proper way to calculate it.
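One possible way to approach this, sketched under the assumption that the raw per-respondent rankings are available (not just the point totals): for a given pair of categories, take each respondent's score difference and run a paired sign-flip permutation test. Under the null hypothesis that the two categories are interchangeable, each difference is equally likely to be positive or negative. The function name and the `diffs` input are hypothetical. With 11 categories there are 55 possible pairs, so some correction for multiple comparisons would probably be needed if many pairs are tested.

```python
import random

def paired_sign_flip_pvalue(diffs, n_perm=10000, seed=0):
    """Two-sided p-value for 'mean paired difference is zero' (sketch).

    diffs: one entry per respondent, score of category A minus score of B.
    """
    rng = random.Random(seed)
    observed = abs(sum(diffs))
    hits = 0
    for _ in range(n_perm):
        # Randomly flip the sign of each respondent's difference.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

A small p-value for a pair suggests the gap between those two categories is unlikely to come from chance alone; the size of gap that clears a given threshold depends on how variable the individual differences are, so there is no single cutoff in points that applies to every pair.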
>>16334347 Can you explain what you mean by us "not knowing" about heavy tailed distributions? What do you in particular want to know about them? In signal processing we deal with heavy tailed distributions all the time (e.g., Cauchy distributed random signals emitted from a moving target, log-normal distributions for received mean-power in a processing FFT/DFT band). Can you give an example of where you'd like to know more about a heavy tailed distribution?
>>16334366 Can you explain what you mean? How has Hume debunked the study of probability, uncertainty or statistical tendencies?
>>16320992 I think that it means some positive area under the distribution between 0 and 1 and the zero function otherwise.
>>16326135>>16327706>>16327801Nigériens, s'il vous plaît...
>>16320372 Most of all, implement methods from new papers that haven't been implemented in your favorite programming language yet. A lot of new papers don't come with the code for implementing the method unless you personally email the authors. Make sure you know your numerical methods so your implementation isn't shit.
>>16337998
> Make sure you know your numerical methods so your implementation isn't shit
It will be, mate. No avoiding it. The only way to make a not shit implementation is to make a bunch of shit implementations and work on making them not shit as you learn more about why they are shit.
>>16337998 The hardest part of this, I found, is that the data is extremely difficult to find. The programming, the implementation, or even testing out several methods is not the issue. The problem is finding the data, and finding that it actually has something interesting in it. The worst thing that can happen is that you find something super interesting, like financial data that is hard to come by, and there is nothing in there. It's like an NPC fat fuck with zero internal monologue. Just void of anything, literally several GB of noise.
Can someone explain the Bayesian wager to me? I have tried so hard to accept the colored balls in a bag and the goat behind the door examples but I just can't. It feels like the lizard brain just reasserts itself. The closest I ever got was when some guy said, imagine there were 1 million doors, you picked 1, and the host opened 999,998 doors and asked you to switch doors. But I STILL couldn't accept it. Even if I simulate with code I just fucking can't. What the fuck?? Is there a GENE that selects for a frequentist brain or something??
>>16338287 Hands down the funniest shit I have read on /sci/. Do you smoke the devil's lettuce?
>>16338287 It's just a question of prior information vs. the original probabilities. For example, let's take the Monty Hall problem with three doors, the classic. Let's assume that the probability of the prize being behind each door is equal before any of the doors are opened. When the host opens one of the doors and shows you that it isn't the one with the prize, this contains conditional information, but doesn't change the fundamental probabilities for the prize location still being 1/3rd for each door. If you look at the _conditional_ probability of doors 1 and 2 given that you know 3 isn't the one with the prize, it's now 50/50. The probability of door 1 containing the prize on its own, not conditioned on anything, doesn't incorporate any of this knowledge of door 3 being opened and not having the prize behind it. It's all about whether or not you are incorporating that evidence from the observations of one event into the probabilities of the others.
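A small simulation sketch of the three-door game, since what exactly gets conditioned on matters here. It assumes two hypothetical host behaviours: a host who knows where the prize is and always opens an unpicked goat door (the usual Monty Hall rules), and a host who opens one of the two unpicked doors at random, with games where the prize gets revealed thrown away. The 50/50 figure corresponds to the second case (conditioning only on "the opened door had no prize"); under the usual rules the host's choice carries extra information and switching should win about 2/3 of the time. The function name `monty_hall` is made up.

```python
import random

def monty_hall(trials, host_knows=True, seed=0):
    """Return (stay win rate, switch win rate) over the games that count."""
    rng = random.Random(seed)
    stay_wins = switch_wins = valid = 0
    for _ in range(trials):
        prize = rng.randrange(3)
        pick = rng.randrange(3)
        if host_knows:
            # Host deliberately opens an unpicked door without the prize.
            opened = rng.choice([d for d in range(3) if d not in (pick, prize)])
        else:
            # Host opens a random unpicked door; skip games that reveal the prize.
            opened = rng.choice([d for d in range(3) if d != pick])
            if opened == prize:
                continue
        valid += 1
        other = next(d for d in range(3) if d not in (pick, opened))
        stay_wins += (pick == prize)
        switch_wins += (other == prize)
    return stay_wins / valid, switch_wins / valid

print(monty_hall(100_000, host_knows=True))
print(monty_hall(100_000, host_knows=False))
```

Comparing the two printed pairs side by side is one way to see that the answer depends on how the host picks the door, not only on which door ends up open.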
>>16338501 on ocassion
>>16338510
>ocassion
You need to puff some more. You are not fully a vegetable yet.
>be TA'ing a thermo/kinetics/stat mech PChem course
haha when I was in grad school we called it sadistical mechanics
>haha but I won't do that to you guys :)
>proceeds to give them problems lifted from the grad level stat mech psets involving Langmuir's isotherm and the Einstein solid because "they're gonna use chatGPT anyway"
Holy shit, what is it about statistics that knocks a screw loose in people's heads?
>>16338287 It's just a technicality thing. I would still hold to my original answer despite it being one of many instead of one out of two. It doesn't consider that a human can be sure and hold true to their initial decision.
>>16338789Statistical mechanics and physical chemistry aren't statistics. They're physics but even more gay and autistic.
>>16320802 thanks
>>16338982>Statistical mechanics and physical chemistry aren't statistics. They're physics but even more gay and autistic.Wow, sounds like someone needs a crash course in both science *and* respectful language. Last I checked, understanding entropy doesn't require a specific sexual orientation or neurotype! Maybe spend less time labeling things and more time exploring the amazing world of atoms and molecules?
>>16321079 stay poor
>>16344811 LOL! This meme is straight fire! I am ROFL. Hope you have a great day :) Happy statistics-ing!
>>16345261 Physicists don't understand entropy. For them to understand entropy they'd have to actually take a proper probability theory course, learn how signed measures work, and learn information theory. What you call entropy in statistical mechanics is mostly just combinatorics assuming uniformity for the point mass distributions over the space. Physics also is barely a science. At the theoretical level it's axiomatic math pretending to be science, and then they spend billions of dollars constructing particle accelerators in a desperate attempt to find subatomic particles which validate their symbolic paper shuffling. Physics is literally the worst of both worlds. It is far too axiomatic and rigid to be practical and usable in reality (hence why engineers have to do just about everything for you fuckers), yet it is not actually rigorously well founded like proper mathematics.
>>16345793 I hope you have a good day too :^)
>>16332145 No, machine learning is related to statistical physics, which is not statistics.
>>16346180
> Machine learning is related to statistical physics.
I hope you are just misunderstanding how entropy (more realistically mutual information and KL divergence) works within ML. If you genuinely believe statistical mechanics has any major role to play in the supermajority of ML, you're off your rocker. Maybe in some aspects of physics-informed ML/NNs, but even then, that's mostly ODE/ergodic systems stuff, not statistical mechanics.
>>16346117Absolutely retarded take. This kind of take only ever gets repeated as a gentle joke by good mathematicians and gets repeated seriously by idiots and arrogant undergrads who don’t understand why physics doesn’t care about what they don’t care about. The idea it’s anything like axiomatic math is stupid too. Why would an empirical science have any use for axioms? Nature can at any point show you you need different ones and there goes your results. Applying a mathematician’s mentality to physics or any empirical science would be disastrously stupid.
>>16347275
> Why would an empirical science have any use for axioms?
What exactly is Hamilton's principle if not an axiom that is asserted? Or conservation of momentum/energy in general for that matter? The entire notion of determinism for macrophysical phenomena is axiomatically asserted because we cannot meaningfully verify it at any level due to the intractable uncertainties that come with measurements and the vagueness of what exactly a "state" is in practice. I don't think you understand what theoretical physics actually means if it is to be taken seriously. If you take it as an inferential modeling discipline, then it is fine on its own. If you take it as some sort of set of "laws" (as if it represents some kind of source code of the universe) then you genuinely misunderstand how the whole enterprise works. Unfortunately, many in physics do seem to believe in this nonsensical "source code of the universe" business.
>>16347314 Hamilton’s principle is not an axiom. Neither are conservation laws, which do not even hold in all circumstances. We have plenty of physical systems which do not have conservation laws at all. I don’t consider your opinion on physics seriously at all if you’re so uneducated on it that you don’t even know this. I think you should stick to giving opinions on subjects you understand a bit better and not be so concerned about others’ understanding of a field you don’t.
>>16347324 Hamilton's principle is essentially treated as an axiom when people are doing classical mechanics. If the trajectory is not a saddle point on the constrained variational, it is discarded as a potential candidate for the true path despite the whole business of defining variationals being an approximation. In fact, the whole business of defining states on a continuum is in some sense a business of approximation. It's a necessary approximation, but it's an approximation. I know you think I'm uneducated, but this problem of physicists confusing the map for the territory shows up quite a lot in my field (sonar and statistical signal processing). You'd be amazed and disheartened by the number of physics graduates who genuinely don't understand what it means that the "laws of physics" were written by humans (i.e., they are our best guesses using our inferential modeling processes, not something transcendent of human capabilities).
Please explain this using a statistical approach to Brownian motion.
>>16347351 If this is a response to >>16347333 then unfortunately I don't have much of an answer. This is the basic mechanism for turbine-based power plants, right? Do you mind explaining what you are getting at? I don't understand the point you're making.
>>16347333 No, it isn't an axiom. It was something verified through experiment to work first for light and then for a large class of other systems where we know that it is equivalent to any other kind of formulation of Newtonian physics. It isn't an axiom, because if experiment shows that it doesn't work it is discarded. In the same way that the constancy of the speed of light in inertial frames will be discarded if shown experimentally to be false. I don't think you're uneducated, but you're getting the basic stuff you're talking about wrong when you talk about physics, and then accusing physicists of having a lack of understanding of their own field. You accuse it of not being a science while demonstrating you don't actually understand the field, and also making incorrect claims about the field. This isn't really the thread for it, but suffice it to say, everything you've said about physics is either based on a misunderstanding, or is just outright wrong.
>>16347375 It has an ammonium loop and a diethyl ether loop. The ammonium loop extracts heat from the environment by compressing gas, which boils the diethyl ether; that is run through the diethyl ether (gas phase) loop, and that produces more output than the first loop requires to function. It works by artificially creating a low-temperature point that, even by Hamilton's principle, is heated by the environment, because it's cold from the expanding gas that passed its heat to the turbine loop. I am trying to find equations that characterize the heat everywhere, because I'm not sure how to estimate the compression of the gas and the turbine rotation driven by it. But quick thermomechanics: (heat pump = 400% efficiency) * (turbine = 60% efficiency) = power output. It's literally the cleanest source of energy I can imagine: GLOBAL COOLING.
>>16347445
> It was something verified through experiment to work first for light and then for a large class of other systems where we know that it is equivalent to any other kind of formulation of Newtonian physics.
This just fundamentally isn't true. It is a modeling prescription for how one models the dynamics of a system. It is no more verified by observing particular cases of systems following models derived from this principle than the theory of integration is by measuring the area of a plate with a ruler. Regarding the question of "when do we throw out physical models if they fail to verify experimentally", that opens up a whole host of subtleties that sit at the heart of the divide between experimental physics and theoretical physics.
>>16347716
>This just fundamentally isn't true.
I recommend reading a book such as Goldstein to correct this misunderstanding. We both know that in the Newtonian regime it gives Newton's laws, and we also first got this idea from looking at light, which had experimental evidence for it. It is also never considered an axiom, since we know from quantum mechanics that the principle of least action is not exact. This should show why it is ridiculous to consider it an axiom of physics when all physicists know it is an approximation. This should be a fact which forces reconsideration of this abjectly wrong claim that it is an axiom.
>It is no more verified by observing particular cases of systems following models derived from this principle than the theory of integration is by measuring the area of a plate with a ruler.
This is why we would never consider it an axiom. Again, in particular, because we have a well tested theory (quantum mechanics) in which the principle of least action can only be considered an approximation. It just isn't an axiom. It's something that we know works within very well known limits. Frankly, I don't believe you have the domain knowledge necessary to argue the point you want to make.
>>16348708 You know what man, fair response. My experience comes from frustrations working with physicists in my field (which is not physics) and the ideological rigidity I see in the people I've worked with. I'm certainly not a domain expert in physics (though I have worked through about 60% of Landau I, and a little under half of Landau II). If you think I'm missing something, I'll take your word for it.
>>16348731 I understand your frustration and I kind of get where you're coming from with regards to some physicists having a very bad attitude, which I certainly agree with. Honestly I really respect the willingness to change your mind and I wish you all the best.
>>16348741 I would say the point that drives me up a wall is this very strange insistence on a very rigid form of causal determinism when modeling the motion of a moving body. When you work in radar/sonar you have physics based models, but you always need to incorporate some sort of process noise assumption into the way your modeled object moves (on top of the measurement errors in the measurement equation portion of your model). I've seen so many physicists try to come into this field and believe that with just a bit more sophisticated motion modeling that they can remove the need to consider "process noise" (random errors in the forward propagation model) and it's just the most frustrating thing to deal with because it never works. This rigid "everything becomes deterministic if your model is good enough" (whatever "good enough" means) approach is just a nonsensical way to deal with a real problem where things are not some idealized model on paper or in a simulation.
statistically speaking what is the probability op is a fag?
>>16350574It's a certain event. OP being not a fag is an event of measure 0.
>>16347612 You're missing a boiler feed pump. Deus ex machina determinism (>>16348754) is the advanced version of the perpetual motion machine fantasy.
>>16352238 Now I have another cycle there. A heat pump from the condenser that pumps heat to the heat intake, just after it's preheated by the environment. What's a boiler feed pump?
>>16352253 If turbine 2 runs, condenser 2 must be low pressure and boiler 2 must be high pressure. You need a pump to get the flow to happen in the direction you indicate.
>>16352558 Ouch, the circulation pump. Noted. In the next version of the diagram it'll have its place. The boiler is boiling, and the condenser is making it liquid again.
If there are only two candidates running for office, why can't we model the probability of either of them winning as exactly 50%?
>>16353679 Because probability is conditional on factors that influence the outcome of the experiment. Not everything is an even coin flip with universally equal probability. If this were the case, the entire statistical discipline of hypothesis testing would be impossible. Think of it this way: you could either spend your afternoon today on Earth or your afternoon on Mars. This is a case of two mutually options, but it should be obvious to you that they don't have equal probability of occurrence.
>>16353756 >>16353679 Mutually exclusive options* oops.
>>16320326 Does anyone have a good book recommendation for statistics with complex random variables (likelihood estimators etc.)?
>>16355092 is it for signal processing? Or any other kind of spectral analysis?
>>16355822 It is in some sense for signal processing. I'm aware of the basics from using circularly symmetric complex Gaussians to represent complex envelope samples from colored random signals. However, I'd like to find something with a more rigorous foundation for when things are not a Gaussian case (which makes a lot of your life easier via symmetry). Ideally, I'd like to see a resource that covers hypothesis testing and parameter estimation when the sample distributions are complex. It doesn't necessarily need to apply to signal processing, and in fact I'd prefer it not (because I already understand the way we do it in signal processing world). The complexity part may not seem all that significant, but it does make a difference as the log likelihood ratio of two complex random variables does not necessarily produce a real scalar. This leads me to believe that there are some complications to things like likelihood ratio tests that need to be considered when complex random variables are used.
>>16350574With a rejection level of 5% and the test statistic of 0,we reject the null hypothesis that op is not a fag.
>>16334366 Don't know what he said, but prob and stats arise from the simple fact that some events are too hard to describe using pure physical causality (or sometimes occur at a scale so big that it's impractical). Sometimes it's good enough to make a statistical model and take blurry, imprecise observations that fit your model to make statements about it (for example, any subatomic physics or any human-influenced event).
>>16355897
"Random Data: Analysis and Measurement Procedures" by Julius S. Bendat and Allan G. Piersol
"Statistical Digital Signal Processing and Modeling" by Monson H. Hayes
"Time Series Analysis: Forecasting and Control" by George E. P. Box, Gwilym M. Jenkins, Gregory C. Reinsel, and Greta M. Ljung
These should cover most bases on the applications and theory of complex-valued statistics that people use.
>>16358287 I actually have a copy of Hayes kicking around but had never really used it. Thank you for the recommendations and I'll look into them. Almost a bit surprising you didn't recommend Middleton's "An Introduction to Statistical Communication Theory" as complex envelope samples are generally modeled via circularly symmetric multivariate complex Gaussians (as far as I'm aware).
>>16358840 We complement each other in this thread anon. I bring one dish to the potluck, you bring another. Then the anons feast on what we gathered. Each one teach one.
>>16357265Lack of digits == Accept Null: OP is not a fag
Can someone explain to me why you can define expected values as recurrence relations?
>>16359048 Because expectation is linear, you can condition on the first step of the process and write the expected value in terms of the expected values of the smaller subproblems that remain. Algebraic manipulation of those equations often strips away a lot of the abstraction and reduces a complex problem to a simple conclusion.
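A toy illustration of the recurrence idea, using an example that is not from the thread: the expected number of fair coin flips until two heads in a row. Let E0 be the expected remaining flips with no current streak and E1 the expected remaining flips after one head. Conditioning on the next flip gives E0 = 1 + 0.5*E1 + 0.5*E0 and E1 = 1 + 0.5*0 + 0.5*E0, which solve to E0 = 6 and E1 = 4. The sketch below solves the pair by substitution and compares against a simulation; both function names are made up.

```python
import random

def expected_flips_two_heads():
    """Solve the two-state recurrence by substitution."""
    # Substitute E1 = 1 + 0.5*E0 into 0.5*E0 = 1 + 0.5*E1:
    #   0.5*E0 = 1.5 + 0.25*E0  ->  0.25*E0 = 1.5
    e0 = 1.5 / 0.25
    e1 = 1 + 0.5 * e0
    return e0, e1

def simulate_flips(trials=100_000, seed=0):
    """Monte Carlo estimate of the same expectation."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        streak = flips = 0
        while streak < 2:
            flips += 1
            # Heads extends the streak, tails resets it.
            streak = streak + 1 if rng.random() < 0.5 else 0
        total += flips
    return total / trials
```

The same pattern (one equation per state, each written in terms of the states reachable in one step) is what the Markov chain approach to expected hitting times boils down to.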
>>16360604Stop samefagging. Economics and stats are a match made in heaven.
>>16361763 Sonar guy here. Can you point me towards a mathematical economics textbook that's taught at the level of an early grad student in the program (preferably not just a reskinned stats textbook but one that's actually got some economics-specific substance in it)? Everything I've seen from mathematical economics seems quite... primitive in terms of the tools they use and I'm wondering when the meaty stuff starts.
>>16362378 nta but why would there be meaty stuff? have you tried looking for textbooks or technical literature about option pricing and dynamic hedging? that's about as 'meaty' as it gets in finance at least
>>16362476 By meaty I mean actually doing some modeling that isn't just variations on some autoregressive Markov chain. Incorporating some real non-stationarity (or at least cyclostationarity) or something more interesting that would reflect the real nonlinear dynamics of social markets. I keep hearing that econometrics gets so challenging and everything I end up finding is just baby's first Markov chains/time-series analysis.
>>16362378 Go away from economics and go into the insurance stuff. Extreme Value Theory is actually quite challenging, both coding wise and math wise. However, you should intuitively understand it because your job is Poisson processes, I think? Fire insurance in particular is a bitch when it comes to modeling the expected losses from the data itself. It can be done, but for it to make sense, you really must know what you are doing.
>>16362801 My job is a lot of large deviations theory, convex analysis, and Bayesian optimization. In terms of the processes it depends on what you are looking for. Arrivals of multipath reverberations are sometimes modeled as Poisson arrivals, and Poisson point processes are basically the standard for "clutter" or "false alarm" modeling.
>>16362378 Mmm...meaty...
>>16320372 Machine learning, like linear regression and decision trees (DT)
>>16323134 Have an IQ in the 140 range
>>16364891 If you can’t comprehend that, then just use Python and learn theory. That’s what I do, since I’m a 100 IQ anti-memeing channer.
I'm up to a score of 0.70 on a kaggle project where 0.60 is bottom of the barrel and 0.80 is top of the line. I've only just begun to fight. Are you guys proud of me?
>>16364890 sure buddy
>>16365742 Yes, continue grinding
>>16365886 I R GRINDIN'
>>16365886 I took a break and eyeballed the loss curve a bit before repeating the training with a little more data and I'm up to 0.71. I'll probably look into pulling more columns from the remaining tables for my next steps.
>>16368716 It's good brother. Just grind.
Should I get a Mathematical Finance masters? How is the job market for this?
>>16371088 Are we talkin a quant or what? You almost have to show the programme in the thread in order for anyone here to tell you if it is good or not.
>>16371088 I wouldn't go as far as to call MFE programs scams but they are definitely cash grabs. Even if your goal is explicitly to become a quant you are better off just getting a real graduate degree. If you look at the listings for entry level quant jobs the most common thing you are going to see by far is something like
>PhD in math/physics/CS from a top university (previous finance experience not required)
>>16371965 I think only very specific programmes are worth it. If you cannot get into a target school, I think an MFE is not the way to go. You should get a more general and applicable degree then, like stats.
I'm currently taking a course in Statistical Inference. The book is also the same name (2nd edition). Currently the course is at chapter 6 of the book with things like sufficient statistics, data reduction...I can understand the materials and understand the lecture but I had to spent a few hours on them and I estimate written homework should take me like 5 hours at least to do.Am I too retarded to study statistical inference and should I drop the course?
>>16373168 Is it your first course? I would say that stats can be a bit tough to chew on in the beginning, but when you grasp the concepts you will see them over and over again in other courses and soon you will see them in the data as well.
>>16373367
>Is it your first course?
Yes, I am taking it with two other graduate courses which are project based, and I feel like I don't get enough time to do all of them because this statistical inference course will take me like 10-20 hours per week. The other two I expect to take 10 hours total, but I also have to do research with my supervisor. He gave me some great books to read and I want to spend at least 15-20 hours per week on them.
Some frog said that P(x > E[x]) = P(x < E[x]), so everything either happens or doesn't happen, so it's 50/50. How true is this?
>>16374318 Ok, if you are not in some undergrad course in stats, then fine. If you come from psychology or some shit, you won't have the math foundation for it. Furthermore, "I work 40 hours a week" is only for top tier 1% geniuses. How much do you want to become successful? Then start to fucking rise and grind and smoke the shareholder crack every day.
>>16374889GeneralSome mix of eglatarian and protypist, WHICH I assumed was something less complex and specific, and this more simple and less specific, such as demiros or something. Then I went on figuring it out with no sense so it's possibly false and worked out already
>>16371890 I'm considering the MSCF program from Carnegie Mellon or the MSQCF program from Georgia Tech, what do you think? I have a math bachelor's so I should be fine.
>>16371965
> Even if your goal is explicitly to become a quant you are better off just getting a real graduate degree.
I don't really aim to become a quant specifically, I'm just interested in the math aspects of finance and would like to work in the industry. Also wdym by a real graduate degree?
Can you guys, like, shut up?
>>16374415 That frog is retarded. There are random variables whose expectation is infinite but whose probability of being finite is 1. The whole "everything is 50/50" thing is a funny joke, but it's only something you'd believe if you don't understand even the basics of probability and where these notions come from.
>>16375490 Alright, but in the real world those random variables don't exist. So in the real world, is it true?
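A quick sanity check of both points, as a sketch. The exponential, Bernoulli and Pareto examples are my own picks for illustration, not something from the posts above. For an exponential the chance of landing above the mean is e^-1, not 1/2, and for a Pareto with shape 1 every draw is finite while the mean diverges.

```python
import math

# Exponential(rate=1): E[X] = 1 and P(X > t) = exp(-t),
# so P(X > E[X]) = exp(-1), which is not 1/2.
p_above_mean_exponential = math.exp(-1.0)

# Bernoulli(p): E[X] = p, and X exceeds p only when X = 1,
# so P(X > E[X]) = p, which can be anything in (0, 1).
def p_above_mean_bernoulli(p):
    return p

# Pareto with scale 1 and shape alpha: P(X > t) = t**(-alpha) for t >= 1.
# Every draw is finite, but for alpha <= 1 the mean is infinite.
# The truncated mean E[min(X, c)] = 1 + integral_1^c t**(-alpha) dt
# keeps growing as the cap c grows.
def truncated_mean_pareto(alpha, c):
    if alpha == 1.0:
        return 1.0 + math.log(c)
    return 1.0 + (c ** (1.0 - alpha) - 1.0) / (1.0 - alpha)
```

Calling `truncated_mean_pareto(1.0, c)` with ever larger `c` shows the mean growing like log(c) without settling, even though no single sample is ever infinite.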
>>16375235 Carnegie Mellon is good. Also University of Michigan, Fordham University and Stony Brook. I would try Stony Brook first if I were you. It's THE target school for quants in the US.
>>16320976 can't you sample from a Markov chain of scaled betas? you can reparametrize them or solve the systems to find your desired parameters. degenerate distribution should work as you request. if you want a pdf, you should try and work with the Dirichlet distribution as your starting point. this is my 2 cents intuition, hope this helps... i wont do your job, it would take days (option B at least, option A is straightforward)
>>16376132 anonymous, your equation is false... also, not statistics. i leave that to philosophers... my pov is that it may be useful to believe the opposite of what you typed
>>16376132 What do you mean by "exist?" Probability is about modeling uncertainties. There are plenty of things to be uncertain about where probabilistic models work better than literally any other approach.
>>16377000 Checked
>markov chain of scaled betas
Do you mean to sample [math]t_{i+1}-t_{i}|t_{i}[/math] as independent scaled betas? That might not be a bad idea actually, since the beta is well-known as a conjugate distribution, although I don't know it well enough to say if this can be related to the Dirichlet (my intuition is that no, the scaling would mess up the kernel, and I'm not getting paid to prove myself wrong either).
>>16377947 Thinking about independent beta [math]t_{i+1}-t_{i}|t_{i}[/math] a bit more: the model works out nicely in case [math]t_{i} \leq \frac{i}{n+1}[/math] falls short of its expected value, because then the beta parameters are completely determined by matching moments to the degenerate distribution (i.e. [math]\mathrm{E}[t_{i+1}] \to \frac{i+1}{n+1}[/math] and [math]\mathrm{Var}[t_{i+1}] \to 0[/math]). But what to do in the case when [math]t_{i} > \frac{i}{n+1}[/math]?
>>16377000 Modelling the [math]n+1[/math] interarrival times with a symmetric Dirichlet distribution (PDF [math]\propto \prod_{i = 1}^{n+1} (t_{i} - t_{i-1})^{1/ \sigma}[/math], with [math]t_0 = 0[/math] and [math]t_{n+1} = 1[/math]) does seem to match my intuition though, with what I initially called "temperature" turning out to be the reciprocal of the single concentration parameter (up to the usual offset of 1, i.e. [math]\alpha = 1 + 1/\sigma[/math]).
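A minimal sketch of how the symmetric Dirichlet idea could be sampled, assuming the exponent 1/σ in the PDF above is read as α − 1 (so α = 1 + 1/σ). The function name `sample_event_times` is made up for illustration. It uses the standard trick of normalizing independent gamma draws to get a Dirichlet vector, then takes cumulative sums of the gaps.

```python
import random

def sample_event_times(n, sigma, rng=random):
    """Draw 0 < t_1 < ... < t_n < 1 whose n+1 gaps are symmetric Dirichlet."""
    if sigma <= 0:
        # Limiting case sigma -> 0: uniform spacing t_i = i / (n + 1).
        return [i / (n + 1) for i in range(1, n + 1)]
    alpha = 1.0 + 1.0 / sigma  # assumption: PDF exponent 1/sigma equals alpha - 1
    # Normalized independent Gamma(alpha, 1) draws are Dirichlet(alpha, ..., alpha).
    gammas = [rng.gammavariate(alpha, 1.0) for _ in range(n + 1)]
    total = sum(gammas)
    times, running = [], 0.0
    for g in gammas[:-1]:      # the last gap runs from t_n up to 1
        running += g / total
        times.append(running)
    return times
```

Small `sigma` gives a large `alpha` and nearly even spacing, while large `sigma` pushes `alpha` toward 1, which is the flat Dirichlet (the same law as sorted uniform draws).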
>>16377173Probably work better than any other approach
>>16378489 so... was it a good idea?
>>16378003 it was just an intuition. ive thought about this a bit. my idea was that u could go for something similar to a compensated poisson process with scaled betas instead of exponentials... maybe just taking the exponentials/gamma from the poisson process thing and rescaling your sample into (0,1) is the easy way...? anyway, try doing this:
a) sample your first increment from a beta(a,b) and multiply by (a+b)/((n+1)*a), call this sample t_1
b) then sample again from beta(a,b) and multiply by (2/(n+1)-t_1)*(a+b)/a
c) repeat etc... etc...
you can set b to 0 and get your desired result. you can just sample a vector of betas and find the "scales" of the increments by doing
E[tau_i] = E[E[tau_{i-1}|tau_{i-1}] + E[tau_i-tau_{i-1}|tau_{i-1}]] = t_{i-1} (u sampled) + scale_i*(a/(a+b)) = i/(n+1)
solve for scale_i and get scale_i = (i/(n+1)-t_{i-1})*((a+b)/a)
set a and b as you prefer or add some other constraint to fix them
tell me if this makes sense to u, i do reinforcement learning, no fluid modeling or whatever u do xD i cant test on my pc either rn
i love that anonymous does statistics <3
>>16379297 Yeah, I think I see what you mean. The issue though is that you could end up sampling t_1 > 2/(n+1), and in this case step b) would multiply by a negative number, causing t_2 < t_1. (So >>16378003 should be errata'd to [math]t_{i} > \frac{i+1}{n+1}[/math].) On the face of it, I can't say for sure that there is zero probability of t_2 < 0 either.
In any case, if I'm going to be summing up betas, then there's no reason not to go straight for the Dirichlet process, and it seems like this idea is already recorded in the literature as the "stick-breaking Dirichlet process" (section 2.2 of PDF related, which is a good overview of the key properties of the Dirichlet).
>>16379438 mh... either my logic has some error or, ye, simply the conditional increment has to be negative. this isnt a big deal in the "whole sample" but it may be odd depending on what ure modeling... u may set alpha or beta to something to make sure that scale_i < (i+1)/(n+1) and fix this problem, but get more computational cost because beta or alpha would depend on your sample
idk - using dirichlet is better and more elegant, if it works, epic
hope ive been useful somehow
good ruck with your job anone
>>16379723 Yeah, I suspect that it could be made to work by carefully scaling and renumbering the events, but a simple process like this should intuitively have a simple representation, and the Dirichlet is probably as simple as it gets. Thanks for giving me the right prompt, have a Miku in return.
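To see how often the negative-scale problem in step b) actually bites, here is a rough Monte Carlo sketch of steps a) and b) as described earlier in the thread. The function name and the parameter choices are hypothetical. Since a beta draw is at most 1, t_1 can exceed 2/(n+1) only when (a+b)/a > 2, i.e. when b > a.

```python
import random

def prob_negative_second_scale(a, b, n, trials=100_000, seed=0):
    """Estimate P(t_1 > 2/(n+1)), i.e. the chance that the step-b scale is negative."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        # Step a): scaled beta so that E[t_1] = 1/(n+1).
        t1 = rng.betavariate(a, b) * (a + b) / ((n + 1) * a)
        # Step b): scale chosen so that E[t_2 | t_1] = 2/(n+1).
        scale2 = (2 / (n + 1) - t1) * (a + b) / a
        if scale2 < 0:
            bad += 1
    return bad / trials
```

With `a=2, b=1` the estimate is exactly zero, while with `a=1, b=3` it should sit near 0.5**3 = 0.125, the probability that a Beta(1, 3) draw exceeds 1/2. So the ordering only breaks for b > a, at least at the second step.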
>>16379045I can't tell if you're making a joke based on the pun (in which case, good one) or if you are implying that there will be some other systematic modeling structure that will replace probability. In some cases possibility theory is a suitable "replacement" for probability theory (though truthfully they look at different concepts of uncertainty). I don't think many of the ideological/philosophical criticisms people levy against probabilistic approaches are solved by possibility theory. They are both fundamentally looking at ratios of set measures to quantify uncertainties. In the probabilistic interpretation we look at measures as a way to quantify uncertainties about certain events. In the possibilistic framework we try to quantify uncertainties about set membership itself for given "events." In this sense, any questions about the trivialization of causality or lack of consistency people point at probabilistic thinking don't get resolved by possibilistic approaches.
Anyone have ideas on statistical methods for sports like hockey?
>>16381867good morning saar
>>16381916uncalled for racism derailing a math thread. you sound like a spic.
>>16381987sorry saar bitch benchod just redeem
>>16381916How is sabremetrics in hockey an indian thing? Are you clinically retarded? Indians have no clue what wintersports are.
>>16382510>sabremetrics
>>16382543>indian
>>16383765lol lmao indians
>>16375455No
>>16365742Yes indeed.
Does anyone know how to apply the theory of stochastic resonance to find hidden patterns in macroeconomic or market data? Asking for an autistic friend. Pic related.
>>16320326 I had a probably stupid idea recently. So essentially, with a standard deck of cards we have 52! unique orders. Theoretically this means it is unlikely anyone has ever had the same shuffle as anyone else. But this is only in a perfect mathematical world. In reality we would likely see a different distribution. Would it be possible to have some person or group of people shuffle a deck of cards and then compare the result to the theoretical one? Has such a thing ever been done?
>>16320326
>Many complain about this general, to say it's shit but the amount of discussion and thoughts in it have been very good
dude I'm thrilled if a math subtopic general can survive. Reminds me of the /pdeg/ over the past two years:
The original: September 2022: https://warosu.org/sci/thread/14879555
could have sworn there was one or two more after this not archived on warosu?
The fate of /pdeg/: https://warosu.org/sci/thread/15061257#p15087700
Revival thread March 2023: https://warosu.org/sci/thread/15299959
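For the idealized baseline, the arithmetic can be sketched like this. The figure of 10**20 shuffles ever performed is a made-up, deliberately generous guess, and the bound only holds if every shuffle is a uniformly random permutation, which is exactly what the post doubts about real shuffles.

```python
import math

orderings = math.factorial(52)   # number of distinct deck orders, roughly 8e67

# Hypothetical, deliberately generous guess at the number of shuffles ever made.
shuffles = 10**20

# Birthday (union) bound: with k uniform draws from N outcomes,
# P(at least one repeat) <= k * (k - 1) / (2 * N).
collision_bound = shuffles * (shuffles - 1) / (2 * orderings)
```

Under those assumptions the bound is astronomically small, so any repeats observed in practice would be evidence of non-uniform shuffling rather than chance.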
>>16388288 If things are relatively stationary you could do a power spectral density analysis and see if you have strongly narrowband contributions (indicating nearly deterministic periodic cycles in your data).
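A bare-bones sketch of that idea, using a raw periodogram written with the standard library only. The series here is synthetic (a 16-sample cycle plus noise), not real macro data, and in practice a smoothed estimator such as Welch's method (e.g. `scipy.signal.welch`) would usually be preferred over the raw periodogram.

```python
import cmath
import math
import random

def periodogram(x):
    """Return (frequencies in cycles per sample, power) for a real sequence x."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]          # remove the DC component
    freqs, power = [], []
    for k in range(1, n // 2 + 1):            # positive frequencies only
        coeff = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                    for t, c in enumerate(centered))
        freqs.append(k / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power

# Made-up example: a cycle of period 16 buried in Gaussian noise.
rng = random.Random(0)
series = [math.sin(2 * math.pi * t / 16) + rng.gauss(0, 0.5) for t in range(128)]
freqs, power = periodogram(series)
peak_freq = freqs[power.index(max(power))]    # a narrowband spike shows up at 1/16
```

A sharp isolated peak like this is the "strongly narrowband contribution" mentioned above. A broad hump or a flat spectrum would suggest there is no near-deterministic cycle to find.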
>>16389539PDEs are cool. Even cooler if they are StochDE.
any nerds around here into NIR spectroscopy? Do you think it's reasonably possible for a math A.S. with zilch chemistry or stats knowledge to develop their own chemometric models for use in plant breeding programs? Asking for a friend
Is anyone familiar with any literature in statistics on formalized approaches to selecting a significance level/alpha value? I'm finding a lot of philosophical and methodological debates about p=0.05 and about not relying on standardized thresholds, but I can't find much regarding how to select a significance threshold in any systematic/formalized manner.
>>16389886>>16388288Ah! There is a book on Spectral Analysis applied by a Japanese Economist. 'Spectral Analysis of economic time series' by Granger and Hatanaka
>>16375490 >>16374415 If I'm not mistaken, the meme about "everything either happens or it doesn't" is a reference to the Principle of Indifference, which just says that in the absence of any reasonable priors, you should assume a uniform distribution over the possible outcomes. In the case of a binary outcome (like a Bernoulli trial), that means assuming p=.50.
https://en.wikipedia.org/wiki/Principle_of_indifference
can any of you fags help me understand how the d_kl distance between two prob distributions is calculated and why it's not symmetric? i am trying to relate it to the difference between two sine waves, whose difference would be in amplitude (but distributions only go up to one), or in phase (which means the difference in means or std dev for the distributions), or in frequency, which imho is the best scenario: how fat or thin one distribution has to be to fit the other. but distributions are apparently not even functions so you can't treat them this way?
>>16391890 The KL divergence? Assuming that X0 and X1 are your two distributions, with densities p0 and p1, your KL divergence is given by D(X1||X0) = E_X1[ln(p1(x)) - ln(p0(x))]. It's not symmetric specifically because you are taking the expectation with respect to the distribution in the numerator. If you take it with respect to the other one, it might not be the same magnitude.
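For a hands-on feel, here is a small sketch of the discrete version of that formula, D(p||q) = sum of p_i * ln(p_i / q_i). The two example distributions are made up. Swapping the arguments changes which distribution does the weighting, which is where the asymmetry comes from.

```python
import math

def kl_divergence(p, q):
    """D(p || q) in nats for two discrete distributions on the same outcomes."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue              # the term 0 * ln(0 / q) is taken as 0
        if qi == 0:
            return math.inf       # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total

# Made-up example distributions over three outcomes.
p = [0.5, 0.4, 0.1]
q = [0.8, 0.1, 0.1]
forward = kl_divergence(p, q)     # log-ratio averaged under p
reverse = kl_divergence(q, p)     # log-ratio averaged under q
```

Here `forward` and `reverse` come out different because the middle outcome, where the two distributions disagree most, gets weight 0.4 in one direction and only 0.1 in the other.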
>>16391946yes but why is it like this, i want to get an intuitive, preferably geometric sense, are there other probability distances?
>>16391811An alpha is a weak signal in the market that only you see and can calculate. Like renaissance tech.
>>16392037
> Yes, but why is it like this?
The simple answer is that KL divergence comes from Shannon entropy. The KL divergence gives you a "measure of uncertainty" between samples of the two distributions being considered. If you were able to asymptotically sample X1 but the true distribution is X0, the KL divergence tells you the "relative entropy" that will always be present due to the distribution differences.
> Are there other probability distances?
Yes, there are many. KL divergence is mostly used because it shows up quite often in hypothesis testing/classification problems using log-ratio based test statistics. There are plenty of other measures of probabilistic distance/divergence. The variational distance between two distributions tells you the maximum absolute difference in the probabilities the two distributions assign to the same event. The Mahalanobis distance gives you a measure of the distance between a point (e.g., a test statistic on a sample) and a distribution curve based on the first two moments. The Wasserstein distance between two distributions gives you a "measure of the cost of transforming from one distribution to the other" (provided they are defined on the same metric space).
There's a bunch more out there. If you want to understand KL divergence though, you probably should pick up an information theory textbook like Cover and Thomas or MacKay's book (both of which are great).
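A toy sketch of two of those alternatives, for discrete distributions on equally spaced points. The helper names and the point-mass example are my own. The variational distance only sees how much mass disagrees, while the 1D Wasserstein distance also sees how far that mass has to move.

```python
def total_variation(p, q):
    """Largest difference in probability assigned to any event (discrete case)."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

def wasserstein_1d(p, q, spacing=1.0):
    """W1 on equally spaced points: sum of |CDF_p - CDF_q| times the spacing."""
    cdf_gap, cost = 0.0, 0.0
    for pi, qi in zip(p, q):
        cdf_gap += pi - qi
        cost += abs(cdf_gap) * spacing
    return cost

# Point masses at positions 0, 1 and 2 (made-up example).
at_0, at_1, at_2 = [1, 0, 0], [0, 1, 0], [0, 0, 1]
tv_near, tv_far = total_variation(at_0, at_1), total_variation(at_0, at_2)
w_near, w_far = wasserstein_1d(at_0, at_1), wasserstein_1d(at_0, at_2)
```

Both pairs are maximally far apart in variational distance, but the Wasserstein cost doubles when the mass has to travel twice as far. KL divergence would be infinite for either pair, since the supports do not overlap.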
>>16392037you may want to read a comprehensive book on information geometry. it will tell you all you are asking about.
Let's say I have a list of events and associated probabilities[math] \{X_1, X_2, \cdots, X_n\}, P(X_i) = \pi_i [/math]How do I compute the expected number of times that I need to sample this distribution to experience every event?
>>16392037Better to get a numerical intuition.
>>16392960It is a function of the lowest probability.
>>16392960 >>16394133 It's actually [math]\int_{0}^{\infty} \left( 1 - \prod_{i} \left( 1 - e^{ -\pi_i t} \right) \right) \mathrm{d}t[/math], which is of the order of [math]n \log \log n[/math] when the events are powerlaw-distributed.
>>16394166I have member x, y, z. x and y have 0.5 probability to occur, and z has zero.How many times would I have to sample this distribution to pull z?
>>16394573 If there is an event with zero probability, then the product contains a factor of [math]1-e^{-0 \cdot t} = 1 - 1 = 0[/math], so the whole thing multiplies to 0. This means that the integral simplifies to [math]\int_0^{\infty} (1 - 0) \, \mathrm{d}t[/math], which can be visualized as the area under the straight-line function y=1 from 0 to infinity. This is a rectangle with height 1 and infinite length, so its area evaluates to infinity, which is intuitively the correct answer: if z occurs with zero probability then you will never pull it even if you continue sampling forever.
>>16392127 the wasserstein distance seems like it's the same as the kl divergence, i mean the cost of transforming one distribution to another must take into account the uncertainty between them?
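The integral can be evaluated exactly by expanding the product (inclusion-exclusion): each nonempty subset S of events contributes (-1)^(|S|+1) divided by the total probability of S. A sketch, with a made-up function name, that is only practical for a handful of events because it loops over every subset:

```python
import math
from itertools import combinations

def expected_draws_to_see_all(probs):
    """Expected number of samples until every event has occurred at least once."""
    if any(p == 0 for p in probs):
        return math.inf            # a zero-probability event is never drawn
    total = 0.0
    for size in range(1, len(probs) + 1):
        sign = 1.0 if size % 2 == 1 else -1.0
        for subset in combinations(probs, size):
            # Integral of exp(-(sum of subset) * t) over t from 0 to infinity.
            total += sign / sum(subset)
    return total
```

For two equally likely events this gives 2 + 2 - 1 = 3 draws, and for the x, y, z example with probabilities 0.5, 0.5 and 0 it returns infinity, matching the rectangle argument above.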
bump, any of you /math/fags want to form a group to study graduate and higher level probability and statistics? I am doing the work on my own but it's kind of lonely sitting in the room by myself all day.
>>16392037K-L Divergence isn't even a distance because it's not symmetric
>>16392037>i want to get an intuitive, preferably geometric sense, are there other probability distances?Wasserstein Metric, L1, L2
>>16396377What are you reading?
>>16391811 The idea behind p-values was to let the reader decide if the evidence is strong enough. But people obsess over binary decisions and misinterpret p-values, so we are where we are. The topic of selecting a threshold is irrelevant, except maybe for cost-based calculations.
>>16396377The thread is basically it broseph. Just write up long theses in the thread or worked out problems. People will chime in, believe me.
>>16395259 The Wasserstein distance is a distance (meaning it is by definition symmetric), whereas the KL divergence is not. The Wasserstein distance is also defined on the two probability measures on the metric space itself, whereas the KL divergence is the mean of the log-ratio of the distributions, not the p-distance.
Where can I find good information on mathdriven hedgefunds like Renaissance technologies that isn't cookie cutter pl*bbitor shit?
>>16397599the darkweb
>>16396377Yes. What book?
>>16398420I think you should try this book in your selfstudy group. It's a half-decent intro text, not too heavy on theory but good enough to help you through the foundations. Plus it has a good solutions manual so you don't get stuck as easily.
>>16396377I'm in. I took Math Stats 18 years ago and turns out to be one of the very few courses I need to make frequent use of in my job
>>16398692What do you work with?
>>16398693natural language processing
>>16398984Is it any fun or is it autistic af?
>>16399089 It's fun if you know what you're doing, the problem is 90% of people constantly asking when you're going to shove neural networks into your project at random, as if 90% of problems can't be solved with regex and string similarity scores.
>>16399189So basically midwits who think they are smarter than you and just want to showboat in the company instead of making the code efficient? Sounds like hell.
>>16399384>So basically midwits who think they are smarter than you and just want to showboat in the company instead of making the code efficient? Sounds like hell.Yes and yes it is.
>>16399384 >>16399789 However, what I'll add is that if you're attracted to high-value-and-difficult problems, then NLP work is the place to be. The thing is, text data is a sequence of symbols, which gives it two properties:
1. Difficult to decode: you need to think beyond linear models to get any meaning out of it.
2. Extremely dense in terms of information content. Contrast this with image, audio, and video content, which has tons of redundancy built into it.
This means that with a well-thought-out algorithm design you can extract an amazing amount of useful information from a kilobyte of text or less. This also means that if you're relying on redundancy to "throw random statistical models at the problem and hope all answers cancel out" then you're going to be in a world of hurt. So, for someone who takes NLP seriously it can be a very motivating area with tons of frontiers and novel problems. However, when you're stuck with midwits using two-dollar words to impress your boss who is interested in hype trains, then it can be a world of hurt.
>>16399799 Sounds like you should pivot to cryptography, it's basically the same type of analysis and (I expect) the midwit population is far smaller.
>>16399900
>Sounds like you should pivot to cryptography, it's basically the same type of analysis and (I expect) the midwit popularion is far smaller.
I've thought of this, I did a math phd so I'm usually able to swing at least an entry level position into diverse fields. However, I'm honestly more worried about all the "cybersecurity" diploma mill grads than the "datascience" diploma mill grads. Also I'm hesitant to hang out with people who thought they were unironic 1337 h4X0r5 into their early 20s.
>>16399799 I will give you a free idea that will utilize your specialization. One statistical problem I encountered while working was with computer vision (I think it's called?), where a camera basically takes a photo of a part and then scans it for defects against a library of "defects". This system was very clumsy and very cumbersome, and what I did was take the data from the shitty input and try to make some sense of it.
What is needed is something that can look at a part, be it a bolt, a nut, or even an assembled part like a car door or some simple electronics in a computer or phone, and that can intelligently feed the system and the operator with good data that doesn't always have to be overridden by the operator who packs this for the assembly plant further down the value chain.
TL;DR Go into computer vision for industrial parts. I have not seen any good work done in this field. Maybe my info is old, but at the time I worked there it was bugging the hell out of me.
>>16399942Thanks for the idea, that sounds pretty cool. Is there some publicly available example of troublesome input data you're referring to that I could take a look at?
>>16400083I think the best place to start is just with sampling the cookie cutter computer vision data that is available, then start making your own data (e.g. downloading a bunch of pictures and then let your "fast sorter" find the odd pieces of info). If you are smart you probably already understand this is a kind of sorting algo on steroids with neural nets. Honestly the best business case I have found for AI is computer vision on parts made for large scale industrial equipment, because the companies have money and if you can automate the QC, it's BIG BUXX for them.
Hey guys, I'm an undergrad right now majoring in Statistics, but I really loathe coding/computer science. I've tried many times before and I'm even taking classes for it, but I just can't convince myself to like it. Even when I was young and used coding to make games, it was always a means to an end and it was never something I found 'fun'. Is there any hope in being employable as a Statistics major if I don't want to code or should I find something else to study? I'm also taking some English classes on the side so I could possibly do that.
>>16400332coding is just a means to an end. only do it when you have to
>>16400332I sincerely and utterly fucking hate coding. But I had to do it to become the king of R, Python and SAS. Just do it figgit.I will hate it until I die. But I love the insights it gives me.
>>16400497What languages do you code in?
>>16401536Now only Python and Bash.If you want my full list though, I've been programming for almost 30 years. Started with QBasic in 1995, then Java in 1998, Javascript in 1999, Visual Basic 6 in 2001, C++ in 2002, then didn't learn any new languages until 2012 when I picked up Matlab, then 2014 Python, 2016 Bash, 2017 Prolog, 2020 R. To be honest though, C++ I haven't really practiced since 2002. It was a weird year because I was in AP Computer Science A and that year they were transitioning from the course being in C++ to Java, and so their solution was that the first half of the year was taught in C++ and the second half of the year was taught in Java. Most of the other languages I used though (except VB6) were self-taught.
>>16401817Do you have a BSc or you just got a jerb and kept being gainfully employed?
>>16401854PhD in Math
>>16402065 Good job brother. I don't have a PhD in math, but I aim to learn a lot of languages as I get older in the game.
Anyone know anything about heavy or fat tails in financial statistics?
bum p
>>16404614Yeah it's a good thread. I like how it has waves of very serious convo and not so much insults thrown around on the rest of 4chan.
>>16396377 If anyone wants to start some sort of group, we can start by working on the problems in assignment 1 of the MIT OpenCourseWare course:
18.S096 | Fall 2015 | Undergraduate Topics In Mathematics Of Data Science
>>16404775 Honestly I think it's a good course, but I am thinking that maybe some collaborative coding project with data would be better than just hammering out the theory. The theory gets more important later on when you actually cross-examine your models. But in the beginning I would say it's slightly less important because you have no idea what to use the theory for. With that said, anything you do is better than nothing. This is worthy of your time, but the opportunity cost is slightly higher than if you became good at other parts of "data science".
>>16404780Well I’m thinking that it would be easier to get a group off the ground if it is based around problems. It can be completely contained in a thread so that it’s easier for people to join.
>>16338504It is not 50/50 after changing doors, it is 2/3... assuming you are using the correct original rules.
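A quick simulation sketch of the standard rules (the host knows where the car is and always opens a goat door that is not your pick). The function name and trial count are arbitrary choices for illustration.

```python
import random

def monty_hall_win_rate(switch, trials=100_000, seed=0):
    """Estimate the win probability when always switching or always staying."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randrange(3)
        pick = rng.randrange(3)
        # Host opens a door that is neither the contestant's pick nor the car.
        opened = rng.choice([d for d in range(3) if d != pick and d != car])
        if switch:
            # Move to the one remaining closed door.
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == car)
    return wins / trials
```

`monty_hall_win_rate(True)` should land near 2/3 and `monty_hall_win_rate(False)` near 1/3. If the host instead opened a door at random, the numbers would change, which is why the "correct original rules" caveat matters.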
>>16404789 Go for it. But don't do the maths before you can code, the practical skills are worth their weight in gold. Having the right theoretical background is literally "whatever" status. It pains me to say this, but it is how the world works. I am not going to stand in your way, just saying there could be an easier way, that's all.
>>16404634I agree
>>16400332I was like you, hated coding at first as a stats undergrad. Same as you, took several classes on it and still didn't like it. Then I got a job as a data analyst where I've had to code pretty much all day everyday for years. At a certain point you become so good at it that it's as easy to write code as to speak a sentence. That's when it becomes enjoyable (maybe a bit earlier than that actually, at least when you get good enough to do things efficiently). Not from the act of typing on a keyboard, but from the satisfaction you get when you solve a problem, when you finally realize how to represent the problem as code or when you finally find that bug that's been driving you insane. And then you realize it's really not very different from math at all, it's just using logic to problem solve. At least that was my experience, so I'd say don't give up and keep at it, keep practicing. It really is like learning a new language, no one "enjoys" learning a new language (especially the first one outside of their native tongue) but once you become good enough to hold a conversation it becomes really cool and satisfying. You have to give your brain time to adjust to thinking in a totally different manner than it's used to. To answer your question on employability, with just a bachelor's in statistics you won't be able to get a job without coding skills. Really the only way you can get away with not coding in stats is to go for a PhD and do research in mathematical statistics, but even then you'll be at a disadvantage compared to your peers who have that extra tool in their kit. And going the academia route is setting yourself up for a lifetime of suffering (unless you're ok with never becoming full professor and earning dogshit pay forever). You could also look into becoming an actuary, from what I hear they still code but less so than other stats-adjacent careers like data analysts/scientists. But I've heard that's changing quickly, more positions look for coding.
How do I put ranges into Excel and have it make sense? For example, 0-3, 3-6, 6-9, etc.
>>16406661Same Anon here, post was too long so continuing:My advice is to just bite the bullet and keep practicing until you start to become "fluent" in a single language (Python or R). You don't need to be at this level when you graduate and find your first job. I pretty much just had those couple of classes under my belt when I graduated as my coding experience, you'll get much better very quickly using it daily on the job. Very good chance that you'll seriously regret it later in life if you genuinely enjoy statistics but quit it just because you didn't want to push through this initial discomfort period with learning coding. Especially if you choose English instead. ONLY do that if you are absolutely certain that the only thing you want to do in your career is teach English, because that's all you can do with that.
no matter how I change the parameters the probability of me getting a gf remains zero
>>16406664Computers and continuous intervals don't tend to get along very well.
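The usual way around the shared endpoints is to pick a convention, for example half-open bins where 3 belongs to 3-6 and not to 0-3. This is not Excel, just a Python sketch of that convention with the edges from the question. The helper name `bin_label` is made up.

```python
from bisect import bisect_right

edges = [0, 3, 6, 9]   # bins are [0, 3), [3, 6), [6, 9)

def bin_label(x, edges=edges):
    """Return the label of the half-open bin containing x, or None if out of range."""
    if x < edges[0] or x >= edges[-1]:
        return None
    i = bisect_right(edges, x) - 1   # index of the last edge that is <= x
    return f"{edges[i]}-{edges[i + 1]}"
```

With this rule `bin_label(3)` is `"3-6"` and `bin_label(2.999)` is `"0-3"`, so every value lands in exactly one range. Whatever tool is used, the key step is deciding which side of each boundary is inclusive.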
>>16406686Truly an impossible event. I feel for you brother.
>>16406664What is the end number in the bins you are trying to create?
>>16406661>>16406666I guess so, everyone is saying that the best thing to do is to keep coding. Part of why I'm not interested in coding right now is because my classes are teaching C++ which I don't think is as important for me to learn as opposed to a language like R for data analysis. I probably won't enroll in any more programming classes this year (though I'm sure some statistics classes will teach R) but I'll try to start coding on my own. I'm still unsure if I want to have a job that's mostly coding, and I'm wondering how interested I am in math in the first place, but I guess that's something I'll figure out as I keep going. Thanks
>>16407533Why do you hate coding?
>>16407623Programming (and math/numbers, for that matter) is something I've never had the intuition for. Though I said I was a statistics major in an earlier post, I look at the math/programming textbooks I will have to read and my brain instantly shuts down. There are a lot of great things that can be done with coding and it's a useful skill to have, but it doesn't spark any joy in me, even when working on hobby projects like gamedev or personal websites. I've always been more interested in stories and words, but I'm not a prolific writer or enough of a voracious reader for writing to be a viable career. I just want to have a fulfilling job after college, and I like the idea of working on text analysis and semantic search, which is why I'm studying statistics right now. I also like poker and probability.I've come to terms with the fact that I'll probably have to do some programming on the job (which is fine, R and Python are much more straightforward than C++), but I don't think a job that involves programming will ever be my "dream job."
>>16407794Do some projects in semantic analysis and see what you think?
>>16407814Yeah that's what I'll try out next, hopefully it'll all click for me then.
>>16407868There is nothing wrong with just using statistics for linguistic purposes or language applications. It's not only worthwhile but I would say that is where the current strength of the field is. We have come a very long way in this area and it is very interesting. You could work with language your entire life in linguistic statistics and barely look at numbers.
>>16406686YAGMI BROTHER
>>16320326 Hello, /sci/, haven't been here in a very long time. I just started working in the data analysis field. I really don't have enough formal theoretical knowledge in this field, even though I have done it extensively in practice and I think I have a good intuitive understanding of it. Can anyone link a hard statistics course that goes into pure maths and is not focused on just programming? Anything I find is for programming, and I feel like by studying that you won't understand statistics as you should be understanding it. Thanks in advance
>data should show a decreasing trend based on a free variable
>it actually increases
I hate data science so much bros
>>16409141What's your math background and what do you mean by hard statistics? A really common starting point is "All of Statistics" by Wasserman, but you should probably have at least a basic understanding of probability theory first. The coverage of the probability theory part of most introductory stats books is fairly shallow and not understanding functions of random variables may lead to problems down the line.
How many anons use these threads for completing class assignments?
>>16411157no point, chatGPT does assignments better than these threads.