/sci/ - Science & Math




File: programming.jpg (51 KB, 888x499)
previous thread >>16113091

This is one of the board's newest generals. Fairly high activity due to edgelords trying to be funny but instead spreading facts about the absolute state of our world.

Intro stats is fairly easy; with intermediate stats comes the programming, and we have already had several battles in the thread about which language is the best. Nobody uses SAS funnily enough, SPSS has had some people trying to joust the edgelords who are into R and C++, while the Stata children are silent as usual.

Come one, come all. State your dumb questions, /pol/tardy or not. Some fairly useful and funny math is showcased in this thread.
>>
>>16174616
What's the most /pol/tard question ever stated on /sci/?
>>
File: benford's law.jpg (62 KB, 1644x1018)
How does Benford's law work? Could it be used to prove that the results of a supposedly democratic election were produced fraudulently?
>>
>>16174654
Yes, it has been used before, even in academic circles.
>>
Which topic would be better to learn under the guidance of a professor: Bayesian statistics or stochastic calculus?
These are complementary classes I can take but I don't want to take both of them because I have other classes in mind too (Intro to PDE, group theory, data analysis and machine learning).
I'm doing my degree in applied math if that matters.
>>
>>16174741
Depending on who teaches it, Bayesian Statistics might either be very measure heavy (i.e., the way Bayesian stats are done in Lehmann's Theory of Point Estimation) or it might be very calc and linear algebra heavy (if it's more of an Engineering statistics approach to Bayesianism).

Stochastic calc, on the other hand, is guaranteed to be very measure heavy but may be very helpful for time series statistics or for stochastic optimization/stochastic control theory.
>>
>>16174785
I see, thanks for the insight. I'll take Stochastic calc and study Bayesian statistics on my own, any recommended book?
>>
>>16175365
It depends on what you want to learn and where your interests lie.

A lot of stats people like Gelman's Bayesian Data Analysis, but I'm personally not a huge fan of that book. In an effort to be accessible and practical he often sacrificed a lot of the more mathematically involved parts of Bayesian estimation. With that said, if you already are comfortable with the basics of Bayesian inference, it's a good resource.

For a more "mathy" introduction to Bayesian statistics I really like The Bayesian Choice by C. Robert, but it's definitely far more theoretical and less practical.

In terms of practical Bayesian estimation, a lot of the best books fall on the "engineering" side of Bayesian statistics. Machine Learning: A Bayesian and Optimization Perspective by Theodoridis is good and comprehensive.

For Kalman filtering (which is a Bayesian approach to time series estimation, as unintuitive as that may seem) there are a lot of great books. Bar-Shalom's Estimation with Applications to Tracking is great. Särkkä's Bayesian Filtering and Smoothing is really good. If you want a more "mathy" approach to the topic, Elliott's Measure Theory and Filtering is great. For Bayesian decision theory, Berger's Statistical Decision Theory and Bayesian Analysis is great, and Duda, Hart and Stork's Pattern Classification also has a pretty good section on Bayesian decision theory.
>>
>>16174654
Benford's Law generally requires that the data be more or less log-uniform (meaning that while the samples inside a single order of magnitude may not be uniform, the order of magnitude of a particular sample is roughly uniform).

This is a reasonable approximation for voting in systems where you have some districts with 10-90 people registered, some with 100-900, some with 1000-9000 etc., so the numbers are spread across many orders of magnitude and a district chosen at random is roughly as likely to have 10 registered voters as 10000 registered voters.

The problem then comes in when you standardize district sizes. If almost all of the districts have roughly the same population, then Benford's law starts to be far less reliable, as you will see a "clump" in the middle which biases the first digit towards the center of the distribution for districts of that size.

I don't know exactly where the data from that infographic was sourced from, but given that Trump tended to do well in low population/rural regions, it wouldn't surprise me if his voting district distributions had a much more "log-uniform" shape as compared to Biden who was far more popular in suburbs and cities with standardized district sizes.

Hope that helps.
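If you want to see the effect for yourself, here's a rough sketch in Python (stdlib only). The district sizes are made up: the "log-uniform" set draws the exponent uniformly from 1 to 5, and the "standardized" set is a Gaussian clump around 5000. Neither comes from the infographic or any real election data, and `max_abs_gap` is just a helper I picked to measure the distance from Benford's P(d) = log10(1 + 1/d).

```python
import math
import random
from collections import Counter

def benford_expected():
    """Benford's first-digit probabilities: P(d) = log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit_freqs(values):
    """Empirical frequency of each leading digit 1..9."""
    counts = Counter(int(str(v)[0]) for v in values if v > 0)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}

def max_abs_gap(freqs, expected):
    """Largest absolute difference between observed and Benford frequencies."""
    return max(abs(freqs[d] - expected[d]) for d in range(1, 10))

rng = random.Random(0)
n = 20000

# ASSUMPTION: hypothetical district sizes spread evenly over orders of magnitude.
log_uniform = [int(10 ** rng.uniform(1, 5)) for _ in range(n)]

# ASSUMPTION: hypothetical standardized districts, all close to 5000 voters.
standardized = [max(1, int(rng.gauss(5000, 500))) for _ in range(n)]

expected = benford_expected()
gap_log_uniform = max_abs_gap(first_digit_freqs(log_uniform), expected)
gap_standardized = max_abs_gap(first_digit_freqs(standardized), expected)

print("log-uniform gap: ", gap_log_uniform)
print("standardized gap:", gap_standardized)
```

The log-uniform sample should track Benford closely, while the clumped sample piles almost everything onto the digits 4 and 5, so a Benford "failure" there says nothing about fraud.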
>>
File: UE6EZO.png (510 KB, 1080x1190)
Why isn't there a relationship between political rally crowd size and election turnout? It seems like it wouldn't be too hard to construct a formula that gives somewhat useful estimates of election results based on political rally crowd sizes, could be a useful project for someone to do.
Ppl say there's no more low hanging fruit, but here it is right here, just look up some old election results and their corresponding political rally crowd sizes and see if there's a correlation, it's easy enough that a high school kid could do it and yet it hasn't been done
>>
>>16175935
>Why isn't there a relationship between political rally crowd size and election turnout?
Because a candidate can generate a great deal of enthusiasm in a certain niche, guaranteeing a huge turn out in rallies, but be abhorred by the remainder of the general public, guaranteeing they fail in general elections.

Taylor Swift would generate huge rallies, but likely fail miserably in a general election.
>>
>>16175990
You mean there really are no priors like turnout before an election? I don't think people who go to rallies are fencesitters really.
>>
>>16176861
>I don't think people who go to rallies are fencesitters really.
They are not, and this is precisely the reason why such analysis is unreliable. The legendary blunder of the 1936 US presidential election opinion poll is still being recounted in every statistics class to this day.
https://www.pivotalresearch.ca/market-research-mistakes-poll-that-changed-polling/
>>
Does anyone have a good recommendation for a statistical optimization textbook? I'm aware of the ML stuff, but I was hoping for something more directly optimization theory oriented but for stochastic objective functions.
>>
File: 4am.jpg (208 KB, 1171x933)
>>16174654
>>
File: 1698481707878638.png (70 KB, 1160x476)
Hello /psg/, here is a tricky problem for you.
>>
File: lying with statistics.jpg (318 KB, 1049x1358)
>>
>>16178795
I would say that's lying through doctoring the data. Anyone who is serious about knowing economics, both macro/micro and accounting, knows that a lot of the official data is straight up garbage.
>>
>>16178795
no it's not
https://x.com/BLS_gov/status/1790480087797682298
>>
>>16178525
>tricky
Isn't this just a bog-standard Markov chain problem where the urn state evolves from (b,r) to (b-1,r) with probability (b/(b+r))^2, else (b,r-1)? I don't see where any tricks are involved, unless there's some kind of combinatorial identity that simplifies the recurrence.
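For what it's worth, the recurrence is easy to brute-force. A minimal sketch, assuming exactly the transition rule above and nothing else. The actual question is in the image and isn't reproduced here, so the quantity computed (the chance that blue runs out before red) is only my pick for illustration, and `p_blue_out_first` is a made-up name.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def p_blue_out_first(b, r):
    """Probability that blue hits zero before red, starting from state (b, r).

    ASSUMPTION: (b, r) -> (b-1, r) with probability (b/(b+r))^2,
    otherwise (b, r-1), as stated in the post above.
    """
    if b == 0:
        return Fraction(1)   # blue is already exhausted
    if r == 0:
        return Fraction(0)   # red ran out first
    p_remove_blue = Fraction(b, b + r) ** 2
    return (p_remove_blue * p_blue_out_first(b - 1, r)
            + (1 - p_remove_blue) * p_blue_out_first(b, r - 1))

# Small table of exact values to eyeball for a pattern.
for b in range(1, 4):
    print([str(p_blue_out_first(b, r)) for r in range(1, 4)])
```

Exact fractions make it easier to spot a closed form, if there is one, than floats would.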
>>
>>16174616
probability is fun.
statistics fucking sucks, it's all handwavy and shit
>>
>>16179533
Reality is handwavy. It's also so much fucking programming.
>>
>>16179385
It is. Solving it's difficult, I think.
>>
howto into stochastic PDEs?
>>
>>
>>16179533
If you really want rigorous statistics, there are measure theoretic formulations of statistics. The classic books people point to for this are Lehmann's Testing Statistical Hypotheses and Theory of Point Estimation (both of which are classics for a reason but also have a frustrating amount of typos because nobody had the courage to edit the God of statistics). Another good one is Jun Shao's Mathematical Statistics (which is a bit less comprehensive than the two Lehmann books but definitely more polished).

In general the problem with statistics comes from one of 2 places. Either there's no meaningful probability model for a quantity of interest, so we just impose one and hope it's close enough, or there is a meaningful probability model but we don't have a way of separating out the parameters we wish to infer. These cases happen often enough that approximate inference or numerical optimization are required, which then gets rid of a lot of the wonderful performance guarantees you can get when things are simple.

As an example, let's say your FIM (Fisher information matrix) for a quantity of interest is strictly positive definite. This means that there is enough information to estimate the parameter(s) of interest, but this FIM is an expected Hessian. The observed Hessian might not be positive definite in general, and the negative log-likelihood may have multiple local minima (meaning that you have risk of non-convergence or multiple possible points of convergence for your numerical algorithm to estimate your parameters of interest). This is a problem that shows up in statistics that really is an issue coming from convex analysis.
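A toy illustration of that gap, as a sketch rather than anything definitive: the Cauchy location model with unit scale. Its expected Fisher information is 1/2 per observation, which is strictly positive, yet the observed information can go negative and the negative log-likelihood can have more than one local minimum. The two data points are invented for the demo, and the grid search is just a crude way to count minima.

```python
import math

# Known result for the Cauchy location model with scale 1.
EXPECTED_INFO_PER_OBS = 0.5

def neg_log_lik(theta, data):
    """Negative log-likelihood, up to an additive constant."""
    return sum(math.log(1.0 + (x - theta) ** 2) for x in data)

def observed_information(theta, data):
    """Second derivative of the negative log-likelihood at theta."""
    total = 0.0
    for x in data:
        u = x - theta
        total += 2.0 * (1.0 - u * u) / (1.0 + u * u) ** 2
    return total

# ASSUMPTION: hypothetical sample of two widely separated observations.
data = [-5.0, 5.0]

# Crude grid scan on [-10, 10] to locate strict local minima.
grid = [i / 100.0 for i in range(-1000, 1001)]
values = [neg_log_lik(t, data) for t in grid]
minima = [grid[i] for i in range(1, len(grid) - 1)
          if values[i] < values[i - 1] and values[i] < values[i + 1]]

print("expected information:", EXPECTED_INFO_PER_OBS * len(data))
print("observed information at 0:", observed_information(0.0, data))
print("local minima near:", minima)
```

A Newton-type optimizer started near 0 sees negative curvature there, and which minimum it ends up in depends on the starting point.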
>>
>>16180742
I want to know as well. This seems cool.
>>
>>16180742
>>16180803
Do you have a background in SDE's/stochastic calc already and you're just looking for the PDE part of SDE's?

If not, your answer is probably starting with SDE's and standard Itô calculus.
>>
>>16180808
>Do you have a background in SDE's/stochastic calc already and you're just looking for the PDE part of SDE's?

this
>>
>>16180826
Springer's UTX series has a book on Stochastic PDE's that is pretty easily accessed via PDF. I've only actually gone through Ch. 4 and Ch. 5 because the rest seemed pretty standard SDE stuff but it's probably a step in the right direction (and might be free for you to download).

The 2021 edition of Kallenberg also has a chapter on PDE's if Kallenberg is your speed.
>>
>>16180849
Thanks for the Kallenberg tip. It seems very cool. I will have to check out the other book soon.
>>
>>16180849
Name of the UTX book?
>>
What are some good books for learning business forecasting using statistics / linear regression and the likes? I’m not interested in stock forecasting
>>
>>16182031
What are we talking about, like continuous time series, operations research or just stuff like net sales, COGS etc?
>>
File: 1706648171758.png (11 KB, 155x325)
Is this statistically significant
>>
>>16180808
I don't have a background in SDEs yet, so any rec would be appreciated. I'm mostly interested in studying stochastics in the field of differential geometry, cause that is roughly my area of work. I've heard talks about the Wasserstein metric, but that is kind of too specialized for what I want to learn. I'm roughly interested in studying "probabilistic" (pseudo-)Riemannian metrics and their induced curvature tensors, geodesics, etc. with an emphasis on stability. But I'm not even sure if people study this, or if studying it makes sense. Just feels like it might be interesting.
>>
>>16181824
Stochastic Partial Differential Equations: An Introduction by Liu and Röckner.

It's mad alright, but about as good as I've found for a textbook rather than just reading 10 million separate papers.

>>16182093
Honestly your best bet is just a simple stochastic processes book which then goes into stochastic calc. The classic one everyone recommends is Springer GTM 113.

I really don't have much differential geometry experience beyond self-studying Lee's Smooth Manifolds book almost a decade ago, so I really don't know how much there is in terms of relating the two fields.
>>
>>16182075
show me the regression tables?
>>
>>16182201
what tables?
>>
>>16182816
the p-tables son
>>
>>16182118
>Lee's Smooth Manifolds book
A pretty bad book imo, only good if you already studied DG and are e.g. preparing to give a course and need to refresh things. This book motivates hardly anything. If you ever want to study DG again, I would recommend O'Neill's "Semi-Riemannian Geometry With Applications to Relativity", often overlooked but excellent book.

I'll have a look at the gtm book, thanks!
>>
>>16183462
I'll take a look at the O'Neill book. My research is definitely more on the "applied" side of things, but Lee's book gets cited a lot for differential geometric approaches to non-uniform sampling, so maybe I'll give it a shot.
>>
>>16178525
Interesting question
>>
>>16174616
https://www.youtube.com/watch?v=T3ldcRYadR4
>>
>>16182118
I'll try to get this book later, good stuff.
>>
File: statistics.png (781 KB, 680x672)


