/sci/ - /psg/ probability and statistics general - Science & Math

File: OU.png (15 KB, 368x248)

/psg/ probability and statistics general Anonymous 05/22/24(Wed)05:05:35 No.16187525

previous thread >>16174616

This is one of the board's newest generals. Activity is fairly high, mostly thanks to edgelords who try to be funny but end up spreading facts about the absolute state of our world.

Intro stats is fairly easy. Intermediate stats is where the programming comes in, and we have already had several battles in the thread over which language is best. Funnily enough, nobody uses SAS; SPSS has had some people trying to joust the edgelords who are into R and C++, while the Stata children are silent as usual.

Come one, come all. State your dumb questions, /pol/tardy or not. Some fairly useful and funny math is showcased in this thread.

Anonymous
05/22/24(Wed)05:28:15 No.16187543

Anonymous 05/22/24(Wed)05:28:15 No.16187543

Stats people use C++? Since when?

Anonymous
05/22/24(Wed)16:59:01 No.16188514

Anonymous 05/22/24(Wed)16:59:01 No.16188514

>>16187525
the previous thread 404'd after a small handful of replies because it wasn't launched with a racial crime statistics pic, and this thread has condemned itself to a similar fate.
lrn2/psg/ fagit

Anonymous
05/22/24(Wed)18:04:10 No.16188592

Anonymous 05/22/24(Wed)18:04:10 No.16188592

>>16188514
Is that so?

Anonymous
05/23/24(Thu)14:42:20 No.16189868

Anonymous 05/23/24(Thu)14:42:20 No.16189868

>>16189863
If you're that desperate for attention then you should have stuck to the thread guidelines outlined in >>16188514

>>16188592
Yes, as demonstrated by the fact that you had to bump your garbage thread that nobody wants to see off the ass end of page 10.

Anonymous
05/23/24(Thu)17:32:24 No.16190158

Anonymous 05/23/24(Thu)17:32:24 No.16190158

File: amari-info-geo-compressed.pdf (3.29 MB, PDF)

>>16187525

Any anons working on information geometry?

Anonymous
05/24/24(Fri)05:56:17 No.16190885

Anonymous 05/24/24(Fri)05:56:17 No.16190885

>>16190158
It would be interesting to hear a bit about the applications of this.

Anonymous
05/24/24(Fri)06:05:51 No.16190892

Anonymous 05/24/24(Fri)06:05:51 No.16190892

Can anyone redpill me on Poisson statistics?

Anonymous
05/24/24(Fri)06:08:52 No.16190893

Anonymous 05/24/24(Fri)06:08:52 No.16190893

>>16187525
Biggest lie in all of statistics is that events are independent. They're not. In casinos it's common to see streaks of 10 in roulette. If the next event is independent you'd expect 50% of the times 10 streaks are observed to continue to 11. That's not what happens.
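
For reference, here is a minimal sketch of what the independence assumption itself predicts for streaks. It assumes a simplified wheel with only red and black at 50/50 (no green zero) and uses Python's pseudo-random generator; the function name and parameters are made up for illustration. It says nothing about data from any actual casino.

```python
import random

def streak_continuation_rate(n_spins=1_000_000, streak_len=10, seed=0):
    """Simulate independent fair red/black spins and return
    (number of runs that reached streak_len, share of them that reached streak_len + 1)."""
    rng = random.Random(seed)
    prev = None      # colour of the previous spin
    run = 0          # length of the current run of identical colours
    reached = 0      # runs that hit exactly streak_len
    continued = 0    # of those, runs that went on to streak_len + 1
    for _ in range(n_spins):
        spin = rng.random() < 0.5   # True = red, False = black (no zero: an assumption)
        if spin == prev:
            run += 1
        else:
            prev, run = spin, 1
        if run == streak_len:
            reached += 1
        elif run == streak_len + 1:
            continued += 1
    return reached, continued / reached

reached, rate = streak_continuation_rate()
print(f"{reached} streaks of 10, {rate:.1%} continued to 11")
```

Under these assumptions the continuation share should land close to one half. Comparing that baseline against real spin records would be the way to check the claim.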

Anonymous
05/24/24(Fri)08:17:15 No.16190985

Anonymous 05/24/24(Fri)08:17:15 No.16190985

I’m doing a course on ODE and the Laplace transform is absolutely not motivated lol. Digging through wiki it appears Laplace used a similar method in working with probabilities. Anyone have more info? Any decent introductory books on probabilities, especially ones that motivate Laplace transform?

Anonymous
05/25/24(Sat)03:18:51 No.16192246

Anonymous 05/25/24(Sat)03:18:51 No.16192246

Bumpety

Anonymous
05/25/24(Sat)05:48:27 No.16192401

Anonymous 05/25/24(Sat)05:48:27 No.16192401

>>16190985
I'd say they have a definite place in general stochastic processes. No experience myself, just thinking aloud.

Anonymous
05/25/24(Sat)05:57:42 No.16192404

Anonymous 05/25/24(Sat)05:57:42 No.16192404

>>16187525
What would be a good, complete beginner, book for probability and statistics and one good follow-up book? Preferably with depth, just not so much as to be overwhelming for a beginner.

>>16190158
That sounds fascinating. What is it?

Anonymous
05/25/24(Sat)06:55:42 No.16192443

Anonymous 05/25/24(Sat)06:55:42 No.16192443

>>16187525
What's the difference between gamma and inverse gaussian distributions? I'm doing generalized linear mixed-effects modeling

Anonymous
05/25/24(Sat)13:07:09 No.16192808

Anonymous 05/25/24(Sat)13:07:09 No.16192808

>>16192404
I think 'statistical inference' by casella and berger is a good follow-up. You can find the text and solution manual on libgen (beware that the solution manual actually has mistakes in it).

To get the most out of the text, I think being solid in calculus would help. The examples/problems can be quite rigorous, so it may not be the best starting point.

Anonymous
05/25/24(Sat)13:09:20 No.16192812

Anonymous 05/25/24(Sat)13:09:20 No.16192812

how do I find a consulting job to make extra money as a PhD student? I am doing CS/ML. it's harder to find an internship/studentship at FAANG nowadays btw.

Anonymous
05/25/24(Sat)13:10:21 No.16192813

Anonymous 05/25/24(Sat)13:10:21 No.16192813

>>16192812
fuck, wrong thread

Anonymous
05/25/24(Sat)14:01:25 No.16192878

Anonymous 05/25/24(Sat)14:01:25 No.16192878

>>16192404
First do one variable and then multivariable calculus. Then do a beginning course in probability and stats. Then applied stats. Then a more foundational theoretical course in stats.

Anonymous
05/25/24(Sat)14:03:19 No.16192880

Anonymous 05/25/24(Sat)14:03:19 No.16192880

>>16192812
Honestly a lot of statisticians are consultants as well. I consult sometimes. Get yourself the simplest LLC you can get in your jurisdiction, do some projects that look pretty and put them up on some wordpress shit. Then start cold calling companies within your domain knowledge sphere. My domain knowledge is within accounting and economics, so I consult within those spheres, and neither accountants nor economists are very good at hardcore stats.

Anonymous
05/25/24(Sat)17:54:07 No.16193179

Anonymous 05/25/24(Sat)17:54:07 No.16193179

Anyone got data on heights of men in America by age, race, region, etc., spanning multiple decades?

Anonymous
05/26/24(Sun)03:57:01 No.16193736

Anonymous 05/26/24(Sun)03:57:01 No.16193736

>>16193179
Check https://datausa.io/

Anonymous
05/26/24(Sun)04:15:53 No.16193762

Anonymous 05/26/24(Sun)04:15:53 No.16193762

Any Gibbs sampling chads here?
Given winbugs/openbugs is dead and ancient what's the best route to go down, JAGS or STAN?

Anonymous
05/26/24(Sun)04:16:59 No.16193763

Anonymous 05/26/24(Sun)04:16:59 No.16193763

>>16193762
STAN is nice to use but occasionally a pain in the ass to install because of dependencies

Anonymous
05/26/24(Sun)05:26:03 No.16193813

Anonymous 05/26/24(Sun)05:26:03 No.16193813

>>16193179
https://www.healthdata.org/research-analysis/health-by-location/united-states/county-profiles

Anonymous
05/26/24(Sun)06:23:13 No.16193913

Anonymous 05/26/24(Sun)06:23:13 No.16193913

>>16192880
Good advice, thanks

Anonymous
05/26/24(Sun)06:33:02 No.16193924

Anonymous 05/26/24(Sun)06:33:02 No.16193924

>>16193913
What domain knowledge do you have?

Anonymous
05/26/24(Sun)08:51:31 No.16194101

Anonymous 05/26/24(Sun)08:51:31 No.16194101

>>16192880
Is this a side hustle or your primary income?

Anonymous
05/26/24(Sun)08:58:05 No.16194107

Anonymous 05/26/24(Sun)08:58:05 No.16194107

>>16194101
side but scalable.

Anonymous
05/26/24(Sun)09:22:26 No.16194123

Anonymous 05/26/24(Sun)09:22:26 No.16194123

>>16193762
I will second what >>16193763 said: STAN can be a bastard to install. However, it also has a very active community (https://discourse.mc-stan.org/), so chances are good you can get help/find info if you get errors. If you're using R, there's also a couple interfaces out there that might make STAN a bit easier to use.

Anonymous
05/26/24(Sun)09:45:17 No.16194150

Anonymous 05/26/24(Sun)09:45:17 No.16194150

>>16190985
The Laplace transform of a density function gives you the moment generating function of the random variable. The MGF is very important to certain aspects of statistical signal processing and detection theory (especially large deviations theory and sequential hypothesis testing).
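
Written out, the connection looks roughly like this for a nonnegative random variable. The symbols f, M_X, s, t and λ are notation chosen here, not taken from the post, and note the sign flip in the argument:

```latex
\begin{align*}
\mathcal{L}\{f\}(s) &= \int_0^\infty e^{-sx} f(x)\,dx = \mathbb{E}\!\left[e^{-sX}\right] = M_X(-s),
  \qquad M_X(t) = \mathbb{E}\!\left[e^{tX}\right], \\
\text{e.g. } X \sim \mathrm{Exp}(\lambda):\quad
\mathcal{L}\{f\}(s) &= \int_0^\infty e^{-sx}\,\lambda e^{-\lambda x}\,dx = \frac{\lambda}{\lambda + s},
  \qquad M_X(t) = \frac{\lambda}{\lambda - t} \ \ (t < \lambda).
\end{align*}
```

So the Laplace transform of the density is the MGF evaluated at -s, which is why transform tables from an ODE course carry over to moment calculations.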

Probability, Random Variables and Stochastic Processes by Papoulis is the standard engineering oriented probability book used at either the upper undergrad or beginning of grad school. Has a decent amount of coverage of the relevance of Fourier and Laplace transforms to probability theory.

Another book that's perhaps less introductory, and that deals directly with the relevance of the PSD (so Fourier rather than Laplace), is Bremaud's Fourier Analysis and Stochastic Processes. That one requires a good bit more analysis background to really work with though.

Anonymous
05/26/24(Sun)14:39:19 No.16194520

Anonymous 05/26/24(Sun)14:39:19 No.16194520

I'm a bio student and I would like to master statistics. I have taken some intro statistics for biologists, but it was just a couple of lectures about the normal distribution and doing a t-test.

I would like to develop a solid background in statistics, from basics to more advanced topics. What books or online courses do you recommend?

Anonymous
05/26/24(Sun)15:00:29 No.16194554

Anonymous 05/26/24(Sun)15:00:29 No.16194554

>>16194520
Depends on how lost in the sauce you want to get and also on your math background.

There's basically four "standard texts" in increasing level of difficulty that people recommend for either upper level undergrad or first year grad students that aren't doing measure theoretic probability:
1) Probability and Statistical Inference by Tanis and Hogg
2) All of Statistics by Wasserman
3) Probability and Statistical Inference by Mukhopadhyay
4) Statistical Inference by Casella and Berger

Anonymous
05/26/24(Sun)15:05:09 No.16194562

Anonymous 05/26/24(Sun)15:05:09 No.16194562

>>16194554
>All of Statistics
can vouch for this. it was a good read. really helped me thru my PhD.

Anonymous
05/26/24(Sun)15:08:50 No.16194568

Anonymous 05/26/24(Sun)15:08:50 No.16194568

>>16194562
I'd say of the 4, All of Statistics and Casella and Berger were the most helpful for me. I'm not a statistician though, I'm an engineer. Can't comment on their usefulness for actual stats grad students.

Anonymous
05/26/24(Sun)15:11:30 No.16194569

Anonymous 05/26/24(Sun)15:11:30 No.16194569

>>16194568
I've only read number 2 out of the 4 that were listed. needed it cause I was preparing for an ML interview. got recommended by a friend.

Anonymous
05/26/24(Sun)17:06:05 No.16194721

Anonymous 05/26/24(Sun)17:06:05 No.16194721

>>16187543
Never

Anonymous
05/26/24(Sun)17:42:28 No.16194790

Anonymous 05/26/24(Sun)17:42:28 No.16194790

>>16194569
Oh, I definitely don't recommend reading all 4 of them. They cover basically the same material but at different levels of depth and slightly different emphasis.

At the point that you've gone through one of them, you probably have enough background that you can just jump right into whatever specific statistics topic you actually want to study directly.

Anonymous
05/27/24(Mon)08:52:18 No.16195730

Anonymous 05/27/24(Mon)08:52:18 No.16195730

>>16192808
Thanks.
>>16192878
Not what I asked for, but thanks for trying.

Anonymous
05/27/24(Mon)17:58:37 No.16196612

Anonymous 05/27/24(Mon)17:58:37 No.16196612

>>16193924
Industrial engineering

Anonymous
05/28/24(Tue)02:39:07 No.16197214

Anonymous 05/28/24(Tue)02:39:07 No.16197214

>>16194554
I want to get balls deep

Anonymous
05/28/24(Tue)13:18:08 No.16197904

Anonymous 05/28/24(Tue)13:18:08 No.16197904

Bump

Anonymous
05/28/24(Tue)14:00:50 No.16197968

Anonymous 05/28/24(Tue)14:00:50 No.16197968

File: 764923467892349238.jpg (37 KB, 483x470)

Who here is reading a stats book, any stats book, daily?

Anonymous
05/28/24(Tue)14:06:18 No.16197977

Anonymous 05/28/24(Tue)14:06:18 No.16197977

File: 978-0-387-21718-5.jpg (83 KB, 827x1241)

>>16197214
The deepest you can go is measure theoretic/analysis based statistics. This will give you a lot of ability to tie in tools from more advanced mathematics if you are careful.

Mathematical Statistics by Jun Shao is a pretty good starting point for this, but it assumes you are already fairly comfortable with analysis and measure theoretic probability to a certain degree.

Anonymous
05/28/24(Tue)15:11:37 No.16198089

Anonymous 05/28/24(Tue)15:11:37 No.16198089

How does one read a regression table? How do you determine whether a result is statistically significant? Are p-values (probability of null hypothesis) related to confidence intervals?

Anonymous
05/28/24(Tue)15:14:44 No.16198095

Anonymous 05/28/24(Tue)15:14:44 No.16198095

>>16198089
Give us an example table you would like to have interpreted

Anonymous
05/28/24(Tue)16:13:21 No.16198153

Anonymous 05/28/24(Tue)16:13:21 No.16198153

>>16197968
I am reading the daily racial crime stats.

Anonymous
05/28/24(Tue)16:42:37 No.16198182

Anonymous 05/28/24(Tue)16:42:37 No.16198182

>>16198153
It's good to stay informed. Thoughbeit that does not count.

Anonymous
05/28/24(Tue)17:27:57 No.16198227

Anonymous 05/28/24(Tue)17:27:57 No.16198227

>>16197968
I don't read stats books daily, but I have been spending some time on some intermediate probability theory on a pretty close to daily basis recently.

Anonymous
05/28/24(Tue)17:48:36 No.16198263

Anonymous 05/28/24(Tue)17:48:36 No.16198263

>>16198227
Doing what with it?

Anonymous
05/28/24(Tue)17:49:32 No.16198266

Anonymous 05/28/24(Tue)17:49:32 No.16198266

>>16198263
Reading the book and doing problems. I'm trying to get a better understanding of continuous time Markov chains.

Anonymous
05/28/24(Tue)17:51:59 No.16198270

Anonymous 05/28/24(Tue)17:51:59 No.16198270

>>16198266
Can you post the book?

Anonymous
05/28/24(Tue)18:12:54 No.16198291

Anonymous 05/28/24(Tue)18:12:54 No.16198291

>>16190158
Thank you for the good read.

Anonymous
05/28/24(Tue)18:58:13 No.16198352

Anonymous 05/28/24(Tue)18:58:13 No.16198352

File: 1685919603099301.jpg (111 KB, 854x351)

>>16198095
That one for example

Anonymous
05/29/24(Wed)02:02:30 No.16198744

Anonymous 05/29/24(Wed)02:02:30 No.16198744

File: 978-3-030-40183-2.jpg (104 KB, 827x1254)

>>16198270
Sorry, I thought I had mentioned it in that post. Looking back I didn't.

I'm going through this book right now. Probably on the easier side for measure theoretic probability, but covers a much wider variety of stochastic process topics than the standard recommendations like Durrett, Ash, etc.

Anonymous
05/29/24(Wed)03:14:19 No.16198806

Anonymous 05/29/24(Wed)03:14:19 No.16198806

>>16198352
The first row values are means and the ones in square brackets are confidence intervals (minimum and maximum). If the confidence interval crosses 0, the effect is thought to be negligible. If the CI range does not contain 0, it is thought to be statistically different from 0.

First column is calculated as just log income as a function of exports/area. Second column checks if colonizer effect and ln exports together have an effect. Third column checks if geography controls alter the effect of exports and colonizers.

P-value can be checked from a lookup table or a p-value calculator by taking in the F-stat value and calculating degrees of freedom from number of observations (usually N-1).
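
The link between "the CI does not contain 0" and "p is below the threshold" can be sketched as follows. This is a simplified illustration that assumes a normal approximation (real regression output normally uses a t distribution with the residual degrees of freedom), and the estimates and standard errors are made-up numbers, not values from the posted table.

```python
from statistics import NormalDist

def z_summary(estimate, std_err, level=0.95):
    """Return (confidence interval, two-sided p-value) for H0: coefficient = 0,
    using a normal approximation."""
    z = NormalDist()
    crit = z.inv_cdf(0.5 + level / 2)            # about 1.96 for a 95% interval
    ci = (estimate - crit * std_err, estimate + crit * std_err)
    p = 2 * (1 - z.cdf(abs(estimate / std_err)))
    return ci, p

# Hypothetical coefficients: (estimate, standard error)
for est, se in [(0.5, 0.2), (0.3, 0.2)]:
    ci, p = z_summary(est, se)
    excludes_zero = ci[0] > 0 or ci[1] < 0
    print(f"est={est}, CI=({ci[0]:.3f}, {ci[1]:.3f}), p={p:.4f}, excludes 0: {excludes_zero}")
```

With a 95% interval and a 0.05 threshold the two checks agree by construction: the interval excludes 0 exactly when the two-sided p-value is below 0.05.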

Anonymous
05/29/24(Wed)03:32:46 No.16198821

Anonymous 05/29/24(Wed)03:32:46 No.16198821

>>16198153
Worthless.

Anonymous
05/29/24(Wed)03:58:26 No.16198836

Anonymous 05/29/24(Wed)03:58:26 No.16198836

>>16198821
Nah they are good man. Gotta know what the darkies are up to.

Anonymous
05/29/24(Wed)11:48:58 No.16199217

Anonymous 05/29/24(Wed)11:48:58 No.16199217

>>16198806
Thank you. Somehow you managed to explain it better than my professors.

Anonymous
05/30/24(Thu)01:05:47 No.16200241

Anonymous 05/30/24(Thu)01:05:47 No.16200241

Bump

Anonymous
05/30/24(Thu)01:14:44 No.16200249

Anonymous 05/30/24(Thu)01:14:44 No.16200249

>>16198836
You literally don't. It's hilarious you should say it as you have. You sound more black than I am.

Anonymous
05/30/24(Thu)03:52:41 No.16200438

Anonymous 05/30/24(Thu)03:52:41 No.16200438

Any good resources for regression modeling?

Anonymous
05/30/24(Thu)05:16:30 No.16200503

Anonymous 05/30/24(Thu)05:16:30 No.16200503

>>16200438
The scikit-learn user guide is good. It's not perfect, but if you read through it you'll at least know scikit-learn well enough.

https://scikit-learn.org/stable/user_guide.html

Anonymous
05/30/24(Thu)09:00:58 No.16200718

Anonymous 05/30/24(Thu)09:00:58 No.16200718

>>16200438
I would suggest 'Regression Modeling Strategies' by Frank Harrell. It's fairly approachable and covers a lot of topics (linear, logistic and ordinal regression, model validation, etc.).

Anonymous
05/30/24(Thu)14:49:46 No.16201177

Anonymous 05/30/24(Thu)14:49:46 No.16201177

>>16200249
You are a dumb liberal faggot. What are you doing on 4chan?

Anonymous
05/30/24(Thu)16:56:56 No.16201366

Anonymous 05/30/24(Thu)16:56:56 No.16201366

>>16201177
Enjoying anime because this is an anime website

Anonymous
05/31/24(Fri)00:39:35 No.16201997

Anonymous 05/31/24(Fri)00:39:35 No.16201997

>>16200438
I have to learn this too. What book did you end up choosing?

Anonymous
05/31/24(Fri)00:48:30 No.16202011

Anonymous 05/31/24(Fri)00:48:30 No.16202011

>>16201177
Cope. This is not your safe space, queer.
>>16201366
You're not me. I only rarely watch anime. I haven't seen any since Season 3 of Kimetsu no Yaiba.

Anonymous
05/31/24(Fri)03:02:22 No.16202150

Anonymous 05/31/24(Fri)03:02:22 No.16202150

>>16202011
kimetsu no what? Are you one of those darkskinned pajeet anime watchers?

Anonymous
05/31/24(Fri)19:49:30 No.16203238

Anonymous 05/31/24(Fri)19:49:30 No.16203238

>>16198744
I'll check that book out. Thanks

Anonymous
06/01/24(Sat)05:22:11 No.16203871

Anonymous 06/01/24(Sat)05:22:11 No.16203871

>>16187525
why bother learning advanced SQL, R and stats when the world is run on excel, spss and "line look positive", "p value small" and "program says confidence high"

Anonymous
06/01/24(Sat)05:27:47 No.16203884

Anonymous 06/01/24(Sat)05:27:47 No.16203884

>>16203871
You have two choices:
1. Join them and be doomed to reinvent the wheel every day
2. Do things that feel right and makes your works reproducible, and build a foundation for the next generation

Anonymous
06/01/24(Sat)08:04:23 No.16204032

Anonymous 06/01/24(Sat)08:04:23 No.16204032

if you were to have, say, a 70% chance of A happening and a 30% chance of B happening, even if you have done the math that led you to this conclusion, would it still technically boil down to guessing?

Anonymous
06/01/24(Sat)08:39:05 No.16204063

Anonymous 06/01/24(Sat)08:39:05 No.16204063

>>16204032
What do you mean? For any particular experiment (if it's properly random/stochastic) then knowing the distribution doesn't give you any ability to reliably know the outcomes. It can tell you their distribution, and you can make predictions in a statistical sense, but you can't know exactly the outcome of a probabilistic experiment without observing it.

Anonymous
06/01/24(Sat)09:12:18 No.16204109

Anonymous 06/01/24(Sat)09:12:18 No.16204109

>>16204063
was thinking about situations where there is no guarantee, you are simply just using the knowledge and experience you have to get to a % outcome. like say the weather for meteorology.

Anonymous
06/01/24(Sat)09:17:55 No.16204124

Anonymous 06/01/24(Sat)09:17:55 No.16204124

>>16204109
Then the answer to your question is yes. If you only know that P(A) = .7, P(B) = .3 and P(A or B) = 1, then you can't know for certain which of the two will happen until it happens.
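
A small simulation sketch of that point, assuming independent repeated trials with P(A) = 0.7 (the function name and trial count are arbitrary choices for illustration):

```python
import random

def simulate(n_trials=100_000, p_a=0.7, seed=1):
    """Draw independent outcomes with P(A) = p_a and return them with A's observed frequency."""
    rng = random.Random(seed)
    outcomes = ["A" if rng.random() < p_a else "B" for _ in range(n_trials)]
    return outcomes, outcomes.count("A") / n_trials

outcomes, freq_a = simulate()
print("first ten outcomes:", outcomes[:10])    # individual results are not predictable
print("long-run frequency of A:", freq_a)      # but the frequency settles near 0.7
```

Always calling "A" is the best single guess and is right about 70% of the time in the long run, yet no individual trial is known in advance.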

Anonymous
06/01/24(Sat)09:25:38 No.16204137

Anonymous 06/01/24(Sat)09:25:38 No.16204137

>>16204124
thanks anon

Anonymous
06/02/24(Sun)04:49:38 No.16205941

Anonymous 06/02/24(Sun)04:49:38 No.16205941

Bump

Anonymous
06/02/24(Sun)17:17:14 No.16206763

Anonymous 06/02/24(Sun)17:17:14 No.16206763

Give me a quick rundown on ridge regression plox.

Anonymous
06/02/24(Sun)22:36:22 No.16207149

Anonymous 06/02/24(Sun)22:36:22 No.16207149

>>16206763
There's a few ways you can think about ridge regression.

The most straightforward way (and the way it was originally developed) is that ridge regression imposes an l2 norm constraint on your beta. You're minimizing the mean-square-error subject to your beta being within/on (depending on the setup) some sphere centered around the origin.

Another way of thinking about ridge regression is the Bayesian interpretation. Ridge regression imposes a Gaussian prior on beta.
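
As a toy illustration of the shrinkage, here is a sketch for the simplest possible case: one predictor, no intercept, and the penalized form (minimize the squared error plus lam times beta squared), which is the Lagrangian counterpart of the norm constraint. The data and the name `ridge_slope` are made up for the example.

```python
def ridge_slope(x, y, lam):
    """Minimize sum((y_i - beta * x_i)^2) + lam * beta^2 for a single slope beta.
    Setting the derivative to zero gives beta = sum(x*y) / (sum(x^2) + lam)."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)

# Made-up data lying roughly on y = 2x
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

for lam in (0.0, 10.0, 100.0):
    print(f"lambda={lam:>5}: beta={ridge_slope(x, y, lam):.4f}")
```

With lam = 0 this is ordinary least squares; larger lam pulls the slope toward zero, which is the same as shrinking the radius of the sphere that beta is allowed to live in.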

Anonymous
06/03/24(Mon)01:49:35 No.16207320

Anonymous 06/03/24(Mon)01:49:35 No.16207320

>>16207149
I always looked at it as an applied Lagrange multiplier for statistics and regressions. That it's more of an optimization thing than an error minimizer.

Anonymous
06/03/24(Mon)14:43:26 No.16208186

Anonymous 06/03/24(Mon)14:43:26 No.16208186

Is anyone here studying probability / statistics on a daily basis?

Anonymous
06/03/24(Mon)15:14:14 No.16208223

Anonymous 06/03/24(Mon)15:14:14 No.16208223

>>16207320
You can definitely look at it that way. In the literal sense ridge regression is an equality constraint on the L2 norm of your parameter that your objective function is applied to.

If your objective function is a linear least squares, that's the same thing as maximizing the posterior distribution of your parameter given the data with a Gaussian likelihood function on the data given the parameter and a Gaussian prior on the parameter.

It works out to be tomato tomahto.
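
The equivalence can be sketched in a few lines. The noise variance σ², prior variance τ² and penalty λ are symbols introduced here for illustration, assuming i.i.d. Gaussian noise and an isotropic zero-mean Gaussian prior:

```latex
\begin{align*}
y \mid \beta &\sim \mathcal{N}(X\beta,\ \sigma^2 I), \qquad \beta \sim \mathcal{N}(0,\ \tau^2 I), \\
-\log p(\beta \mid y) &= \frac{1}{2\sigma^2}\,\lVert y - X\beta \rVert_2^2
  + \frac{1}{2\tau^2}\,\lVert \beta \rVert_2^2 + \text{const}, \\
\hat{\beta}_{\mathrm{MAP}} &= \arg\min_{\beta}\ \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2
  = (X^\top X + \lambda I)^{-1} X^\top y, \qquad \lambda = \frac{\sigma^2}{\tau^2}.
\end{align*}
```

A tighter prior (smaller τ²) corresponds to a larger penalty λ, and the penalty λ plays the role of the Lagrange multiplier for the norm constraint.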

Anonymous
06/03/24(Mon)19:24:26 No.16208655

Anonymous 06/03/24(Mon)19:24:26 No.16208655

>>16208223
Thanks anon. You make me like this thread.

Anonymous
06/04/24(Tue)03:01:31 No.16209340

Anonymous 06/04/24(Tue)03:01:31 No.16209340

>>16208655
Nice, this is a nice thread

Anonymous
06/04/24(Tue)15:05:47 No.16210217

Anonymous 06/04/24(Tue)15:05:47 No.16210217

Tell me about the p value, what does it actually mean?

Anonymous
06/04/24(Tue)15:35:29 No.16210261

Anonymous 06/04/24(Tue)15:35:29 No.16210261

>>16210217
Probability of false alarm. It's basically the probability that the particular data or test statistic you are observing could have happened randomly by chance even though the hypothesis isn't true.

Anonymous
06/04/24(Tue)15:44:26 No.16210275

Anonymous 06/04/24(Tue)15:44:26 No.16210275

>>16210217
Assuming the null is true, the probability that one obtains results at least as extreme as what was observed.

This is a nice read about p-values: https://www.fharrell.com/post/pval-litany/#:~:text=A%20p%2Dvalue%20is%20the,the%20effect%20of%20a%20variable.
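
A concrete sketch of that definition, using a made-up example of 60 heads in 100 coin flips with a fair coin as the null. "At least as extreme" is taken here to mean "no more probable under the null than the observed count", which is one common convention for a two-sided exact test.

```python
from math import comb

def binom_two_sided_p(k, n, p0=0.5):
    """Exact two-sided binomial p-value: total null probability of all outcomes
    that are no more likely than the observed count k."""
    probs = [comb(n, i) * p0**i * (1 - p0)**(n - i) for i in range(n + 1)]
    observed = probs[k]
    # small relative tolerance so ties are not lost to floating-point rounding
    return sum(pr for pr in probs if pr <= observed * (1 + 1e-9))

p = binom_two_sided_p(60, 100)
print(f"p-value for 60 heads in 100 flips: {p:.4f}")
```

The result comes out a little above 0.05, so this outcome would not be declared significant at the usual 5% level even though 60/100 looks lopsided.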

Anonymous
06/05/24(Wed)08:12:19 No.16211385

Anonymous 06/05/24(Wed)08:12:19 No.16211385

>>16209340
Yes, a very nice thread.

Anonymous
06/05/24(Wed)11:48:32 No.16211928

Anonymous 06/05/24(Wed)11:48:32 No.16211928

>>16208655
>>16209340
>>16211385
reading the first few chapters of the deep learning book by Yoshua Bengio's group would've given you this exact information. the fact that you guys are excited by this tells me you are either undergrads or code monkeys who are ML wannabes.

Anonymous
06/05/24(Wed)12:08:35 No.16211946

Anonymous 06/05/24(Wed)12:08:35 No.16211946

>>16211928
So what if they are undergrads? I don't understand your point. Yes, it's not particularly novel information if you are someone who has spent years doing Bayesian ML/Bayesian statistics, but it takes some time to see the connections between these frequentist regularization methods and the Bayesian MAP formulation of said regularization.

Anonymous
06/05/24(Wed)17:40:35 No.16212502

Anonymous 06/05/24(Wed)17:40:35 No.16212502

>>16211928
Post pic of hand and it will be brown with CI of 95.

Anonymous
06/06/24(Thu)04:13:51 No.16213285

Anonymous 06/06/24(Thu)04:13:51 No.16213285

>>16211946
Elitism is good, but it should be with a firm and happy hand. Not with a dull depressed heavy hand.

Anonymous
06/06/24(Thu)13:09:34 No.16217068

Anonymous 06/06/24(Thu)13:09:34 No.16217068

What is the most difficult branch in statistics?

Anonymous
06/06/24(Thu)13:42:58 No.16217241

Anonymous 06/06/24(Thu)13:42:58 No.16217241

>>16217068
In what way do you mean difficult? Do you mean mathematically difficult or do you mean practically difficult?

Anonymous
06/07/24(Fri)06:32:24 No.16219240

Anonymous 06/07/24(Fri)06:32:24 No.16219240

>>16217241
Mathematically difficult

Anonymous
06/07/24(Fri)13:25:30 No.16220154

Anonymous 06/07/24(Fri)13:25:30 No.16220154

>>16219240
I guess that depends on what you find difficult. Generally statistics gets mathematically complicated when the probability theory gets complicated.

Many people find measure theoretic statistics fairly difficult, and this will propagate throughout all of the related fields (performance analysis and large deviations theory, sequential analysis, information theoretic statistics, etc.) with this formulation.

Anonymous
06/08/24(Sat)03:52:17 No.16221523

Anonymous 06/08/24(Sat)03:52:17 No.16221523

>>16211928
You're on 4chan, what did you expect?

Anonymous
06/08/24(Sat)15:27:41 No.16222603

Anonymous 06/08/24(Sat)15:27:41 No.16222603

Statistics is not only useful. It's fun as well. I love to do PDEs on stats problems.

Anonymous
06/08/24(Sat)16:05:58 No.16222679

Anonymous 06/08/24(Sat)16:05:58 No.16222679

>>16222603
>Statistics is fun
LOL seriously? You like anal (receiving)?
>PDE is fun
Hell yeah it is

Anonymous
06/08/24(Sat)19:51:29 No.16223170

Anonymous 06/08/24(Sat)19:51:29 No.16223170

>>16222679
classic shitpost. Now go to another thread for retards.

Anonymous
06/08/24(Sat)20:29:10 No.16223279

Anonymous 06/08/24(Sat)20:29:10 No.16223279

so when are you fags going to prove the theory of probability?

Anonymous
06/08/24(Sat)20:57:20 No.16223331

Anonymous 06/08/24(Sat)20:57:20 No.16223331

>>16223279
lol lmao even

Anonymous
06/08/24(Sat)21:26:15 No.16223393

Anonymous 06/08/24(Sat)21:26:15 No.16223393

>>16187543
They do, IF they're also computational mathematicians. The stats universities that are actually trying to push forward new or novel techniques use C++ and then make interfaces with R (because they know the applied community all uses R).

Take the INLA project as an example. And that's just something actively in development.

Anonymous
06/08/24(Sat)21:55:03 No.16223446

Anonymous 06/08/24(Sat)21:55:03 No.16223446

Why is p-hacking bad? Isn't it literally just what happens as you collect more data regardless of the problem?

From a frequentist standpoint, your interval widths and p-values go to zero as more data is collected, simply because we are working from the interpretation of constant coefficients in our models. Statistical significance is great and all, but it's not a measure of importance or impact, just 'hey, this interval doesn't overlap with hypothesis X or other coefficient Y'.

I basically don't understand the p-hacking problem at all, especially when it's combined with any sort of validation technique or with any follow-on operational question. A statistically significant difference doesn't mean an impactful difference: $1 is very statistically significantly different from $1.01, but that doesn't actually matter in the majority of contexts.
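
The "significant but not important" part can be illustrated with a sketch like the one below. It assumes a one-sample z-test with a known standard deviation of $1 and a true difference of $0.01, all hypothetical numbers, and it plugs in the true difference as if it were the observed one to keep things deterministic.

```python
from math import erfc, sqrt

def two_sided_z_p(diff, sd, n):
    """Two-sided p-value of a z-test when the observed mean differs from the
    hypothesized mean by diff, with known standard deviation sd and n observations."""
    z = diff / (sd / sqrt(n))
    return erfc(abs(z) / sqrt(2))   # equals 2 * (1 - Phi(|z|)), stable for large |z|

for n in (100, 10_000, 1_000_000):
    print(f"n={n:>9}: p={two_sided_z_p(0.01, 1.0, n):.3g}")
```

The effect stays at one cent throughout; only the sample size changes, and the p-value collapses anyway. That is separate from p-hacking, which is about how the reported test was chosen.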

Anonymous
06/08/24(Sat)22:08:57 No.16223471

Anonymous 06/08/24(Sat)22:08:57 No.16223471

>>16223446
From my understanding, the problem with p-hacking is that you are collecting a biased sample set. It isn't just that you are collecting more data, it is that you are collecting more data from a specific subset which is more likely to show significance (e.g., data skewed towards the extreme cases of the alternative).

It's a case of biased sample selection (or potentially pruning of negative outliers which would make your test statistics more centrally located).

Anonymous
06/09/24(Sun)05:04:43 No.16224049

Anonymous 06/09/24(Sun)05:04:43 No.16224049

>>16223279
It's more of a question of how long before the theory can be proven with 100 percent accuracy. Any day now, I'm sure.

Anonymous
06/09/24(Sun)17:08:41 No.16225382

Anonymous 06/09/24(Sun)17:08:41 No.16225382

>>16223279
cope from brainlet

Anonymous
06/09/24(Sun)17:31:46 No.16225436

Anonymous 06/09/24(Sun)17:31:46 No.16225436

>>16223446
P-hacking implies that you already have decided beforehand what the end result is instead of accepting the data as it is
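
One common form of this can be sketched with a simulation. Note that this swaps in the "try many outcomes, report the best one" flavor of p-hacking rather than the biased-sampling description given earlier in the thread. It assumes every null hypothesis is true, 20 independent outcomes per study, and a z-test with known unit variance; all names and sizes are arbitrary.

```python
import random
from math import erfc, sqrt

def min_p_false_positive_rate(n_studies=1000, n_outcomes=20, n=20, alpha=0.05, seed=2):
    """Share of pure-noise studies in which at least one of n_outcomes tests
    comes out 'significant' at level alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        p_values = []
        for _ in range(n_outcomes):
            # n draws from N(0, 1): the null (mean 0) is true for every outcome
            sample_mean = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
            z = sample_mean * sqrt(n)
            p_values.append(erfc(abs(z) / sqrt(2)))   # two-sided z-test p-value
        if min(p_values) < alpha:
            hits += 1
    return hits / n_studies

fp_rate = min_p_false_positive_rate()
print(f"studies with at least one p < 0.05: {fp_rate:.1%}")
```

With 20 independent tests the chance of at least one false positive is 1 - 0.95^20, roughly 64%, rather than the nominal 5%, and that holds regardless of how much data each test uses.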

Anonymous
06/10/24(Mon)00:52:39 No.16226195

Anonymous 06/10/24(Mon)00:52:39 No.16226195

>>16224049
two more weeks right?

Anonymous
06/10/24(Mon)04:27:32 No.16226420

Anonymous 06/10/24(Mon)04:27:32 No.16226420

Do any unis teach a completely unbiased course on race statistics?

Anonymous
06/10/24(Mon)05:24:09 No.16226493

Anonymous 06/10/24(Mon)05:24:09 No.16226493

>>16226420
No. The same way that there are no colleges that teach entirely unbiased courses on any other highly controversial subject where there's still open research questions.

Anonymous
06/10/24(Mon)16:39:54 No.16227387

Anonymous 06/10/24(Mon)16:39:54 No.16227387

>>16226420
lol god no. If you want to learn the real stuff, you have to learn it yourself. Start with the bell curve. Maybe the closest would be some analysis course on applied criminology at Quantico where they teach how the world works to federales.

Anonymous
06/11/24(Tue)13:59:35 No.16228727

Anonymous 06/11/24(Tue)13:59:35 No.16228727

>>16226420
There's one prestigious uni called /pol/, you can complete a whole degree on racial statistics there

Anonymous
06/11/24(Tue)17:38:25 No.16229173

Anonymous 06/11/24(Tue)17:38:25 No.16229173

>>16228727
kek

Anonymous
06/12/24(Wed)05:08:45 No.16230012

Anonymous 06/12/24(Wed)05:08:45 No.16230012

are random variables a group under convolution?

Anonymous
06/12/24(Wed)16:23:59 No.16230927

Anonymous 06/12/24(Wed)16:23:59 No.16230927

>>16230012
Define random

Anonymous
06/12/24(Wed)16:25:33 No.16230928

Anonymous 06/12/24(Wed)16:25:33 No.16230928

your vanity thread is on page 10 again, better bump it quick

Anonymous
06/12/24(Wed)18:20:12 No.16231089

Anonymous 06/12/24(Wed)18:20:12 No.16231089

>>16230928
lmao

Anonymous
06/12/24(Wed)18:31:09 No.16231108

Anonymous 06/12/24(Wed)18:31:09 No.16231108

>>16230927
a function from the sample space to a subset of the reals (or real space)

Anonymous
06/13/24(Thu)04:39:41 No.16232093

Anonymous 06/13/24(Thu)04:39:41 No.16232093

File: 0003.png (38 KB, 618x559)

I LOVE <3 non parametric stats <3

Anonymous
06/13/24(Thu)17:20:56 No.16233286

Anonymous 06/13/24(Thu)17:20:56 No.16233286

>>16232093
why?

Anonymous
06/14/24(Fri)06:27:41 No.16234287

Anonymous 06/14/24(Fri)06:27:41 No.16234287

>>16233286
Fuck normal distributions
Fuck means
Fuck SD

Anonymous
06/14/24(Fri)15:15:22 No.16234955

Anonymous 06/14/24(Fri)15:15:22 No.16234955

My PI forces me to use Matlab for all the analyses and statistics. It's surprisingly comfy but disgusting at the same time.

Anonymous
06/14/24(Fri)20:03:37 No.16235418

Anonymous 06/14/24(Fri)20:03:37 No.16235418

>>16234955
You work in some kind of weird finance department?

Anonymous
06/14/24(Fri)20:50:49 No.16235478

Anonymous 06/14/24(Fri)20:50:49 No.16235478

>>16235418
He probably works for the based department. Matlab is based as fuck. T. Statistical signal processing engineer.

Anonymous
06/15/24(Sat)05:43:09 No.16236011

Anonymous 06/15/24(Sat)05:43:09 No.16236011

>>16235418
Applied physics

Anonymous
06/15/24(Sat)10:20:47 No.16236346

Anonymous 06/15/24(Sat)10:20:47 No.16236346

>>16236011
Continue using it, since you are in a field that actually uses it as a standard.
>>16235478
You, my dear sir, are an idiot.

Anonymous
06/15/24(Sat)12:27:38 No.16236455

Anonymous 06/15/24(Sat)12:27:38 No.16236455

>>16236346
I may be a retard but I'm a based retard who uses a software environment that easily handles constrained optimization of nonlinear objective functions.

bodhi
06/15/24(Sat)12:57:33 No.16236499

bodhi 06/15/24(Sat)12:57:33 No.16236499

>>16187525
good thread OP

Anonymous
06/16/24(Sun)02:13:09 No.16237433

Anonymous 06/16/24(Sun)02:13:09 No.16237433

Redpill me on gamma distributions

Anonymous
06/16/24(Sun)10:26:54 No.16237819

Anonymous 06/16/24(Sun)10:26:54 No.16237819

File: misspelling.png (122 KB, 750x1050)

Was over in another board and got suggested to post here.

Problem:
I'm doing data analysis for a refrigeration-based dehumidification product for a company. Sometimes it goes through QC no problem. Sometimes it has a lot of issues. I want to find out why.

What I've done so far:
I've been able to collate the following data (*):
1-Testing chart data for each product
2-Order form data for each product
3-BOM data for each product
(4-I'm working on getting job routing data for each product atm, as someone else in the other thread suggested to me).
Using 1, I can look at the number of failed charts to get a list of 'good' and 'bad' products.
Using 2, I can filter the previous list to only look at the dehum products.
Once I do this, I have a sample size of maybe 500 (the company is not high-volume, they make niche, custom products).
I've run the following statistical tests:
-Script to do brute force ANOVAs of components in BOM v. good/bad end-products. This only identified outlier products' materials. For example, it suggested things like, "The shipping crate used in the outlier is suspect." In general, I got a lot of "Pirates cause global warming" noise.
-Because of the previous results, I made all the data binary (good=1,bad=0,part in BOM=1,part not in BOM=0) and did Fisher exact testing. This only identified 'obvious' parts. Things like, "Yes, all compressors would be suspect, of fucking course, that's how refrigeration works." It didn't narrow anything down.
-I tried running correlations on some relevant variables (e.g., amount of refrigerant in product v. failed test numbers), and I just get noise.
There's a chance I missed something in these two previous tests, because there was a lot of noise to go through.
-Because of the small sample size (500), I feel I'm limited to single-variable analyses.

Can anyone think of anything else I should try?

(*) An aside vent: just getting this data collated, accessible, and cross-referenced was a PIA.
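
For reference, the binary part-vs-outcome comparison described above boils down to a 2x2 table per BOM part. Below is a minimal sketch of a two-sided Fisher exact test using only the standard library. The table layout (rows = part present/absent, columns = good/bad) and the example counts are assumptions for illustration, not the poster's data.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Assumed layout: rows = part in BOM / part not in BOM,
    columns = good product / bad product.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum every table that is at least as unlikely as the observed one
    # (small relative tolerance guards against float round-off).
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical counts: 3 good / 1 bad with the part, 1 good / 3 bad without.
print(fisher_exact_two_sided(3, 1, 1, 3))
```

When hundreds of parts are tested one at a time like this, some small p-values are expected by chance alone, which is one plausible source of the noise described above.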

Anonymous
06/16/24(Sun)15:57:29 No.16238208

Anonymous 06/16/24(Sun)15:57:29 No.16238208

File: 047072210X.jpg (35 KB, 300x469)

>>16237819
At the end of picrel they go into something similar for VW.

Anonymous
06/17/24(Mon)05:57:39 No.16239210

Anonymous 06/17/24(Mon)05:57:39 No.16239210

Do any projects graph how much the human genome has changed by year?

Anonymous
06/17/24(Mon)06:34:29 No.16239235

Anonymous 06/17/24(Mon)06:34:29 No.16239235

>>16237819
You should be looking at processes not data. Just 6M: machine, man, materials, measurements, methods, and mother nature. Process failure must exist in one of these categories.
As a data analysis guy, just pareto it and list which problems are the worst and have them explore those.
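
A Pareto breakdown like the one suggested above is just a sorted count with a running cumulative percentage. A minimal sketch follows; the failure categories in the example log are hypothetical placeholders, not anything from the poster's data.

```python
from collections import Counter

def pareto_table(failure_causes):
    """Return (cause, count, cumulative %) rows, most frequent cause first."""
    counts = Counter(failure_causes).most_common()
    total = sum(n for _, n in counts)
    rows, running = [], 0
    for cause, n in counts:
        running += n
        rows.append((cause, n, round(100 * running / total, 1)))
    return rows

# Hypothetical QC failure log, one entry per failed test chart.
log = ["leak", "leak", "compressor", "leak", "wiring",
       "compressor", "leak", "sensor"]
for row in pareto_table(log):
    print(row)
```

The causes at the top of the table, covering most of the cumulative percentage, are the ones to hand over to the process people first.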

Anonymous
06/17/24(Mon)20:35:06 No.16240236

Anonymous 06/17/24(Mon)20:35:06 No.16240236

>>16190158
Thanks for the book.

Anonymous
06/17/24(Mon)23:00:20 No.16240457

Anonymous 06/17/24(Mon)23:00:20 No.16240457

>>16198089
Absolute value is larger (preferably much larger) than 2, and p-values are close to zero.

Anonymous
06/18/24(Tue)12:36:49 No.16241236

Anonymous 06/18/24(Tue)12:36:49 No.16241236

Where can I, a noob, just ok in maths, start learning about stats?

Anonymous
06/18/24(Tue)17:10:53 No.16241615

Anonymous 06/18/24(Tue)17:10:53 No.16241615

>>16241236
Download a textbook with open datasets that you can easily get on the publisher's website. Start going through the problems one by one until you dun goofed the entire book. Easy peasy lemon squeezy.

Anonymous
06/19/24(Wed)04:24:47 No.16242340

Anonymous 06/19/24(Wed)04:24:47 No.16242340

>>16241615
which books have these open datasets?

Anonymous
06/19/24(Wed)21:55:22 No.16243470

Anonymous 06/19/24(Wed)21:55:22 No.16243470

>>16242340
Not exactly a straightforward stats book, but Probabilistic Machine Learning by Kevin Murphy is free, has figures and python code on his GitHub and does have some statistics coverage. Introduction to Statistical Learning also has some code and data available.

Anonymous
06/20/24(Thu)12:00:13 No.16244215

Anonymous 06/20/24(Thu)12:00:13 No.16244215

What's the point of the characteristic function again? They don't add any insights to the study of a probability distribution, unlike the MGF. So why does it even exist?

Anonymous
06/20/24(Thu)13:05:00 No.16244291

Anonymous 06/20/24(Thu)13:05:00 No.16244291

>>16243470
Nice, thanks

Anonymous
06/20/24(Thu)14:23:46 No.16244367

Anonymous 06/20/24(Thu)14:23:46 No.16244367

>>16244215
There's a few uses for characteristic functions, especially for sampling distributions and frequency analysis for continuous time Markov chains.

In general though, an MGF is more useful if it's available, however not every probability density function has an MGF (while every probability density has a well defined characteristic function).
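
The standard Cauchy distribution is the usual example of that last point: E[exp(tX)] is infinite for every t != 0, so there is no MGF, but the characteristic function is exp(-|t|). Here is a rough numerical sketch of that, assuming inverse-CDF sampling and an arbitrary sample size and seed.

```python
import math
import random

def sample_cauchy(rng):
    # Inverse-CDF sampling for a standard Cauchy variate.
    return math.tan(math.pi * (rng.random() - 0.5))

def empirical_cf(samples, t):
    # Empirical characteristic function: sample average of exp(i*t*x).
    n = len(samples)
    re = sum(math.cos(t * x) for x in samples) / n
    im = sum(math.sin(t * x) for x in samples) / n
    return complex(re, im)

rng = random.Random(0)
samples = [sample_cauchy(rng) for _ in range(100_000)]
for t in (0.5, 1.0, 2.0):
    est = empirical_cf(samples, t)
    # Compare the real part against the known closed form exp(-|t|).
    print(t, round(est.real, 3), round(math.exp(-abs(t)), 3))
```

The average always settles down because |exp(itx)| = 1 for every x, which is exactly why the characteristic function exists for any distribution.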

Anonymous
06/21/24(Fri)08:15:49 No.16245515

Anonymous 06/21/24(Fri)08:15:49 No.16245515

>>16234287
Baste

Anonymous
06/21/24(Fri)08:28:26 No.16245528

Anonymous 06/21/24(Fri)08:28:26 No.16245528

>>16244215
>They don't add any insights to the study of a probability distribution
read harder

Anonymous
06/21/24(Fri)10:03:50 No.16245643

Anonymous 06/21/24(Fri)10:03:50 No.16245643

>>16236455
>easily handles constrained optimization of nonlinear objective functions
you can always code your own in C++, fag. it's not that hard.

Anonymous
06/21/24(Fri)10:13:38 No.16245660

Anonymous 06/21/24(Fri)10:13:38 No.16245660

>>16187525
we were taught R at university but now I mostly use Python.

Anonymous
06/21/24(Fri)10:14:32 No.16245662

Anonymous 06/21/24(Fri)10:14:32 No.16245662

>>16245643
> You should reinvent the wheel using older tools because I don't like you using better tools that others have made.

MATLAB is literally a professionally maintained system designed to be effective at solving these optimization problems. I could implement everything from scratch in assembly too, but it would be stupid to do so when others have spent their life's work building tools to do it for me.

Anonymous
06/21/24(Fri)10:29:02 No.16245690

Anonymous 06/21/24(Fri)10:29:02 No.16245690

>>16245662
>MATLAB is literally a professionally maintained system
but then you're stuck with Matlab, faggot. it's a horrible language.

Anonymous
06/21/24(Fri)10:32:58 No.16245695

Anonymous 06/21/24(Fri)10:32:58 No.16245695

>>16245690
> But then you're stuck with MATLAB, the industry standard for solving the exact problems MATLAB excels in.

You might as well say that researchers who study Neural Networks architectures are "stuck with Python."

You don't have to like MATLAB. It's not perfect and it's expensive, but it's not an accident that it's the industry standard in many fields of physics, engineering and optimization. There's nothing that MATLAB does that you couldn't do in some other general purpose language, but you'd likely have to make from scratch tools that MATLAB already handles natively in C.

Anonymous
06/21/24(Fri)12:40:25 No.16245856

Anonymous 06/21/24(Fri)12:40:25 No.16245856

File: yasu.png (485 KB, 712x697)

Matlab = SHIT TIER
Python = MEH TIER
R = GOD TIER

prove me wrong faggits

Anonymous
06/21/24(Fri)12:53:12 No.16245882

Anonymous 06/21/24(Fri)12:53:12 No.16245882

>>16245856
All three of them are good choices for a general purpose statistics/data analysis language with each having certain things they excel at.

R is fantastic if you are working on theoretical statistics or looking to pull from the (many many) open data libraries from the natural sciences. A lot of the cutting edge of mathematical statistics work gets done in R and that's not an accident.

Python is flexible beyond either of the other two and provides unparalleled support for machine learning/adaptive statistics. If you are doing anything at all involving neural networks, decision trees or HMMs, Python offers quite a lot to you.

MATLAB is the absolute king of matrix based scientific computing. It's literally what the name stands for, "matrix laboratory." If you are doing work that involves a lot of linear algebra (e.g., non-linear programming based statistics, Bayesian optimization or Kalman filtering/target tracking, adaptive linear filtering or stochastic control, etc.) you basically can't beat what MATLAB has to offer. Python is finally starting to see some decent target tracking support with the work being done by the developers of the Stone Soup library, but if you work in anything at all with radar/sonar/lidar/gps etc. you basically can't avoid Matlab.

Honorable mentions go to Julia for their efforts into scientific computing and emphasis on parallelization. Julia is also a great option to learn (but it's still pretty new so don't be surprised if it's not as well supported as the others).

Anonymous
06/21/24(Fri)13:19:58 No.16245924

Anonymous 06/21/24(Fri)13:19:58 No.16245924

>>16245882
I was shitpoasting, but I do appreciate your god tier poasts on radar, sonar and applied stats. So when I am shitting on matlab, I am not shitting on you. So that is clear. I am shitting on the universities who are cheap fucks and cannot re-tool their shit to make their students better suited for the market place.

Anonymous
06/21/24(Fri)13:21:08 No.16245926

Anonymous 06/21/24(Fri)13:21:08 No.16245926

>>16245882
Julia seems like fun, but very niche.

Anonymous
06/21/24(Fri)13:46:17 No.16245967

Anonymous 06/21/24(Fri)13:46:17 No.16245967

File: F.png (5 KB, 256x256)

>>16245882
>MATLAB is the absolute king of matrix based scientific computing
*ahem*

Anonymous
06/21/24(Fri)13:53:41 No.16245982

Anonymous 06/21/24(Fri)13:53:41 No.16245982

>>16245967
Do people actually still use Fortran? I know a lot of the old gods of the field still reach to Fortran, but I've never met anyone under 70 who uses it on a regular basis.

Anonymous
06/21/24(Fri)14:11:55 No.16246016

Anonymous 06/21/24(Fri)14:11:55 No.16246016

>>16245982
They do, you can actually get pretty spicy jobs if you have 10 years plus exp with Fortran.

Anonymous
06/21/24(Fri)14:13:25 No.16246019

Anonymous 06/21/24(Fri)14:13:25 No.16246019

File: Fortran.png (10 KB, 566x149)

>>16246016
500+ jobs with Fortran? what the fuckkk?

Anonymous
06/21/24(Fri)14:39:17 No.16246048

Anonymous 06/21/24(Fri)14:39:17 No.16246048

>>16246016
That's good to know! The only thing in my world that still is actively maintained in Fortran is the official OA Labs engine for Bellhop/Kraken for underwater acoustic ray tracing. It's neat to hear that people are still actively using Fortran for real development in the year of our Lord 2024. Makes me feel less old.

Anonymous
06/22/24(Sat)04:25:38 No.16247103

Anonymous 06/22/24(Sat)04:25:38 No.16247103

>>16245967
I've wanted to learn Fortran for a while but never bothered

Anonymous
06/22/24(Sat)04:35:29 No.16247117

Anonymous 06/22/24(Sat)04:35:29 No.16247117

>>16190893
Hot hand fallacy

Anonymous
06/22/24(Sat)07:08:00 No.16247238

Anonymous 06/22/24(Sat)07:08:00 No.16247238

>>16245982
About 70% of all HPC code is Fortran. It's absolutely entrenched and it will never change. And this 70% figure comes after decades of people attempting to force a change to C/C++ as the standard. We've also got CUDA Fortran now.
I'm in my 20s and picked up Fortran and I actually enjoy using it because of how simple and clear it is. Very easy to learn, and modern Fortran is not the abomination it once was with GOTO statements everywhere. It's also unbeatable when it comes to parallel computing.
>>16246016
I got a temporary job in my old department as an undergrad entirely because I was the only one who bothered to learn Fortran. They had an old codebase that needed to be looked at and for some reason nobody else wanted to work in Fortran because people were convinced it was obsolete, therefore no takers for the position, but it turns out stuff shouldn't be ignored just because it's old.
The hard part about Fortran is actually that it's normally written for very specialised purposes, so the trick is a lot of code you're going to read is likely going to require a relatively large amount of additional knowledge to understand properly. A lot of the time, a boomer will have written a numerical solver and not bothered to explain why an equation is there, or what it's doing. If they also spam GOTO a lot, then good luck.

Anonymous
06/22/24(Sat)07:30:11 No.16247262

Anonymous 06/22/24(Sat)07:30:11 No.16247262

>>16246019
where? there are 4 in my entire shithole country

Anonymous
06/22/24(Sat)10:40:21 No.16247462

Anonymous 06/22/24(Sat)10:40:21 No.16247462

>>16192812
>PhD student
It will be hard. I earned my PhD in 2020, now run a data science/ml/stats team (it's an interesting hybrid team internal to a big company), and all of our consultants tend to have PhDs and experience. The sort of "natural" lead in to being a consultant is to work in the space for a while, get to know all kinds of people while working with clients, and eventually just start your own consultancy with the known contacts as your primary customers. It's very relationship driven. Without known contacts and without a PhD and experience, it will be difficult, but I guess not impossible; you'll just have to select smaller jobs/smaller companies and undercharge.
> it's harder to find internship/studentship nowaday in FAANG
Biggest tip to CS peeps: Fuck FAANG, it's the worst option. There are about 10,000 new startups, especially in biotech, who need CS people. They tend to have a harder time finding people because they aren't very well advertised. While my peers were doing 5 rounds of interviews at FAANG and not getting internships, I found a super local biotech which had 0 SEO by googling the area, and messaged them. They essentially hired me right away as an intern, and then hired me for real about 2 months later. It was a startup with 5 people and they just really sucked at advertising; if you googled their name they didn't even come up. It sounds like "and then everyone clapped", but I also found my second job the same way.

Anonymous
06/22/24(Sat)10:50:49 No.16247482

Anonymous 06/22/24(Sat)10:50:49 No.16247482

>>16245856
R is god tier, I love it.
Python gets a bad rap but honestly the number of mature libraries makes it my preferred tool. I've only ever needed to write a couple of functions in Rust for speedup, but Python is generally a glorified C wrapper so it's plenty fast.
I generally do all of my processing in Python and then export to R for fancy stats and for plotting (ggplot is still absolute god-tier for plotting, fuck matplotlib although seaborn is okay).
>>16245882
>MATLAB
absolutely fuck matlab, I used it for my whole PhD. It has way too many data types, any and all useful functional toolboxes become obsolete after about a year because they actively change EVERY useful function (removing them, merging them, changing them completely) and have no concept of stable, reusable code, and the stupid-ass twice-a-year a and b release cycle is just nonsense. No one can use your code unless they buy matlab (or you use their executable export BS but that's a mess).
I dislike everything about matlab. I used it for some of the things you say it's good for (kalman filters for noisy object tracking), and it was great at the time for image processing, but I would rather implement kalman filters from scratch than use their implementation which I KNOW will change and break my code in 2 years.
I tried to run an app process I wrote in 2018b under 2020a, and it didn't work because half the functions no longer existed. Not that I could check now because I refuse to pay for it.
Fuck MATLAB.

Anonymous
06/22/24(Sat)10:54:33 No.16247488

Anonymous 06/22/24(Sat)10:54:33 No.16247488

>>16247462
>especially in biotech
everyone hates bio for a reason. those are the worst companies to work in. low pay, low equity, toxic morons ordering you around. they don't know much about CS so they sometimes ask for outrageous shit that only companies like Google barely have the capabilities to execute.
also, most biotechs go bankrupt because of failing FDA approval or are just some scamming scheme to siphon money from investors anyway, so expect your equity portion to have a 90% chance of being worthless.

Anonymous
06/22/24(Sat)11:04:08 No.16247502

Anonymous 06/22/24(Sat)11:04:08 No.16247502

>>16247482
Wtf are you on about. Matlab is extremely backwards compatible. And if they change anything, they give you deprecation warnings.

Python is the one that breaks shit constantly.

Anonymous
06/22/24(Sat)11:37:05 No.16247547

Anonymous 06/22/24(Sat)11:37:05 No.16247547

>>16247502
>Matlab is extremely backwards compatible
Matlab specifically keeps every version as a separate entity because they make changes to their toolboxes constantly. I don't know what to tell you other than, using their image toolbox from ~2016-2020, half of the functions were merged or removed. My code literally doesn't work between versions because they changed the toolbox so much. There's not much I can say other than that.
Base matlab may be more stable, but it then just becomes a neutered language if you decide to ignore the toolboxes.
>>16247502
>Python is the one that breaks shit constantly
I don't find this to be the case, but maybe it's because everyone uses virtual environments to self-contain projects and versions. For free. Without downloading a whole separate multi-gigabyte "version" of the language.

Anonymous
06/22/24(Sat)11:50:14 No.16247564

Anonymous 06/22/24(Sat)11:50:14 No.16247564

>>16247547
>Without downloading a whole separate multi-gigabyte "version" of the language.
Yeah, just download and maintain 10 versions of python and 20 versions of every python package on your computer

Anonymous
06/22/24(Sat)20:55:41 No.16248534

Anonymous 06/22/24(Sat)20:55:41 No.16248534

>>16247564
Isn't python pretty backwards compatible? At least within the different versions, like 2.0, 3.0 etc.

Anonymous
06/23/24(Sun)05:23:56 No.16248998

Anonymous 06/23/24(Sun)05:23:56 No.16248998

>>16248534
Python itself is ok, but the packages break compatibility with every minor update

Anonymous
06/23/24(Sun)10:55:42 No.16249239

Anonymous 06/23/24(Sun)10:55:42 No.16249239

>>16248998
>packages break compatibility with every minor update
that's the problem with the packages, not python tho. even tho I think python authorities should enforce some kind of standard on backward compatibility of the 3rd party packages. worst I've seen yet is when a package is no longer maintained, its older compiled binaries cease to exist on some corpo servers and your environment installation no longer works or you have to compile the binaries from source, which can take a day just because random crap breaks.

Anonymous
06/23/24(Sun)16:02:29 No.16249792

Anonymous 06/23/24(Sun)16:02:29 No.16249792

>>16192812
Send emails to every local business. Eventually someone will respond positively

Anonymous
06/23/24(Sun)23:19:45 No.16250290

Anonymous 06/23/24(Sun)23:19:45 No.16250290

>>16187525
I came up with an interesting replacement for t-tests recently, and I want to share it. Basically, the exact way to get the p-value is to get the number of permutations where the difference in means is greater than or equal to the difference seen in the experiment, and then divide that by the total number of permutations. This is called a permutation test, but it's usually too expensive to compute, so people use t-tests as an approximation. What I've realized is that since computers are so powerful nowadays, you can just approximate the permutation test with monte carlo simulations, which avoids the headache of checking if your data meets the assumptions of a t-test.
>>16197968
Been trying to, but I've gotten lazy recently. Going to get back into it because of this comment.
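
The Monte Carlo permutation test for a difference in means described earlier in this post can be sketched in a few lines. This is a minimal standard-library version, not the poster's code; the function name, the add-one correction, the resample count and the example data are all assumptions for illustration.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def permutation_test(a, b, n_resamples=10_000, seed=0):
    """Two-sided Monte Carlo permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    hits = 0
    for _ in range(n_resamples):
        # Randomly reassign the group labels by shuffling the pooled data.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        # Small tolerance so float round-off does not drop exact ties.
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction: the estimate can never be exactly zero.
    return (hits + 1) / (n_resamples + 1)

# Hypothetical measurements from two groups.
a = [5.1, 4.9, 6.2, 5.8, 6.0, 5.5]
b = [4.2, 4.8, 4.1, 4.5, 5.0, 4.4]
print(permutation_test(a, b))
```

SciPy also ships `scipy.stats.permutation_test`, which is probably the better choice in practice; the loop above is only meant to show how little machinery the idea needs.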

Anonymous
06/24/24(Mon)02:24:27 No.16250431

Anonymous 06/24/24(Mon)02:24:27 No.16250431

>>16250290
> What I've realized is that since computers are so powerful nowadays, you can just approximate the permutation test with monte carlo simulations, which avoids the headache of checking if your data meets the assumptions of a t-test.

Combinatorial explosion is going to fuck you up good my friend. Assignment algorithms are great to demonstrate exactly why you can't just wave your hands and say "powerful computers will fix it all."

Let's say you have a fancy global optimization based parking assignment algorithm and you have a (fairly small) parking lot of 100 spots and you want to prove that your algorithm is better than random assignment no matter what the starting layout is. There's 2^100 possible permutations, but with Monte Carlo sampling you could probably reduce your permutation test burden to 2^80 or so trials needed to reject the null.

Let's say you have a really powerful computer that can do 10,000 of these assignments per second, (which is actually very optimistic for a potentially 100 x 100 integer programming problem).

These Monte Carlo trials would take you a speedy 3.8E12 years to complete. Quite quick actually!

Let's say now you've got 100,000 of these computers arranged in some sort of sci-fi super cluster (and magically have instantaneous synchronization and no potential for accidentally repeated permutations). This would reduce your time to complete these trials down to a much more manageable 38 million years.
Now if you had a million of these 100,000 computer super clusters with perfect parallelization/synchronization and no data management issues, you could validate your algorithm in 38 years of constant Monte Carlo trials! You might need 10 nuclear powerplants solely dedicated to supporting your computing power to test your one little parking assignment algorithm, but you could do it!

I think dealing with the De Moivre Laplace approximation is a better choice in most of these kinds of circumstances.
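
The run-time figures quoted above can be checked with a few lines of arithmetic. The 2^80 trial count, the 10,000 assignments per second and the cluster sizes are taken straight from the post; the 365.25-day year is an assumption.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

trials = 2 ** 80            # Monte Carlo trials assumed in the post
rate = 10_000               # assignments per second on one machine

years_one_machine = trials / rate / SECONDS_PER_YEAR
years_one_cluster = years_one_machine / 100_000        # 100,000 machines
years_million_clusters = years_one_cluster / 1_000_000  # a million clusters

print(f"{years_one_machine:.2e} years on one machine")
print(f"{years_one_cluster:.2e} years on one cluster")
print(f"{years_million_clusters:.1f} years on a million clusters")
```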

Anonymous
06/24/24(Mon)16:27:44 No.16251492

Anonymous 06/24/24(Mon)16:27:44 No.16251492

>>16245967
Surprisingly easy syntax. A lot like BASIC back in the day

Anonymous
06/24/24(Mon)19:44:45 No.16251871

Anonymous 06/24/24(Mon)19:44:45 No.16251871

>>16250431
Why are you holding the permutation test to a higher standard than the original t test? If 0.001 of sampled permutations have a higher test statistic, the p value is 0.001. Sure there will be billions of permutations with a higher test statistic but there's no need to get all of them.
Why do you have to know a tiny p value exactly?

Anonymous
06/25/24(Tue)00:02:36 No.16252127

Anonymous 06/25/24(Tue)00:02:36 No.16252127

>>16251871
Right so since you haven’t and can’t sample all of them, you’re back to having to test the ones you did sample for statistical significance
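
One way to frame this disagreement: a Monte Carlo p-value is itself a binomial proportion estimate, so its sampling error can be quantified directly. The sketch below assumes independent resamples and uses a plain normal-approximation (Wald) interval, which is crude for very small counts; the function name and example numbers are illustrative.

```python
import math

def mc_pvalue_interval(hits, n_resamples, z=1.96):
    """Monte Carlo p-value estimate with an approximate 95% interval."""
    # Add-one estimate, so the p-value is never reported as exactly 0.
    p_hat = (hits + 1) / (n_resamples + 1)
    # Binomial standard error of a proportion estimated from n_resamples draws.
    se = math.sqrt(p_hat * (1 - p_hat) / n_resamples)
    return p_hat, max(0.0, p_hat - z * se), min(1.0, p_hat + z * se)

# Example: 10 of 10,000 sampled permutations beat the observed statistic.
p_hat, lo, hi = mc_pvalue_interval(10, 10_000)
print(p_hat, lo, hi)
```

If the whole interval sits below the chosen significance level, more resamples would not change the decision.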

Anonymous
06/25/24(Tue)15:26:02 No.16253269

Anonymous 06/25/24(Tue)15:26:02 No.16253269

bump

Anonymous
06/25/24(Tue)15:27:50 No.16253272

Anonymous 06/25/24(Tue)15:27:50 No.16253272

I know this might be retarded but are there any cutting edge research topics at the intersection of convex optimization and statistics?

Anonymous
06/25/24(Tue)16:15:59 No.16253396

Anonymous 06/25/24(Tue)16:15:59 No.16253396

File: Fairy Skills.png (164 KB, 1273x543)

Prefacing this by stating I'm not very good with stats. I have a stats question based on a video game I play and how certain skills are randomly learned.
When this character levels up enough to learn a skill, it can learn one of seven skills, and each skill has varying tiers of that skill that can be learned. The game states skills are attempted to be learned in a specific order, rather than all at once, see pic related - the first skill is attempted to be learned at a 1% chance of success, and if that fails then the next tier is attempted at a 2% chance of success, and so on down the columns then across the rows.
This means the actual learning chance isn't simply the chance of each skill, right?
If there were 100 skills each at a 1% chance, then the resulting learned skills as more and more are learned would start looking like a normal distribution centered around the midpoint of the list (I think?). However the skill chance is not constant, so there would be some bias to the distribution but I can't figure out how to combine the two.
To complicate matters further, if the character successfully learns any tier of one skill, the rest of the tiers of that skill are then unable to be learned, so the list is shortened.
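
If attempts really are made one at a time in a fixed order, the chance of ending up with the i-th entry is its own chance times the chance that every earlier attempt failed. Here is a minimal sketch of that calculation; the probability lists are made up (the real values are in the picture), and what the game does when every attempt fails is not stated, so that case is just reported as a leftover probability.

```python
def first_success_distribution(chances):
    """P(entry i is the first success) when entries are tried in order.

    Returns the list of probabilities plus the chance that every
    attempt fails.
    """
    probs = []
    still_failing = 1.0  # probability that all earlier attempts failed
    for p in chances:
        probs.append(still_failing * p)
        still_failing *= 1.0 - p
    return probs, still_failing

# 100 entries at a flat 1% each: earlier entries are always more likely.
flat, nothing = first_success_distribution([0.01] * 100)
print(flat[0], flat[-1], nothing)

# Rising chances (1%, 2%, 3%) as in the first tiers described above.
rising, _ = first_success_distribution([0.01, 0.02, 0.03])
print(rising)
```

With a flat chance this is a truncated geometric distribution that decreases down the list rather than a bell curve around the midpoint. The rule that learning one tier blocks the other tiers of that skill could be handled by deleting those entries from the list before calling the function again.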

Anonymous
06/25/24(Tue)19:34:13 No.16253869

Anonymous 06/25/24(Tue)19:34:13 No.16253869

>>16253272
Yes, a lot actually.

Anonymous
06/25/24(Tue)20:23:18 No.16253981

Anonymous 06/25/24(Tue)20:23:18 No.16253981

>>16253272
Yes, non-linear programming approaches to statistical estimation problems are very powerful. In particular, you'll see cone-tangent and Fenchel duality approaches to constrained NLS solutions used in all sorts of problematic statistics problems in physics and engineering (e.g., inverse parameter problems for things like distance or directional cosine based direction).

Anonymous
06/25/24(Tue)20:51:53 No.16254025

Anonymous 06/25/24(Tue)20:51:53 No.16254025

Bayesian probability theory made me lose faith in humankind. Also the goat gameshow thing with 1000 doors and the host opens 998 other doors.

Anonymous
06/25/24(Tue)21:00:21 No.16254040

Anonymous 06/25/24(Tue)21:00:21 No.16254040

>>16254025
What about Bayesianism has made you lose faith? Is it the interpretation of probability as a "belief" or "uncertainty" or is it something more about the mathematical approach to Bayesianism?

Anonymous
06/25/24(Tue)21:36:19 No.16254073

Anonymous 06/25/24(Tue)21:36:19 No.16254073

File: 4E7004D0-13E3-4CDB-A871-B(...).jpg (867 KB, 1284x1595)

wtf are you guys “programming” ?

Anonymous
06/25/24(Tue)21:50:41 No.16254085

Anonymous 06/25/24(Tue)21:50:41 No.16254085

>>16253981
>cone-tangent and Fenchel duality approaches
lol. I am unironically at this part in a 140-page paper I'm reading.

Anonymous
06/25/24(Tue)22:12:26 No.16254104

Anonymous 06/25/24(Tue)22:12:26 No.16254104

File: 1708000923106363m.jpg (75 KB, 691x1024)

>>16187525
suuuup?

Anonymous
06/25/24(Tue)22:41:09 No.16254138

Anonymous 06/25/24(Tue)22:41:09 No.16254138

>>16254085
Fenchel conjugates are also super important for large deviations theory, which form the basis for near-optimal fixed sample size hypothesis testing when your elementwise test statistic is not necessarily a log likelihood ratio. Convex analysis and information theory both can be made very useful to statistics if you feel like learning some math.
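
For anyone wondering where the Fenchel conjugate shows up concretely: the textbook instance is Cramér's theorem for i.i.d. sums, sketched below under the assumption that the cumulant generating function is finite in a neighborhood of zero. The symbols Λ and Λ* are notation chosen here, not the poster's.

```latex
\Lambda(t) = \log \mathbb{E}\left[ e^{t X_1} \right],
\qquad
\Lambda^{*}(x) = \sup_{t \in \mathbb{R}} \left\{ t x - \Lambda(t) \right\},
\qquad
\lim_{n \to \infty} \frac{1}{n} \log
P\left( \frac{1}{n} \sum_{i=1}^{n} X_i \ge x \right) = -\Lambda^{*}(x)
\quad \text{for } x > \mathbb{E}[X_1].
```

The rate function Λ* is exactly the Fenchel conjugate of the log-MGF, so tail probabilities of the sample mean decay roughly like exp(-n Λ*(x)).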
