STAT206: Statistics for Software Engineering

This statistics course was recommended by Soham.

Link to course notes here.

Concepts

Error
Mean
Standard Deviation
Bias
Variance
Monty Hall Problem
Random Number
Probability
- Probability Rules → from CS50 course
- Relative Frequency
- Bayesian Probability
Counting Rules (Probability)
Independence (Statistics)
Mutual Exclusivity
Baseline Fallacy
Inclusion-Exclusion Principle
Random Variable
Sample vs. Population
Probability Mass Function
Distribution
Cumulative Distribution Function
Bernoulli Distribution
Binomial Distribution
Geometric Distribution
Memorylessness
Indicator Variable
Moment-Generating Function
Skewness
Central Limit Theorem
Discrete Joint Distribution
Marginal Distribution
Multinomial Distribution
Data Analysis and Inference
Histogram
Statistical Modelling
Estimation
Likelihood Function
Maximum Likelihood Estimation
Relative Likelihood Function
Interval Estimation
Chi-Squared Distribution
Hypergeometric Distribution
Student’s t-Distribution
Confidence Interval
- I think the Chi-Squared Distribution and the Student’s t-Distribution really come into the picture once we start constructing Confidence Intervals
Estimator
Hypothesis Testing
- Student’s t-Distribution
Goodness of Fit
Contingency Table
Independence of Attributes
Linear Regression
- Gauss-Markov Theorem
- Least Squares

Testing for Independence of Attributes is also called a Chi-Squared test.

Motivation: Maybe the proportion of smokers is different among left-handed and right-handed people. $H_0: \pi_L = \pi_R$, $H_1: \pi_L \neq \pi_R$. Testing whether the proportions of left-handed and right-handed smokers are equal is the same as testing for the independence of attributes.

How do you calculate $e_i$?
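A minimal sketch of the usual calculation, assuming a 2x2 contingency table. The counts below are made up purely for illustration, and the two-index notation $e_{ij}$ is an assumption (the note only writes $e_i$). Under independence, the expected count of a cell is (row total × column total) / n, and the test statistic sums (observed − expected)² / expected over all cells.

```python
# Hypothetical 2x2 contingency table (made-up counts, for illustration only).
# Rows: left-handed, right-handed. Columns: smoker, non-smoker.
observed = [
    [20, 80],
    [90, 810],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: e_ij = (row total i) * (column total j) / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-squared statistic: sum over cells of (observed - expected)^2 / expected.
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom for an r x c table: (r - 1) * (c - 1).
dof = (len(observed) - 1) * (len(observed[0]) - 1)

print(expected)
print(round(chi_sq, 3), dof)
```

The statistic is then compared against a Chi-Squared Distribution with `dof` degrees of freedom; a large value is evidence against $H_0$.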

Miscellaneous Ideas

Data Dredging

Probability and Statistics are very powerful, but your intuition can easily mislead you.

For example, if you throw two dice, you might think that the probability of getting a sum of 7 is the same as getting a 12, since each die is uniformly distributed, but it is not: 6 of the 36 equally likely outcomes sum to 7, while only one of them, (6, 6), sums to 12.
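A quick way to check this kind of intuition is to enumerate every outcome. The sketch below assumes two fair six-sided dice and counts how many of the 36 ordered outcomes produce each sum.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely (die1, die2) outcomes.
outcomes = list(product(range(1, 7), repeat=2))
sum_counts = Counter(a + b for a, b in outcomes)

# Probability of each sum = (number of outcomes with that sum) / 36.
sum_probs = {s: Fraction(c, len(outcomes)) for s, c in sum_counts.items()}

print(sum_probs[7])   # 1/6
print(sum_probs[12])  # 1/36
```

Each individual outcome is uniform, but the sum is not, because different sums are reached by different numbers of outcomes.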

A big part of statistics/probability initially is learning how to count. → Oh yes, I remember this; it wasn’t from MIT6.042 but rather from Permutations and Combinations

One super cool thing that I also learned in MIT6.042J is the Baseline Fallacy

Definitions

Random Experiment: An experiment whose outcome cannot be known in advance.
Sample Space: The set of all possible outcomes of an experiment.
Event: Any subset of a sample space.
Probability
Relative Frequency: $P(A)$ = the long-term relative frequency of the event $A$
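
The relative-frequency idea can be illustrated with a small simulation. This is only a sketch: the event (rolling a six on a fair die), the helper name `relative_frequency`, and the fixed seed are all assumptions chosen for the example.

```python
import random

def relative_frequency(n_trials, seed=0):
    """Fraction of n_trials simulated fair-die rolls that show a six."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(1 for _ in range(n_trials) if rng.randint(1, 6) == 6)
    return hits / n_trials

# The relative frequency should settle near 1/6 as the number of trials grows.
for n in (100, 10_000, 100_000):
    print(n, relative_frequency(n))
```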

Problem-Solving Insights

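Splitting a problem into two mutually exclusive cases (using an OR) and adding their probabilities.

Ex: Probability that the first card is a King and the second card is red. The first term assumes the King is black (26 red cards remain among the other 51), the second assumes the King is red (25 red cards remain). $\left(\frac{2}{52} \cdot \frac{26}{51}\right) + \left(\frac{2}{52} \cdot \frac{25}{51}\right) = \frac{1}{26}$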
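
A case split like this can be sanity-checked by brute force. The sketch below assumes a standard 52-card deck (26 red cards, 2 red Kings and 2 black Kings) drawn without replacement, and compares a direct count over all ordered two-card draws with the two-case sum $\frac{2}{52} \cdot \frac{26}{51} + \frac{2}{52} \cdot \frac{25}{51}$.

```python
from fractions import Fraction
from itertools import permutations

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
red_suits = {"hearts", "diamonds"}
deck = [(rank, suit) for rank in ranks for suit in suits]

# All ordered draws of two distinct cards: 52 * 51 equally likely outcomes.
draws = list(permutations(deck, 2))
favourable = sum(
    1 for first, second in draws
    if first[0] == "K" and second[1] in red_suits
)
prob = Fraction(favourable, len(draws))

# Case split: black King first (26 reds left) OR red King first (25 reds left).
by_cases = Fraction(2, 52) * Fraction(26, 51) + Fraction(2, 52) * Fraction(25, 51)

print(prob, by_cases)  # 1/26 1/26
```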

🛠️ Steven Gong
