STAT206: Statistics for Software Engineering
Course that I took in 2A SE.
There is also this Statistics course recommended by Soham.
Link to course notes here.
Things I still haven’t mastered
I think in general, this just comes down to lots of practice and muscle memory. These are concepts that I really need more practice in:
- Goodness of Fit -> this is for what again?
- Linear Regression -> use both methods
- Independence of Attributes -> to check if attributes are independent
- I think the Goodness of Fit formula is used here!
- Hypothesis Testing
- Being able to pick the right Test Statistic is SUPER important, look at the formula sheet, there are 6 of them
- Wait, I don’t even know what a Test Statistic is?
- With Two Means?? I have no idea how to do this
- How many Degrees of Freedom to use for each test??
Chi-Squared Table comes in 2 cases:
- When you’re trying to estimate variance with a Confidence Interval
- goodness of fit
I need more practice with Chi squared.
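As a reminder of what Goodness of Fit is for: it checks whether observed counts are consistent with a hypothesised distribution. Below is a minimal sketch, assuming made-up die-roll counts and a fair-die null hypothesis. It uses the Pearson form of the statistic, which may differ from the one on the course formula sheet. The critical value 11.07 is the 0.95 quantile of the Chi-Squared Distribution with 5 degrees of freedom, read from a standard table.

```python
# Hypothetical data: counts of faces 1..6 in 60 die rolls (made up for illustration).
observed = [8, 12, 9, 11, 6, 14]
n = sum(observed)

# Null hypothesis: the die is fair, so every face has expected count n / 6.
expected = [n / 6] * len(observed)

# Pearson chi-squared goodness-of-fit statistic: sum of (o_i - e_i)^2 / e_i.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom: number of categories minus 1
# (no distribution parameters were estimated from the data here).
df = len(observed) - 1

# 0.95 quantile of chi-squared with 5 degrees of freedom, from a standard table.
CRITICAL_5_PERCENT_DF5 = 11.07

reject_fair_die = chi_sq > CRITICAL_5_PERCENT_DF5
print(f"chi_sq = {chi_sq:.2f}, df = {df}, reject fair die: {reject_fair_die}")
```

With these made-up counts the statistic is well below the critical value, so there is no evidence against a fair die at the 5% level.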
Review my solutions for Quiz 3 thoroughly, because I really didn’t do well on this one.
Use sum? CLT (38 minutes in) But for mean, use Standard Error for
Personal Notes:
- When you are trying to figure out the value for the z-table, always draw a Normal Distribution. It doesn’t hurt.
- If it’s just one-tailed, you can just use the direct value
- If it’s two-tailed (because the value lies between two inequalities), you can’t use the z-value directly; see Confidence Interval for a sample calculation
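The one-tailed versus two-tailed difference can be checked numerically. This is a small sketch using `statistics.NormalDist` from the Python standard library in place of a printed z-table. The 95% confidence level and the value 1.645 are illustrative choices, not numbers from the course.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal: mean 0, standard deviation 1

# One-tailed: P(Z <= 1.645) is read straight off the table.
one_tailed = Z.cdf(1.645)

# Two-tailed: P(-z <= Z <= z) = 0.95 leaves 0.025 in EACH tail,
# so the table lookup is for 0.975, not 0.95.
confidence = 0.95
z_two_tailed = Z.inv_cdf(1 - (1 - confidence) / 2)

print(f"P(Z <= 1.645) = {one_tailed:.4f}")
print(f"z for a 95% two-sided interval = {z_two_tailed:.3f}")
```

Drawing the Normal curve and shading both tails makes it obvious why the lookup probability is 0.975 rather than 0.95.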
Notes on things that I found hard: Week 9 Tutorial
- 1f: In a class of 120 Software Engineering students, what is the probability that the class average will be 76 or more?
- Ahh, this is where you use the Standard Deviation of the Mean
- Example 3: How did they get ??
- Correlation #todo #gap-in-knowledge I don’t understand this, check with the professor for the final file:///Users/stevengong/My%20Drive/Waterloo/2A/STAT206/Notes/Week%209.1%20-%20Discrete%20Joint%20Distributions%20-%20class%20notes.pdf.
- Practice Confidence Interval, know the steps really by heart
- Hypothesis Testing -> really do exercises and understand, unlike the first time you learned stats for Biology
- Linear Regression model, the methods
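For question 1f above, the Standard Deviation of the Mean step could look like the sketch below. The population mean and standard deviation are not recorded in these notes, so `mu` and `sigma` are placeholder values I made up. Only the class size of 120 and the threshold of 76 come from the question.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical population parameters: the tutorial's actual values are not
# recorded in these notes, so these two numbers are placeholders.
mu = 75.0     # assumed population mean mark
sigma = 8.0   # assumed population standard deviation

n = 120       # class size, from question 1f

# Standard deviation of the sample mean (standard error): sigma / sqrt(n).
standard_error = sigma / sqrt(n)

# By the Central Limit Theorem, the class average is approximately
# Normal with mean mu and standard deviation standard_error.
class_average = NormalDist(mu, standard_error)

# P(class average >= 76) is the upper tail beyond 76.
p_at_least_76 = 1 - class_average.cdf(76)
print(f"standard error = {standard_error:.4f}")
print(f"P(average >= 76) = {p_at_least_76:.4f}")
```

The key point is dividing by `sqrt(n)`: the average of 120 marks varies far less than a single mark does.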
Concepts
- Error
- Mean
- Standard Deviation
- Bias
- Variance
- Monty Hall Problem
- Random Number
- Probability
- Probability Rules -> from CS50 course
- Relative Frequency
- Bayesian Probability
- Counting Rules (Probability)
- Independence (Statistics)
- Mutual Exclusivity
- Baseline Fallacy
- Inclusion-Exclusion Principle
- Random Variable
- Sample vs. Population
- Probability Mass Function
- Distribution
- Cumulative Distribution Function
- Bernoulli Distribution
- Binomial Distribution
- Geometric Distribution
- Memorylessness
- Indicator Variable
- Moment-Generating Function
- Skewness
- Central Limit Theorem
- Discrete Joint Distribution
- Marginal Distribution
- Multinomial Distribution
- Data Analysis and Inference
- Histogram
- Statistical Modelling
- Estimation
- Likelihood Function
- Maximum Likelihood Estimation
- Relative Likelihood Function
- Interval Estimation
- Chi-Squared Distribution
- Hypergeometric Distribution
- Student’s t-Distribution
- Confidence Interval
- So I think the Chi-Squared Distribution and the Student’s t-Distribution really come into the picture when we start looking at getting Confidence Intervals
- Estimator
- Hypothesis Testing
- Goodness of Fit
- Contingency Table
- Independence of Attributes
- Linear Regression
The test for Independence of Attributes is also called a Chi-Squared test.
Motivation: Maybe the proportion of smokers is different among left-handed and right-handed people. Testing whether the proportions for L- and R-handed people are equal is the same as testing for the independence of attributes.
How do you calculate the expected frequency e_i?
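Under the hypothesis of independence, each expected frequency is (row total × column total) / n. Here is a minimal sketch with a made-up handedness-by-smoking Contingency Table. The counts are hypothetical, and the Pearson form of the statistic is an assumption about which formula the course uses.

```python
# Hypothetical 2x2 contingency table (counts made up for illustration).
# Rows: left-handed, right-handed. Columns: smokers, non-smokers.
observed = [
    [10, 40],
    [60, 290],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency under independence: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-squared statistic summed over every cell of the table.
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(row_totals) - 1) * (len(col_totals) - 1)

print("expected:", expected)
print(f"chi_sq = {chi_sq:.4f}, df = {df}")
```

The statistic is then compared against the Chi-Squared table with `df` degrees of freedom, the same way as for Goodness of Fit.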
Miscellaneous Ideas
Probability and Statistics is very powerful, but oftentimes you can be easily mistaken and misled by your intuition.
For example, if you throw two dice, you might think that the probability of getting a sum of 7 is the same as getting a 12, since each die is uniformly distributed, but no: there are more ways to roll a 7 than a 12.
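A quick way to convince yourself is to count the outcomes directly. This sketch enumerates all 36 equally likely (first die, second die) pairs and uses exact fractions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# Count how many outcomes produce each possible sum.
sum_counts = Counter(a + b for a, b in outcomes)

p_seven = Fraction(sum_counts[7], len(outcomes))
p_twelve = Fraction(sum_counts[12], len(outcomes))

print(f"P(sum = 7)  = {p_seven}")   # 1/6: six pairs, from (1,6) to (6,1)
print(f"P(sum = 12) = {p_twelve}")  # 1/36: only (6,6)
```

Each individual pair is uniform, but the sum is not, because different sums are produced by different numbers of pairs.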
A big part of statistics/probability initially is learning how to count. -> OH yes, I remember this, it wasn’t from MIT6.042 but rather from Permutations and Combinations
One super cool thing that I also learned in MIT6.042J is the Baseline Fallacy
Definitions
- Random Experiment: An experiment whose outcome cannot be predicted with certainty in advance.
- Sample Space: The set of all possible outcomes in an experiment.
- Event: Any subset of a sample space.
- Probability
- Relative Frequency:
Problem-Solving Insights
#todo write down the patterns that you have seen while solving these.
Splitting a problem into two parts (using an OR).
Ex: Probability that the first card is a King and the second card is Red. The first part assumes the King is not red; the second part assumes the King is red.
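The King-then-Red example can be worked out by splitting on the colour of the King and adding the two cases. This is a sketch assuming a standard 52-card deck drawn without replacement, with a brute-force count over all ordered two-card draws as a sanity check.

```python
from fractions import Fraction
from itertools import permutations

# Case 1: first card is a black King (2 of 52); all 26 red cards remain.
case_black_king = Fraction(2, 52) * Fraction(26, 51)

# Case 2: first card is a red King (2 of 52); only 25 red cards remain.
case_red_king = Fraction(2, 52) * Fraction(25, 51)

# The two cases are mutually exclusive, so the OR becomes a sum.
p_split = case_black_king + case_red_king

# Brute-force check: enumerate every ordered draw of two distinct cards.
KING = 13
RED_SUITS = {"hearts", "diamonds"}
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for rank in range(1, 14) for suit in suits]

draws = list(permutations(deck, 2))
favourable = sum(
    1 for first, second in draws
    if first[0] == KING and second[1] in RED_SUITS
)
p_brute = Fraction(favourable, len(draws))

print(f"split into cases: {p_split}")
print(f"brute force:      {p_brute}")
```

Both approaches give 1/26, which is a useful pattern: when a condition on the first draw changes the counts for the second draw, split on that condition.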