# Correlation

The **Correlation Coefficient** is given by
$$ρ_{X,Y}=\frac{Cov(X,Y)}{σ_{X}σ_{Y}}$$
Note that

- $∣ρ∣≈1$ is evidence of a strong (linear) relationship
- $∣ρ∣≈0$ is evidence of no linear relationship

Properties:

- $−1≤ρ_{X,Y}≤1$
- If $Y=a+bX$ with $b\neq 0$, then $ρ_{X,Y}=1$ (when $b>0$) or $ρ_{X,Y}=-1$ (when $b<0$)
- Unit independence: $ρ_{X,Y}$ is unchanged by a change of units (a linear rescaling of $X$ or $Y$)
- $X$ and $Y$ are independent $⟹$ $ρ_{X,Y}=0$
  - **However**, $ρ_{X,Y}=0$ does **not** imply that $X$ and $Y$ are independent. There may be no linear relationship, but some other relationship may still exist.
  - The same property holds for covariance: independence implies $Cov(X,Y)=0$, but $Cov(X,Y)=0$ does not imply independence
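As a sketch of the definition, $ρ$ can be computed from data and compared against NumPy's built-in `corrcoef`. The data here is a synthetic, illustrative assumption (a noisy linear relationship, so $|ρ|$ should be close to 1):

```python
import numpy as np

# Hypothetical data: Y is a noisy linear function of X (an assumption for
# illustration), so the sample correlation should be close to 1.
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 2.0 * x + rng.normal(scale=0.1, size=1000)

# rho = Cov(X, Y) / (sigma_X * sigma_Y), computed from the samples
cov_xy = np.cov(x, y, ddof=1)[0, 1]
rho = cov_xy / (x.std(ddof=1) * y.std(ddof=1))

print(rho)                                        # close to 1 for this data
print(np.isclose(rho, np.corrcoef(x, y)[0, 1]))  # agrees with NumPy's built-in
```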

### Correlation is not Causation

A word about correlation and independence. If variables are independent, they can vary separately. If you walk in an open field, you can move in one direction (east-west), in another direction (north-south), or any combination thereof. Independent variables are always also uncorrelated. Except in special cases, the reverse does not hold: variables can be uncorrelated but dependent. For example, consider $Y=X^2$ with $X$ uniform on $\{-1, 0, 1\}$. Correlation is a linear measurement, so $X$ and $Y$ are uncorrelated. However, $Y$ is dependent on $X$.

#### STAT206

I heard this and I said “You know this”, but I actually didn’t really know this.

My initial naive understanding is that just because two things are correlated doesn’t mean that they cause each other.

The other way to think about it is that just because two things are not correlated doesn’t mean that they don’t cause each other.

We demonstrate it below by showing an example where $ρ_{x,y}=0$, yet $X$ and $Y$ are dependent.

Wait… so what is causation? Consider $Y=X^2$. So far, you can’t even say that $Y$ causes $X$, or that $X$ causes $Y$, but you can say that these random variables are not independent.

#todo #gap-in-knowledge I don’t understand this, check with the professor for the final file:///Users/stevengong/My%20Drive/Waterloo/2A/STAT206/Notes/Week%209.1%20-%20Discrete%20Joint%20Distributions%20-%20class%20notes.pdf. Another way to think about it is from the idea that “given that random variables are independent, we find that $f(x,y)=f_{x}(x)f_{y}(y)$”. We don’t find that this is the case here in our joint table.

Look at the tables:

| $x$ | $f_X(x)$ |
| --- | --- |
| -1 | 1/3 |
| 0 | 1/3 |
| 1 | 1/3 |

| $y$ | $f_Y(y)$ |
| --- | --- |
| 0 | 1/3 |
| 1 | 2/3 |

| $xy$ | $f_{XY}(xy)$ |
| --- | --- |
| -1 | 1/3 |
| 0 | 1/3 |
| 1 | 1/3 |

How is the $XY$ table above generated? Go case by case through $X$. When $X=0$ (probability 1/3), $Y=0$, so $XY=0$. When $X=1$ (probability 1/3), $Y=1$, so $XY=1$. When $X=-1$ (probability 1/3), $Y=1$, so $XY=-1$. In fact $XY = X \cdot X^2 = X^3 = X$, which is why the $XY$ table matches the $X$ table.
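The covariance can be checked exactly from the joint distribution. A minimal sketch using exact fractions, with the joint cells read off the tables above ($Y=X^2$, $X$ uniform on $\{-1,0,1\}$):

```python
from fractions import Fraction as F

# Joint distribution of (X, Y) with Y = X^2 and X uniform on {-1, 0, 1}.
# Each value of X has probability 1/3, matching the tables above.
joint = {(-1, 1): F(1, 3), (0, 0): F(1, 3), (1, 1): F(1, 3)}

E_X  = sum(p * x for (x, y), p in joint.items())
E_Y  = sum(p * y for (x, y), p in joint.items())
E_XY = sum(p * x * y for (x, y), p in joint.items())

cov = E_XY - E_X * E_Y
print(E_XY, E_X, E_Y, cov)   # Cov = 0, so rho = 0, yet Y = X^2 depends on X

# Independence would require f(x, y) = f_X(x) f_Y(y) in every cell;
# the cell (0, 0) already fails: f(0, 0) = 1/3 but f_X(0) f_Y(0) = 1/9.
print(joint[(0, 0)] == F(1, 3) * F(1, 3))   # False
```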

You see that

$$\begin{aligned}
Cov(X,Y) &= E(XY) - E(X)E(Y) \\[3pt]
&= 0 - 0 \cdot E(Y) \\[3pt]
&= 0
\end{aligned}$$

so $ρ_{X,Y}=0$, even though $Y=X^2$ is completely determined by $X$.

### Example

Suppose $f(x, y) = kq^{x+y-2}p^2$, where $q = 1 - p$, $x = 1, 2, \dots$ and $y = 1, 2, \dots$

1) Find $k$.

We know that $\sum_{x=1}^\infty \sum_{y=1}^\infty f(x,y) = 1$. The double sum factors into two geometric series, each summing to $\frac{1}{p}$, so the total is $kp^2 \cdot \frac{1}{p} \cdot \frac{1}{p} = k$, and therefore $k = 1$.

2) Find $f_X(x)$.

Similar idea; we sum out $y$:

$$f_X(x) = \sum_{y=1}^\infty f(x,y) = p^2 q^{x-1} \sum_{y=1}^\infty q^{y-1} = pq^{x-1}$$

### Related

- [[notes/Cross-Correlation|Cross-Correlation]]
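A quick numeric sanity check of the example above. This is a sketch under two assumptions of my own: $p = 0.4$ is an arbitrary choice, and the infinite sums are truncated at $N = 200$, which makes the tail negligible:

```python
# Check that f(x, y) = q^(x+y-2) p^2 sums to 1 with k = 1.
# p = 0.4 is an arbitrary choice; N = 200 truncates the infinite sums.
p = 0.4
q = 1 - p
N = 200

total = sum(q ** (x + y - 2) * p ** 2
            for x in range(1, N + 1) for y in range(1, N + 1))
print(total)   # approximately 1, confirming k = 1

# Marginal: f_X(x) = sum_y q^(x+y-2) p^2 should equal p * q^(x-1) (geometric).
x = 3
f_X = sum(q ** (x + y - 2) * p ** 2 for y in range(1, N + 1))
print(abs(f_X - p * q ** (x - 1)) < 1e-12)   # True
```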