Statistical Significance

A result is statistically significant when it would be unlikely to arise from chance alone. This is quantified by the p-value: the probability, assuming the null hypothesis is true, of observing a result at least as extreme as the one actually seen.
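As a concrete illustration of that definition, here is a minimal sketch (not from the original text) that computes a one-sided p-value for a coin-flip experiment using the exact binomial distribution: the probability, under the null of a fair coin, of seeing a result at least as extreme as the observed count of heads.

```python
import math

def binom_p_value(k, n, p=0.5):
    """One-sided p-value: the probability of observing k or more
    successes in n trials, assuming each success has probability p
    (the null hypothesis -- here, a fair coin)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))

# Observing 60 heads in 100 flips of a supposedly fair coin:
p = binom_p_value(60, 100)
print(f"p-value = {p:.4f}")  # about 0.028 -- "significant" at the 0.05 level
```

Note that this number says nothing about how likely the coin is to be biased; it only says how surprising 60 heads would be if the coin were fair.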

Why be cautious about "significant"?

Significance gives a sharable convention for “this isn’t just noise”, but the convention is routinely overread as proof of importance or truth.

Conventional thresholds:

  • p < 0.05: "significant" (5% chance of a false positive under the null)
  • p < 0.01: "highly significant"
  • p < 0.001: "very highly significant"
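The conventional labels above amount to a simple threshold ladder, which can be sketched as a small helper function (the function name and structure are illustrative, not from the original text):

```python
def significance_label(p):
    """Map a p-value to the conventional verbal label.
    Thresholds checked from strictest to loosest."""
    if p < 0.001:
        return "very highly significant"
    if p < 0.01:
        return "highly significant"
    if p < 0.05:
        return "significant"
    return "not significant"

print(significance_label(0.03))   # "significant"
print(significance_label(0.2))    # "not significant"
```

Keep in mind these cutoffs are conventions, not laws of nature: a p-value of 0.049 and one of 0.051 describe nearly identical evidence.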

What significance does NOT mean

  • Not the probability the hypothesis is true
  • Not the probability the result is real
  • Does not measure effect size; a tiny effect can be highly significant given a huge sample
  • Does not mean the result will replicate

A medication lowering blood pressure by 0.2 mmHg in a 100,000-person trial can be “highly significant” yet clinically worthless. Always ask how big the effect is, not just whether it is nonzero.
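The arithmetic behind that example can be checked directly. Assuming a between-patient standard deviation of 10 mmHg (a hypothetical figure chosen for illustration; the original gives only the effect and sample sizes), a two-arm z-test on the 0.2 mmHg difference looks like this:

```python
import math

def two_sided_p_from_z(z):
    """Two-sided p-value for a z statistic under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical trial: true drop of 0.2 mmHg, SD 10 mmHg, 100,000 per arm.
effect, sd, n = 0.2, 10.0, 100_000
se = sd * math.sqrt(2 / n)          # standard error of the difference in means
z = effect / se
p = two_sided_p_from_z(z)
print(f"z = {z:.2f}, p = {p:.1e}")  # tiny p despite a clinically trivial effect
```

With these assumed numbers, z comes out near 4.5 and p well below 0.001: "very highly significant", yet the effect itself remains 0.2 mmHg. Significance scales with sample size; importance does not.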

p-hacking

Run enough tests and some will hit by chance. This is the Texas Sharpshooter Fallacy applied to statistics: drawing the target around the bullets after the fact.
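The "run enough tests and some will hit" claim is easy to demonstrate by simulation. The sketch below (illustrative; seed and group sizes are arbitrary) runs 100 "studies" in which the null hypothesis is true by construction, both groups drawn from the same distribution, and counts how many nonetheless clear p < 0.05:

```python
import math
import random

random.seed(0)

def z_test_p(a, b):
    """Two-sided z-test p-value for a difference in means
    (Welch-style standard error; fine for these sample sizes)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    z = (ma - mb) / math.sqrt(va / na + vb / nb)
    return math.erfc(abs(z) / math.sqrt(2))

# 100 studies where there is genuinely no effect: both groups come
# from the same normal distribution.
hits = 0
for _ in range(100):
    a = [random.gauss(0, 1) for _ in range(50)]
    b = [random.gauss(0, 1) for _ in range(50)]
    if z_test_p(a, b) < 0.05:
        hits += 1
print(f"{hits} of 100 null studies were 'significant' at p < 0.05")
```

On average about 5 of the 100 null studies come up "significant", which is exactly what the 5% false-positive rate promises. Reporting only those hits, and hiding the other ~95 tests, is the sharpshooter's target drawn after the shots.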