












🧪 Ronald Fisher: The p‑value Rebel
Philosophy: Evidence, not decisions
Fisher believed statistics should help scientists measure evidence against a null hypothesis.
Key ideas
- The null hypothesis is a straw man you try to poke holes in.
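- The p‑value measures how incompatible the data are with the null: it is the probability, assuming the null is true, of seeing data at least as extreme as what was observed.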
- You never “accept” the null — you simply fail to find evidence against it.
- No fixed rules. No rigid thresholds.
- Statistics is about learning, not making yes/no decisions.
Fisher’s vibe:
The scientist as a detective, gathering clues and weighing evidence.
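To make the p‑value idea concrete, here is a minimal sketch of an exact one‑sided binomial p‑value. The coin‑toss setup, the data (15 heads in 20 tosses), and the function name `binomial_p_value` are all assumptions made up for illustration, not something from Fisher's own examples.

```python
from math import comb

def binomial_p_value(successes: int, trials: int, p_null: float = 0.5) -> float:
    """One-sided exact p-value: P(X >= successes) for X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Hypothetical data: 15 heads in 20 tosses; the null says the coin is fair.
p = binomial_p_value(15, 20)
print(f"p-value = {p:.4f}")  # p-value = 0.0207
```

In the Fisherian reading, 0.0207 is simply reported and weighed as evidence against the null. No threshold is consulted and no decision is forced.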
⚔️ Jerzy Neyman & Egon Pearson: The Decision‑Theorists
Philosophy: Decisions, not evidence
Neyman and Pearson believed statistics should guide actions, not beliefs.
Key ideas
- You must choose between two hypotheses: the null (H₀) and the alternative (H₁).
- You control long‑run error rates:
  - Type I error (false positive), with rate α
  - Type II error (false negative), with rate β
- You choose a fixed α (significance level) before collecting data.
- You define critical regions and make binary decisions:
  - Reject the null
  - Fail to reject the null
Neyman–Pearson vibe:
The statistician as a quality‑control officer: strict rules, long‑run guarantees, no interpretation.
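The Neyman–Pearson recipe can be sketched the same way: fix α first, derive the critical region from it, then read off both error rates. The coin‑toss setup, the specific alternative (80% heads), and the helper names `binom_tail` and `critical_value` are illustrative assumptions.

```python
from math import comb

def binom_tail(k_min: int, n: int, p: float) -> float:
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))

def critical_value(n: int, p_null: float, alpha: float) -> int:
    """Smallest c such that rejecting when X >= c keeps the Type I rate <= alpha."""
    for c in range(n + 2):  # c = n + 1 gives an empty region, so the loop always returns
        if binom_tail(c, n, p_null) <= alpha:
            return c

# All numbers below are hypothetical and fixed BEFORE any data are collected.
n, alpha = 20, 0.05
p_null, p_alt = 0.5, 0.8

c = critical_value(n, p_null, alpha)      # critical region: reject if heads >= c
type_1 = binom_tail(c, n, p_null)         # actual Type I error rate (<= alpha)
type_2 = 1 - binom_tail(c, n, p_alt)      # Type II error rate against p_alt
print(f"reject if heads >= {c}; Type I = {type_1:.4f}; Type II = {type_2:.4f}")
```

Notice that no data appear anywhere in this sketch. The rule and its long‑run guarantees are settled in advance, and the experiment only determines which side of the line you land on.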
💥 Why They Fought
Fisher thought Neyman–Pearson’s approach was:
- too rigid
- too mechanical
- too focused on long‑run behavior instead of the actual experiment
- philosophically misguided
Neyman thought Fisher’s p‑values were:
- subjective
- inconsistent
- lacking clear decision rules
- too dependent on interpretation
They respected each other’s brilliance but fundamentally disagreed on what statistics is for.
🧩 The Modern Mess: We Use Both
Here’s the twist:
Modern hypothesis testing is a Frankenstein hybrid of both philosophies.
We use:
- Fisher’s p‑values
- Neyman–Pearson’s α levels, Type I/II errors, and decision rules
But these ideas don’t actually fit together perfectly.
Example of the hybrid:
- Compute a p‑value (Fisher)
- Compare it to a pre‑chosen α (Neyman–Pearson)
- Make a binary decision (Neyman–Pearson)
- Then interpret the p‑value as “strength of evidence” (Fisher)
It’s like mixing oil and water — but we’ve been doing it for 80 years.
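A minimal sketch of that four‑step hybrid, run on two hypothetical coin‑toss experiments. The data, the α of 0.05, and the function names are assumptions for illustration.

```python
from math import comb

ALPHA = 0.05  # chosen before looking at the data (Neyman-Pearson)

def p_value(heads: int, tosses: int) -> float:
    """One-sided exact p-value under a fair-coin null: P(X >= heads)."""
    return sum(comb(tosses, k) for k in range(heads, tosses + 1)) / 2**tosses

def hybrid_test(heads: int, tosses: int):
    p = p_value(heads, tosses)                               # step 1: p-value (Fisher)
    decision = "reject" if p <= ALPHA else "fail to reject"  # steps 2-3: threshold + decision (N-P)
    return p, decision                                       # step 4: p is also reported as "evidence"

# Two hypothetical experiments with 20 tosses each.
p_a, decision_a = hybrid_test(15, 20)
p_b, decision_b = hybrid_test(19, 20)
print(f"15/20 heads: p = {p_a:.5f} -> {decision_a}")  # 15/20 heads: p = 0.02069 -> reject
print(f"19/20 heads: p = {p_b:.5f} -> {decision_b}")  # 19/20 heads: p = 0.00002 -> reject
```

Both experiments produce the same decision, yet the p‑values differ by roughly a factor of a thousand. A strict Neyman–Pearson reading treats the two results identically; a Fisherian reading does not. The hybrid asks you to hold both views at once.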
🎨 A Simple Analogy
Fisher = a thermometer
Tells you how hot the evidence is.
Neyman–Pearson = a fire alarm
If the temperature crosses a threshold, the alarm goes off.
We now use the thermometer to decide when to trigger the fire alarm.
Fisher would roll his eyes. Neyman would sigh.
📚 A Quick Side‑by‑Side
| Feature | Fisher | Neyman–Pearson |
|---|---|---|
| Purpose | Measure evidence | Make decisions |
| Key tool | p‑value | Critical values, α, β |
| Hypotheses | Only the null (H₀) | Null (H₀) and alternative (H₁) |
| Interpretation | Continuous evidence | Binary decision |
| Accept the null? | Never | Yes (implicitly) |
| Error control | None | Type I & II errors |
| Philosophy | Inductive inference | Long‑run frequency control |
🎯 Why This Matters for Students
Understanding the Fisher–Neyman battle helps students see:
- why hypothesis testing feels contradictory
- why p‑values are often misinterpreted
- why “statistical significance” (a binary decision) is not the same as strength of evidence
- why modern statistics is moving toward effect sizes, confidence intervals, and Bayesian methods
It’s not that students are confused — it’s that the system itself is a compromise between two incompatible worldviews.