
The Fisher vs. Neyman Battle

🧪 Ronald Fisher: The p‑value Rebel

Philosophy: Evidence, not decisions

Fisher believed statistics should help scientists measure evidence against a null hypothesis.

Key ideas

  • The null hypothesis is a straw man you try to poke holes in.
  • The p‑value measures how incompatible the data are with the null (see the sketch after this list).
  • You never “accept” the null — you simply fail to find evidence against it.
  • No fixed rules. No rigid thresholds.
  • Statistics is about learning, not making yes/no decisions.
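
To make the p‑value‑as‑evidence idea concrete, here's a minimal Python sketch (the data and the one‑sample t‑test are illustrative assumptions, not Fisher's own example):

```python
# Fisher-style reporting (illustrative data): compute a p-value and read it
# as a continuous measure of evidence -- no fixed threshold, no decision.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
sample = rng.normal(loc=0.3, scale=1.0, size=30)  # hypothetical measurements

# H0: the population mean is 0. Fisher would simply report the p-value.
result = stats.ttest_1samp(sample, popmean=0.0)
print(f"t = {result.statistic:.3f}, p = {result.pvalue:.4f}")
# Fisher's reading: the smaller the p-value, the stronger the evidence
# against H0 -- but no magic cutoff "decides" anything.
```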

Fisher’s vibe:

The scientist as a detective, gathering clues and weighing evidence.

⚔️ Jerzy Neyman & Egon Pearson: The Decision‑Theorists

Philosophy: Decisions, not evidence

Neyman and Pearson believed statistics should guide actions, not beliefs.

Key ideas

  • You must choose between two hypotheses:
    H_0 vs. H_1
  • You control long‑run error rates:
    • Type I error (false positive): rejecting H_0 when it is true
    • Type II error (false negative): failing to reject H_0 when it is false
  • You choose a fixed α (significance level) before collecting data.
  • You define critical regions and make binary decisions (see the sketch after this list):
    • Reject H_0
    • Fail to reject H_0
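
Here's a minimal sketch of that decision‑rule mindset (again with made‑up data; the two‑sided one‑sample t‑test is just an illustrative choice):

```python
# Neyman–Pearson-style testing (illustrative): fix alpha BEFORE seeing the
# data, define the critical region, then make a binary decision.
import numpy as np
from scipy import stats

alpha = 0.05                      # significance level, chosen in advance
rng = np.random.default_rng(7)
sample = rng.normal(loc=0.4, scale=1.0, size=30)  # hypothetical data

n = len(sample)
t_stat = (sample.mean() - 0.0) / (sample.std(ddof=1) / np.sqrt(n))

# Two-sided critical region under H0: |t| > t_crit
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)

if abs(t_stat) > t_crit:
    print(f"|t| = {abs(t_stat):.3f} > {t_crit:.3f}: reject H0")
else:
    print(f"|t| = {abs(t_stat):.3f} <= {t_crit:.3f}: fail to reject H0")
# Note: no p-value is reported. The output is a binary decision with a
# guaranteed long-run Type I error rate of alpha.
```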

Neyman–Pearson vibe:

The statistician as a quality‑control officer: strict rules, long‑run guarantees, no interpretation.

💥 Why They Fought

Fisher thought Neyman–Pearson’s approach was:

  • too rigid
  • too mechanical
  • too focused on long‑run behavior instead of the actual experiment
  • philosophically misguided

Neyman thought Fisher’s p‑values were:

  • subjective
  • inconsistent
  • lacking clear decision rules
  • too dependent on interpretation

They respected each other’s brilliance but fundamentally disagreed on what statistics is for.

🧩 The Modern Mess: We Use Both

Here’s the twist:
Modern hypothesis testing is a Frankenstein hybrid of both philosophies.

We use:

  • Fisher’s p‑values
  • Neyman–Pearson’s α levels, Type I/II errors, and decision rules

But these ideas don’t actually fit together perfectly.


Example of the hybrid:

  • Compute a p‑value (Fisher)
  • Compare it to a pre‑chosen α (Neyman–Pearson)
  • Make a binary decision (Neyman–Pearson)
  • Then interpret the p‑value as “strength of evidence” (Fisher)

It’s like mixing oil and water — but we’ve been doing it for 80 years.
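
In code, the splice looks something like this (a minimal sketch with hypothetical data):

```python
# The modern hybrid (illustrative): Fisher's p-value feeding a
# Neyman–Pearson decision rule, then reinterpreted as "evidence".
import numpy as np
from scipy import stats

alpha = 0.05                                       # Neyman–Pearson: pre-chosen
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.5, scale=1.0, size=25)   # hypothetical data

p = stats.ttest_1samp(sample, popmean=0.0).pvalue  # Fisher: compute a p-value
decision = "reject H0" if p < alpha else "fail to reject H0"  # N-P: binary rule

print(f"p = {p:.4f} -> {decision}")
# ...and then, in the discussion section, p gets read as a graded strength
# of evidence (Fisher again) -- the two philosophies spliced together.
```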

🎨 A Simple Analogy

Fisher = a thermometer

Tells you how hot the evidence is.

Neyman–Pearson = a fire alarm

If the temperature crosses a threshold, the alarm goes off.

We now use the thermometer to decide when to trigger the fire alarm.
Fisher would roll his eyes. Neyman would sigh.

📚 A Quick Side‑by‑Side

| Feature | Fisher | Neyman–Pearson |
|---|---|---|
| Purpose | Measure evidence | Make decisions |
| Key tool | p‑value | Critical values, α, β |
| Hypotheses | Only H_0 | H_0 and H_1 |
| Interpretation | Continuous evidence | Binary decision |
| Accept H_0? | Never | Yes (implicitly) |
| Error control | None | Type I & II errors |
| Philosophy | Inductive inference | Long‑run frequency control |

🎯 Why This Matters for Students

Understanding the Fisher–Neyman battle helps students see:

  • why hypothesis testing feels contradictory
  • why p‑values are often misinterpreted
  • why “statistical significance” is not the same as evidence
  • why modern statistics is moving toward effect sizes, confidence intervals, and Bayesian methods

It’s not that students are confused — it’s that the system itself is a compromise between two incompatible worldviews.
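
For instance, here's a minimal sketch (with made‑up data) of reporting an effect size and a confidence interval instead of a bare reject/fail‑to‑reject:

```python
# Reporting "how big?" and "how precise?" rather than only "is it
# significant?" (illustrative data; Cohen's d is one common effect size).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.normal(loc=0.5, scale=1.0, size=40)   # hypothetical data

n = len(sample)
mean, sd = sample.mean(), sample.std(ddof=1)

cohens_d = mean / sd                               # standardized effect size
t_crit = stats.t.ppf(0.975, df=n - 1)              # multiplier for a 95% CI
half_width = t_crit * sd / np.sqrt(n)

print(f"Cohen's d = {cohens_d:.2f}")
print(f"95% CI for the mean: [{mean - half_width:.3f}, {mean + half_width:.3f}]")
# These summaries sidestep part of the Fisher/Neyman tension: they convey
# magnitude and uncertainty instead of forcing a yes/no verdict.
```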

