Relationship Between MAE, MSE, RMSE

📐 Definitions (for clarity)

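Let the errors be e_i = y_i - \hat{y}_i for i = 1, \dots, n.

Mean Absolute Error (MAE)
\text{MAE} = \frac{1}{n}\sum_{i=1}^{n} |e_i|

Mean Squared Error (MSE)
\text{MSE} = \frac{1}{n}\sum_{i=1}^{n} e_i^2

Root Mean Squared Error (RMSE)
\text{RMSE} = \sqrt{\text{MSE}}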
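As a concrete reference, the three definitions can be sketched in plain Python. The function names and the sample values below are illustrative assumptions, not part of any particular library:

```python
import math

def mae(errors):
    """Mean absolute error of residuals e_i = y_i - y_hat_i."""
    return sum(abs(e) for e in errors) / len(errors)

def mse(errors):
    """Mean squared error of the same residuals."""
    return sum(e * e for e in errors) / len(errors)

def rmse(errors):
    """Root mean squared error: the square root of MSE."""
    return math.sqrt(mse(errors))

# Made-up targets and predictions, purely for illustration.
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]
errors = [t - p for t, p in zip(y_true, y_pred)]  # [0.5, 0.0, -1.5, -1.0]

print("MAE :", mae(errors))   # 0.75
print("MSE :", mse(errors))   # 0.875
print("RMSE:", rmse(errors))  # about 0.935
```

Note that RMSE is in the same units as MAE, while MSE is in squared units, which is why the comparisons that follow pair MSE with (MAE)² and RMSE with MAE.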

📐 Relationship Between (\text{MAE})^2 and \text{MSE}

Key Relationship

Jensen’s inequality gives:

(\text{MAE})^2 \le \text{MSE}

Why?

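Because the square function is convex, Jensen’s inequality says that the square of an average is at most the average of the squares. Applied to the absolute errors |e_i|:
\left( \frac{1}{n}\sum |e_i| \right)^2 \le \frac{1}{n}\sum |e_i|^2 = \text{MSE}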

This means: MSE is always ≥ (MAE)²
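The same inequality can also be read off from a variance identity, which makes the size of the gap explicit. The following short derivation uses only the definitions of MAE and MSE, treating the absolute errors |e_i| as the data:

```latex
\text{MSE} - (\text{MAE})^2
  = \frac{1}{n}\sum_{i=1}^{n} |e_i|^2 - \left( \frac{1}{n}\sum_{i=1}^{n} |e_i| \right)^2
  = \frac{1}{n}\sum_{i=1}^{n} \left( |e_i| - \text{MAE} \right)^2
  \ge 0
```

The last expression is the (population) variance of the absolute errors, so the gap between MSE and (MAE)² grows with the spread of the error magnitudes.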

When are they equal?

(\text{MAE})^2 = \text{MSE}

only when all errors have the same magnitude, i.e.:

|e_1| = |e_2| = \dots = |e_n|

This almost never happens in real data.
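A small numeric check can make the equality condition tangible. The sketch below uses two made-up error vectors with the same MAE: one where every error has magnitude 2, and one where a single error is larger. The helper names are illustrative.

```python
def mae(errors):
    return sum(abs(e) for e in errors) / len(errors)

def mse(errors):
    return sum(e * e for e in errors) / len(errors)

equal_magnitude = [2.0, -2.0, 2.0, -2.0]  # |e_i| = 2 for every i
mixed_magnitude = [1.0, -1.0, 1.0, -5.0]  # same MAE, one larger error

for name, errs in [("equal", equal_magnitude), ("mixed", mixed_magnitude)]:
    print(name, "MAE^2 =", mae(errs) ** 2, "MSE =", mse(errs))
# equal MAE^2 = 4.0 MSE = 4.0
# mixed MAE^2 = 4.0 MSE = 7.0
```

Both vectors have MAE = 2, but only the equal-magnitude one reaches MSE = (MAE)². Signs do not matter, only magnitudes.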

Intuition

MSE penalizes large errors more because of the square.
MAE treats all errors linearly.

So:

  • If your error distribution has outliers,
    \text{MSE} \gg (\text{MAE})^2
  • If your errors all have similar magnitudes (whether large or small),
    \text{MSE} \approx (\text{MAE})^2

A useful ratio

A common diagnostic is:
\frac{\text{MSE}}{(\text{MAE})^2}

  • ≈ 1 → error magnitudes are nearly equal
  • ≫ 1 → heavy-tailed errors or outliers
  • < 1 → impossible (violates Jensen)
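To see how this ratio reacts to a single large error, here is a minimal sketch. The helper name `mse_to_mae_sq_ratio` and the synthetic data are assumptions made for illustration. The ratio is undefined when every error is zero, so the sketch raises in that case.

```python
def mse_to_mae_sq_ratio(errors):
    """Return MSE / MAE^2 for a sequence of residuals (hypothetical helper)."""
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    if mae == 0:
        raise ValueError("ratio is undefined when every error is zero")
    return mse / mae ** 2

# Nine errors of size 1 plus one error whose size we vary (synthetic data).
baseline = [1.0] * 9
for outlier in (1.0, 5.0, 20.0):
    ratio = mse_to_mae_sq_ratio(baseline + [outlier])
    print(f"outlier={outlier:>4}: MSE / MAE^2 = {ratio:.2f}")
```

With these inputs the ratio starts at exactly 1 when the tenth error matches the rest and climbs to roughly 1.7 and then 4.9 as that one error grows to 5 and 20.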

Relationship Between MAE² and MSE (recap)

From Jensen’s inequality:

(\text{MAE})^2 \le \text{MSE}

Equality only when all errors have the same magnitude.

Now Add RMSE Into the Picture

Since:

\text{RMSE} = \sqrt{\text{MSE}}

we immediately get:

\text{RMSE} \ge \text{MAE}

because:

\sqrt{\text{MSE}} \ge \sqrt{(\text{MAE})^2} = \text{MAE}

This is a strict inequality unless all errors have identical magnitude.
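The inequality holds for any error vector, which is easy to spot-check empirically. The sketch below draws random error vectors with a fixed seed and counts how often RMSE falls below MAE. The sampling ranges are arbitrary choices, and the small tolerance is only there to absorb floating-point rounding.

```python
import math
import random

def mae(errors):
    return sum(abs(e) for e in errors) / len(errors)

def rmse(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))

rng = random.Random(0)  # fixed seed so the run is repeatable
violations = 0
for _ in range(1000):
    n = rng.randint(1, 50)
    errors = [rng.uniform(-10.0, 10.0) for _ in range(n)]
    # Tolerance covers rounding in the n == 1 case, where RMSE equals MAE.
    if rmse(errors) < mae(errors) - 1e-12:
        violations += 1

print("violations:", violations)  # expected: 0
```

A check like this is not a proof, but it is a cheap sanity test for a metrics implementation: any violation would point to a bug in how MAE or RMSE is computed.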

Summary of All Relationships

1. MSE vs MAE²
\text{MSE} \ge (\text{MAE})^2

2. RMSE vs MAE
\text{RMSE} \ge \text{MAE}

3. RMSE² vs MAE²
\text{RMSE}^2 = \text{MSE} \ge (\text{MAE})^2

4. Combined chain
\text{RMSE} \ge \text{MAE} \ge 0

Intuition

RMSE penalizes large errors more strongly than MAE because of the square.

  • If your error distribution has outliers, RMSE will be much larger than MAE.
  • If your errors all have similar magnitudes, RMSE ≈ MAE.

A useful diagnostic ratio

\frac{\text{RMSE}}{\text{MAE}}

Interpretation:

  • ≈ 1 → error magnitudes are nearly equal, no large outliers
  • > 1.2 → moderate spread in error magnitudes (for reference, zero-mean Gaussian errors give \sqrt{\pi/2} \approx 1.25)
  • > 2 → heavy-tailed errors or significant outliers

This ratio is a quick way to gauge the shape of the error distribution during model evaluation. Treat the thresholds above as rules of thumb rather than formal cutoffs.
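As a rough illustration of how the ratio behaves, the sketch below computes RMSE/MAE for simulated zero-mean Gaussian errors and for the same errors with a few large values injected. The helper name `rmse_over_mae`, the seed, the sample size, and the injected outliers are all assumptions chosen for the demonstration.

```python
import math
import random

def rmse_over_mae(errors):
    """Return RMSE / MAE for a sequence of residuals (hypothetical helper)."""
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    if mae == 0:
        raise ValueError("ratio is undefined when every error is zero")
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return rmse / mae

rng = random.Random(42)  # fixed seed for repeatability
gaussian_errors = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
# Inject 100 large errors (0.1% of the data) to mimic outliers.
with_outliers = gaussian_errors + [50.0] * 100

print("Gaussian reference sqrt(pi/2):", round(math.sqrt(math.pi / 2), 3))
print("simulated Gaussian errors:    ", round(rmse_over_mae(gaussian_errors), 3))
print("with injected outliers:       ", round(rmse_over_mae(with_outliers), 3))
```

The simulated Gaussian ratio should land close to 1.25, while a tiny fraction of large errors is enough to push it above 2. Comparing an observed ratio against the Gaussian value is one simple way to judge whether errors are heavier-tailed than normal.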
