
Understanding the relationship between univariate and multivariate normal distributions can provide a clearer framework for remembering the definition and properties of the multivariate normal distribution.
First, it is easier to understand and remember the key properties of the multivariate normal distribution by understanding the Mahalanobis distance. So, to start, recall that the Mahalanobis distance is a measure of the distance between a point and a distribution. It is particularly useful for multivariate data because it takes into account the correlations between variables. Unlike the Euclidean distance, which treats all dimensions equally, the Mahalanobis distance accounts for the variability and correlation in the data, providing a more meaningful distance metric in many cases.
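For a point $\mathbf{x}$ and a multivariate normal distribution with mean vector $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$, the Mahalanobis distance $D_M(\mathbf{x})$ is defined as:
$$D_M(\mathbf{x}) = \sqrt{(\mathbf{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})}$$
Now, note that $(\mathbf{x} - \boldsymbol{\mu})$ centers the point $\mathbf{x}$ around the mean vector $\boldsymbol{\mu}$. Next, the term $\boldsymbol{\Sigma}^{-1}$ adjusts for the covariance structure of the data, scaling distances based on the variability and correlations among the variables. So, the term $(\mathbf{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})$ is a quadratic form, providing a scalar value that represents the squared distance from $\mathbf{x}$ to the mean $\boldsymbol{\mu}$ in units of standard deviations. So:
- If $D_M(\mathbf{x})$ is small, the point $\mathbf{x}$ is close to the mean $\boldsymbol{\mu}$ relative to the covariance structure of the distribution.
- If $D_M(\mathbf{x})$ is large, the point $\mathbf{x}$ is far from the mean $\boldsymbol{\mu}$ in terms of the distribution’s variability.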
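To make the contrast with the Euclidean distance concrete, here is a minimal sketch in plain Python for the two-dimensional case. The helper name `mahalanobis_2d` and the example numbers are illustrative assumptions, not part of the original post. The covariance matrix is diagonal, with variance 4 along the first axis and 1 along the second.

```python
import math

def mahalanobis_2d(x, mu, cov):
    """Mahalanobis distance of a 2-D point x from mean mu under covariance cov.

    Hypothetical helper for illustration; cov is a 2x2 nested list.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Inverse of a 2x2 matrix: (1/det) * [[d, -b], [-c, a]]
    inv = [[d / det, -b / det], [-c / det, a / det]]
    # Centre the point around the mean.
    dx = [x[0] - mu[0], x[1] - mu[1]]
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu).
    q = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
         + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return math.sqrt(q)

mu = [0.0, 0.0]
cov = [[4.0, 0.0], [0.0, 1.0]]  # std dev 2 along axis 0, std dev 1 along axis 1

p_wide = [2.0, 0.0]    # two units along the high-variance axis
p_narrow = [0.0, 2.0]  # two units along the low-variance axis

d_wide = mahalanobis_2d(p_wide, mu, cov)      # one standard deviation away
d_narrow = mahalanobis_2d(p_narrow, mu, cov)  # two standard deviations away

print(math.dist(p_wide, mu), math.dist(p_narrow, mu))  # Euclidean: identical
print(d_wide, d_narrow)                                # Mahalanobis: different
```

Both points are equally far from the mean in the Euclidean sense, but the point along the low-variance axis has twice the Mahalanobis distance, because a deviation of 2 is more unusual in that direction.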
Univariate Normal Distribution
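The univariate normal distribution is defined by two parameters: the mean $\mu$ and the variance $\sigma^2$. Its probability density function (pdf) is given by:
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$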
Multivariate Normal Distribution
The multivariate normal distribution generalizes the univariate normal distribution to multiple dimensions. It is defined by a mean vector $\boldsymbol{\mu}$ and a covariance matrix $\boldsymbol{\Sigma}$.
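The density function of the multivariate normal distribution can be expressed in terms of the Mahalanobis distance. The Mahalanobis distance between a point $\mathbf{x}$ and the mean vector $\boldsymbol{\mu}$ with respect to the covariance matrix $\boldsymbol{\Sigma}$ is defined as:
$$D_M(\mathbf{x}) = \sqrt{(\mathbf{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})}$$
Using the Mahalanobis distance, the density function of the $k$-dimensional multivariate normal distribution is:
$$f(\mathbf{x}) = \frac{1}{(2\pi)^{k/2} |\boldsymbol{\Sigma}|^{1/2}} \exp\left(-\frac{1}{2} D_M(\mathbf{x})^2\right)$$
In other words, the probability density function for a $k$-dimensional multivariate normal distribution is:
$$f(\mathbf{x}) = \frac{1}{(2\pi)^{k/2} |\boldsymbol{\Sigma}|^{1/2}} \exp\left(-\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})\right)$$
and we can see that the pdf of the multivariate normal is an extension of the univariate normal pdf, where $\frac{(x - \mu)^2}{\sigma^2}$ generalizes to $(\mathbf{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})$.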
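One way to check this correspondence numerically is the sketch below, written in plain Python for the two-dimensional case. The function names `normal_pdf` and `mvn_pdf_2d` and the test values are assumptions made for illustration. With a diagonal covariance matrix the two components are independent, so the bivariate density should equal the product of two univariate densities.

```python
import math

def normal_pdf(x, mu, var):
    """Univariate normal density with mean mu and variance var."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)

def mvn_pdf_2d(x, mu, cov):
    """Bivariate normal density written via the squared Mahalanobis distance.

    Hypothetical helper for illustration; cov is a 2x2 nested list.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx0, dx1 = x[0] - mu[0], x[1] - mu[1]
    # Squared Mahalanobis distance, using the explicit 2x2 inverse
    # (1/det) * [[d, -b], [-c, a]].
    d2 = (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det
    k = 2  # dimension
    norm = (2 * math.pi) ** (k / 2) * math.sqrt(det)
    return math.exp(-0.5 * d2) / norm

mu = [0.0, 0.0]
cov = [[4.0, 0.0], [0.0, 1.0]]  # diagonal: the two components are independent
x = [1.0, -0.5]

joint = mvn_pdf_2d(x, mu, cov)
product = normal_pdf(x[0], mu[0], cov[0][0]) * normal_pdf(x[1], mu[1], cov[1][1])

print(math.isclose(joint, product))  # the two densities agree
```

The only difference between the two functions is the one described above: the scalar term $(x - \mu)^2 / \sigma^2$ is replaced by the squared Mahalanobis distance, and $\sqrt{2\pi\sigma^2}$ by $(2\pi)^{k/2} |\boldsymbol{\Sigma}|^{1/2}$.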
Key properties of the multivariate normal distribution include:
- Symmetry and unimodality: like the univariate normal, the multivariate normal distribution is symmetric around its mean vector $\boldsymbol{\mu}$ and has a single mode there.
- Marginal distributions: Any subset of the multivariate normal variables also has a multivariate normal distribution.
- Conditional distributions: Conditional distributions of subsets of variables given others are also multivariate normal.
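The last two properties can be written out explicitly. As a sketch, assume the vector is split into two blocks, $\mathbf{x} = (\mathbf{x}_1, \mathbf{x}_2)$, with the mean vector and covariance matrix partitioned to match. The block notation below is introduced here for illustration and does not appear in the original post.

$$\boldsymbol{\mu} = \begin{pmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{pmatrix}, \qquad \boldsymbol{\Sigma} = \begin{pmatrix} \boldsymbol{\Sigma}_{11} & \boldsymbol{\Sigma}_{12} \\ \boldsymbol{\Sigma}_{21} & \boldsymbol{\Sigma}_{22} \end{pmatrix}$$

$$\mathbf{x}_1 \sim \mathcal{N}(\boldsymbol{\mu}_1, \boldsymbol{\Sigma}_{11})$$

$$\mathbf{x}_1 \mid \mathbf{x}_2 = \mathbf{a} \;\sim\; \mathcal{N}\left(\boldsymbol{\mu}_1 + \boldsymbol{\Sigma}_{12} \boldsymbol{\Sigma}_{22}^{-1} (\mathbf{a} - \boldsymbol{\mu}_2),\; \boldsymbol{\Sigma}_{11} - \boldsymbol{\Sigma}_{12} \boldsymbol{\Sigma}_{22}^{-1} \boldsymbol{\Sigma}_{21}\right)$$

The marginal simply reads off the matching blocks of $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$, while conditioning shifts the mean in proportion to how far $\mathbf{a}$ lies from $\boldsymbol{\mu}_2$. The conditional covariance does not depend on the observed value $\mathbf{a}$.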