Vector Derivatives

1. Scalar Function of a Vector

Description:

You have a scalar function f(\mathbf{x}), where \mathbf{x} \in \mathbb{R}^n. The derivative is called the gradient, and it’s a column vector.

Formula:

\nabla_{\mathbf{x}} f = \frac{\partial f}{\partial \mathbf{x}} = \begin{bmatrix}\frac{\partial f}{\partial x_1} \\\frac{\partial f}{\partial x_2} \\\vdots \\\frac{\partial f}{\partial x_n}\end{bmatrix}

Example:

Let

f(\mathbf{x}) = \mathbf{x}^\top \mathbf{x} = \sum_{i=1}^n x_i^2

This is just the dot product of a vector with itself, so it gives a scalar.

Let’s say \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, so:

f(\mathbf{x}) = x_1^2 + x_2^2

Now we compute the gradient \nabla_{\mathbf{x}} f = \frac{\partial f}{\partial \mathbf{x}}, which is a vector of partial derivatives:


Step-by-step partials:

  • \frac{\partial f}{\partial x_1} = \frac{\partial}{\partial x_1} (x_1^2 + x_2^2) = 2x_1
  • \frac{\partial f}{\partial x_2} = \frac{\partial}{\partial x_2} (x_1^2 + x_2^2) = 2x_2

Now collect into a column vector:

\frac{\partial f}{\partial \mathbf{x}} =\begin{bmatrix}\frac{\partial f}{\partial x_1} \\\frac{\partial f}{\partial x_2}\end{bmatrix}=\begin{bmatrix}2x_1 \\2x_2\end{bmatrix}= 2\mathbf{x}

So we’ve explicitly calculated that the gradient of f(\mathbf{x}) = \mathbf{x}^\top \mathbf{x} is 2\mathbf{x}.
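
As a quick sanity check, here is a minimal NumPy sketch (my own illustration, not part of the original derivation) that compares the analytic gradient 2\mathbf{x} with a central finite-difference approximation:

import numpy as np

def f(x):
    # f(x) = x^T x
    return x @ x

x = np.array([1.0, 2.0])
grad_analytic = 2 * x          # the gradient we just derived

# Central finite differences as an independent check
eps = 1e-6
grad_numeric = np.array([
    (f(x + eps * e) - f(x - eps * e)) / (2 * eps)
    for e in np.eye(len(x))
])

print(grad_analytic)   # [2. 4.]
print(grad_numeric)    # approximately [2. 4.]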


2. Vector Function of a Vector (Jacobian)

Description:

You have a vector-valued function \mathbf{f}(\mathbf{x}) with \mathbf{f} : \mathbb{R}^n \to \mathbb{R}^m. Its derivative is the Jacobian matrix, which contains all partial derivatives \frac{\partial f_i}{\partial x_j}.

Formula:

\frac{\partial \mathbf{f}}{\partial \mathbf{x}} =\begin{bmatrix}\frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_n} \\\vdots & \ddots & \vdots \\\frac{\partial f_m}{\partial x_1} & \cdots & \frac{\partial f_m}{\partial x_n}\end{bmatrix}

Example:

Let
\mathbf{f}(\mathbf{x}) = \begin{bmatrix} x_1^2 \\ \sin(x_2) \\ x_1 x_2 \end{bmatrix}, \quad \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}

Then the Jacobian is:
\frac{\partial \mathbf{f}}{\partial \mathbf{x}} = \begin{bmatrix}2x_1 & 0 \\0 & \cos(x_2) \\x_2 & x_1\end{bmatrix}
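
The same finite-difference idea checks a Jacobian column by column, since column j holds \partial \mathbf{f} / \partial x_j. Below is a short illustrative NumPy sketch (the function names are my own, not from the post):

import numpy as np

def f(x):
    x1, x2 = x
    return np.array([x1 ** 2, np.sin(x2), x1 * x2])

def jacobian(x):
    x1, x2 = x
    return np.array([
        [2 * x1, 0.0],
        [0.0, np.cos(x2)],
        [x2, x1],
    ])

x = np.array([1.0, 0.5])

# Column j of the numerical Jacobian is the central difference along x_j
eps = 1e-6
J_numeric = np.column_stack([
    (f(x + eps * e) - f(x - eps * e)) / (2 * eps)
    for e in np.eye(len(x))
])

print(jacobian(x))
print(J_numeric)   # agrees to roughly 6 decimal places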


3. Matrix Formulas in Vector Calculus

Example:

Let’s say you have:
f(\mathbf{x}) = \mathbf{x}^\top A \mathbf{x}
where A is a constant symmetric matrix.

Then:
\frac{\partial f}{\partial \mathbf{x}} = 2A\mathbf{x}

(For a general, not necessarily symmetric A, the gradient is (A + A^\top)\mathbf{x}, which reduces to 2A\mathbf{x} when A = A^\top.)

If A = \begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix}, \quad \mathbf{x} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, then:

\frac{\partial f}{\partial \mathbf{x}} = 2 \begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix}= 2 \begin{bmatrix} 3(1) + 1(2) \\ 1(1) + 2(2) \end{bmatrix}= 2 \begin{bmatrix} 5 \\ 5 \end{bmatrix}= \begin{bmatrix} 10 \\ 10 \end{bmatrix}
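
The same numbers fall out of a few lines of NumPy; the sketch below is again my own illustration, with a finite-difference check of the 2A\mathbf{x} formula:

import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])        # constant symmetric matrix
x = np.array([1.0, 2.0])

def f(x):
    # f(x) = x^T A x
    return x @ A @ x

grad_analytic = 2 * A @ x         # [10. 10.]

# Independent check with central finite differences
eps = 1e-6
grad_numeric = np.array([
    (f(x + eps * e) - f(x - eps * e)) / (2 * eps)
    for e in np.eye(2)
])

print(grad_analytic)   # [10. 10.]
print(grad_numeric)    # approximately [10. 10.]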


Linear Regression: Vector Derivative Example

Goal:

Given data \mathbf{X} \in \mathbb{R}^{m \times n} (input features) and target \mathbf{y} \in \mathbb{R}^m, find weights \mathbf{w} \in \mathbb{R}^n that minimize:

\boxed{f(\mathbf{w}) = \frac{1}{2} \|\mathbf{Xw} - \mathbf{y}\|^2}

This is the least squares loss function.

Step 1: Expand the function

f(\mathbf{w}) = \frac{1}{2} (\mathbf{Xw} - \mathbf{y})^\top (\mathbf{Xw} - \mathbf{y})

Step 2: Take gradient with respect to \mathbf{w}

Use the identity:
\nabla_{\mathbf{w}} \left( \frac{1}{2} \|A\mathbf{w} - \mathbf{b}\|^2 \right) = A^\top (A\mathbf{w} - \mathbf{b})

So:

\boxed{\nabla_{\mathbf{w}} f(\mathbf{w}) = \mathbf{X}^\top (\mathbf{Xw} - \mathbf{y})}

This gives the gradient (vector of partials) with respect to the weights \mathbf{w}.
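
In code, the loss and its gradient are one line each. The sketch below (an illustration on synthetic data, assuming NumPy; none of the variable names come from the post) also verifies \mathbf{X}^\top (\mathbf{Xw} - \mathbf{y}) against finite differences:

import numpy as np

rng = np.random.default_rng(0)
m, n = 20, 3
X = rng.normal(size=(m, n))      # synthetic feature matrix
y = rng.normal(size=m)           # synthetic targets
w = rng.normal(size=n)           # an arbitrary weight vector

def loss(w):
    r = X @ w - y
    return 0.5 * r @ r           # (1/2) * ||Xw - y||^2

grad_analytic = X.T @ (X @ w - y)

# Finite-difference check of the gradient formula
eps = 1e-6
grad_numeric = np.array([
    (loss(w + eps * e) - loss(w - eps * e)) / (2 * eps)
    for e in np.eye(n)
])

print(np.max(np.abs(grad_analytic - grad_numeric)))   # tiny, on the order of 1e-8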

Step 3: Solve for optimal weights (closed-form solution)

Set gradient to zero:

\mathbf{X}^\top (\mathbf{Xw} - \mathbf{y}) = 0

\Rightarrow \mathbf{X}^\top \mathbf{Xw} = \mathbf{X}^\top \mathbf{y}

\Rightarrow \boxed{\mathbf{w}^* = (\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{y}} \quad \text{(normal equation)}
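
The normal equation assumes \mathbf{X}^\top \mathbf{X} is invertible. As a final illustrative sketch (again my own, on synthetic data), it can be solved with np.linalg.solve, which is preferable to forming an explicit inverse, and cross-checked against NumPy’s built-in least-squares solver:

import numpy as np

rng = np.random.default_rng(0)
m, n = 20, 3
X = rng.normal(size=(m, n))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.01 * rng.normal(size=m)   # targets with a little noise

# Normal equation: solve (X^T X) w = X^T y, avoiding an explicit inverse
w_star = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check with NumPy's least-squares solver
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(w_star)
print(w_lstsq)    # essentially identical, both close to [1, -2, 0.5]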

