Hessian matrix

Dec. 17, 2024

The Hessian matrix describes the local curvature of a multivariate function1:

In mathematics, the Hessian matrix, Hessian or (less commonly) Hesse matrix is a square matrix of second-order partial derivatives of a scalar-valued function, or scalar field. It describes the local curvature of a function of many variables. The Hessian matrix was developed in the 19th century by the German mathematician Ludwig Otto Hesse and later named after him. Hesse originally used the term “functional determinants”. The Hessian is sometimes denoted by $\mathrm{\boldsymbol{H}}$ or, ambiguously, by $\nabla^2$.

and it is formally defined as follows1:

Suppose $f$: $\mathbb{R}^n\rightarrow\mathbb{R}$ is a function taking as input a vector $\mathrm{\boldsymbol{x}}\in\mathbb{R}^n$ and outputting a scalar $f(\mathrm{\boldsymbol{x}})\in\mathbb{R}$. If all second-order partial derivatives of $f$ exist, then the Hessian matrix $\mathrm{\boldsymbol{H}}$ of $f$ is a square $n\times n$ matrix, usually defined and arranged as:

\[\mathrm{\boldsymbol{H}}_f=\begin{bmatrix} \dfrac{\partial^2f}{\partial x_1^2} & \dfrac{\partial^2f}{\partial x_1\partial x_2} & \cdots & \dfrac{\partial^2f}{\partial x_1\partial x_n}\\ \dfrac{\partial^2f}{\partial x_2\partial x_1} & \dfrac{\partial^2f}{\partial x_2^2} & \cdots & \dfrac{\partial^2f}{\partial x_2\partial x_n}\\ \vdots & \vdots & \ddots & \vdots\\ \dfrac{\partial^2f}{\partial x_n\partial x_1} & \dfrac{\partial^2f}{\partial x_n\partial x_2} & \cdots & \dfrac{\partial^2f}{\partial x_n^2}\\ \end{bmatrix}\notag\]

That is, the entry of the $i$-th row and the $j$-th column is:

\[(\mathrm{\boldsymbol{H}}_f)_{i,j}=\dfrac{\partial^2f}{\partial x_i\partial x_j}\notag\]

Here are some properties of it1:

(1) If furthermore the second partial derivatives are all continuous, the Hessian matrix is a symmetric matrix by the symmetry of second derivatives.

(2) The determinant of the Hessian matrix is called the Hessian determinant.

(3) The Hessian matrix of a function $f$ is the transpose of the Jacobian matrix of the gradient of the function $f$:

\[\mathrm{\boldsymbol{H}}(f(\mathrm{\boldsymbol{x}}))=\mathrm{\boldsymbol{J}}(\nabla f(\mathrm{\boldsymbol{x}}))^T\notag\]

References