Hessian matrix

Dec. 17, 2024 • Updated Apr. 09, 2025

The Hessian matrix describes the local curvature of a multivariate function [1]:

In mathematics, the Hessian matrix, Hessian or (less commonly) Hesse matrix is a square matrix of second-order partial derivatives of a scalar-valued function, or scalar field. It describes the local curvature of a function of many variables. The Hessian matrix was developed in the 19th century by the German mathematician Ludwig Otto Hesse and later named after him. Hesse originally used the term "functional determinants". The Hessian is sometimes denoted by $\mathrm{\boldsymbol{H}}$ or, ambiguously, by $\nabla^2$.

and it is formally defined as follows [1]:

Suppose $f:\mathbb{R}^n\rightarrow\mathbb{R}$ is a function taking as input a vector $\mathrm{\boldsymbol{x}}\in\mathbb{R}^n$ and outputting a scalar $f(\mathrm{\boldsymbol{x}})\in\mathbb{R}$. If all second-order partial derivatives of $f$ exist, then the Hessian matrix $\mathrm{\boldsymbol{H}}$ of $f$ is a square $n\times n$ matrix, usually defined and arranged as:

\[\mathrm{\boldsymbol{H}}_f=\begin{bmatrix} \dfrac{\partial^2f}{\partial x_1^2} & \dfrac{\partial^2f}{\partial x_1\partial x_2} & \cdots & \dfrac{\partial^2f}{\partial x_1\partial x_n}\\ \dfrac{\partial^2f}{\partial x_2\partial x_1} & \dfrac{\partial^2f}{\partial x_2^2} & \cdots & \dfrac{\partial^2f}{\partial x_2\partial x_n}\\ \vdots & \vdots & \ddots & \vdots\\ \dfrac{\partial^2f}{\partial x_n\partial x_1} & \dfrac{\partial^2f}{\partial x_n\partial x_2} & \cdots & \dfrac{\partial^2f}{\partial x_n^2}\\ \end{bmatrix}\notag\]

That is, the entry in the $i$-th row and the $j$-th column is:

\[(\mathrm{\boldsymbol{H}}_f)_{i,j}=\dfrac{\partial^2f}{\partial x_i\partial x_j}\notag\]
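
For concreteness, here is a minimal SymPy sketch that builds the Hessian of a small example function entry by entry, following the definition above. The function $f(x_1,x_2)=x_1^2x_2+x_2^3$ and the variable names are illustrative choices, not taken from the references.

```python
# A minimal sketch using SymPy; the example function f and the variable
# names are illustrative assumptions, not taken from the article.
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = x1**2 * x2 + x2**3            # example scalar field f: R^2 -> R

# Build the Hessian entry by entry, following the definition above:
# (H_f)_{i,j} = d^2 f / (dx_i dx_j)
variables = (x1, x2)
H_manual = sp.Matrix(2, 2, lambda i, j: sp.diff(f, variables[i], variables[j]))

# SymPy also ships a ready-made helper that does the same thing.
H_builtin = sp.hessian(f, variables)

print(H_manual)                   # Matrix([[2*x2, 2*x1], [2*x1, 6*x2]])
assert H_manual == H_builtin
```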

Here are some of its properties [1]:

(1) If furthermore the second partial derivatives are all continuous, the Hessian matrix is a symmetric matrix by the symmetry of second derivatives.

(2) The determinant of the Hessian matrix is called the Hessian determinant.

(3) The Hessian matrix of a function $f$ is the transpose of the Jacobian matrix of the gradient of $f$:

\[\mathrm{\boldsymbol{H}}(f(\mathrm{\boldsymbol{x}}))=\mathrm{\boldsymbol{J}}(\nabla f(\mathrm{\boldsymbol{x}}))^T\notag\]

Note: the transpose is not strictly necessary here, because when the second partial derivatives are continuous the Hessian matrix is symmetric (property (1)), so $\mathrm{\boldsymbol{H}}(f(\mathrm{\boldsymbol{x}}))=\mathrm{\boldsymbol{J}}(\nabla f(\mathrm{\boldsymbol{x}}))$ holds as well. See [2].
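
Properties (1) and (3) can be checked quickly with SymPy on the same illustrative polynomial as above; this is a sanity check under the stated assumptions, not a proof.

```python
# A minimal sketch checking properties (1) and (3) on the same illustrative
# polynomial as above; SymPy differentiates symbolically, so this is a
# sanity check, not a general proof.
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = x1**2 * x2 + x2**3

H = sp.hessian(f, (x1, x2))

# Gradient of f as a column vector, then the Jacobian of that vector field.
grad_f = sp.Matrix([sp.diff(f, x1), sp.diff(f, x2)])
J_of_grad = grad_f.jacobian([x1, x2])

assert H == H.T                   # property (1): the Hessian is symmetric
assert H == J_of_grad.T           # property (3): H(f) = J(grad f)^T
```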

References