Higher-order partial derivatives extend the idea of repeated differentiation to functions of several variables. They reveal how rates of change themselves change, which is essential for classifying critical points, building Taylor approximations, and understanding curvature in multivariable settings.
## Second-Order Partial Derivatives

### Pure and Mixed Partial Derivatives
A second-order partial derivative is what you get when you differentiate a first-order partial derivative one more time. There are two flavors: pure and mixed.
Pure partial derivatives differentiate with respect to the same variable twice. For a function $$f(x, y)$$:
- $$\frac{\partial^2 f}{\partial x^2} = f_{xx}$$ means: take $$\frac{\partial f}{\partial x}$$, then differentiate that result with respect to $$x$$ again.
- $$\frac{\partial^2 f}{\partial y^2} = f_{yy}$$ does the same thing but with $$y$$.

Mixed partial derivatives differentiate with respect to different variables in sequence. The notation can be tricky, so pay attention to the order:
- $$\frac{\partial^2 f}{\partial y \, \partial x} = f_{xy}$$ means: first differentiate with respect to $$x$$, then differentiate that result with respect to $$y$$. (Read the denominator right to left.)
- $$\frac{\partial^2 f}{\partial x \, \partial y} = f_{yx}$$ means: first differentiate with respect to $$y$$, then with respect to $$x$$.
Worked example. Let $$f(x, y) = x^3 y^2$$.

Compute the first partials:
- $$f_x = \frac{\partial f}{\partial x} = 3x^2 y^2$$
- $$f_y = \frac{\partial f}{\partial y} = 2x^3 y$$

Pure second-order partials:
- $$f_{xx} = \frac{\partial}{\partial x}\left(3x^2 y^2\right) = 6xy^2$$
- $$f_{yy} = \frac{\partial}{\partial y}\left(2x^3 y\right) = 2x^3$$

Mixed second-order partials:
- $$f_{xy} = \frac{\partial}{\partial y}\left(3x^2 y^2\right) = 6x^2 y$$
- $$f_{yx} = \frac{\partial}{\partial x}\left(2x^3 y\right) = 6x^2 y$$
Notice the two mixed partials are equal here. That's not a coincidence; it's guaranteed by Schwarz's theorem whenever the mixed partials are continuous (more on this below).
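Second-order partials like these are easy to verify symbolically. A minimal sketch using sympy, with the illustrative function $$f = x^3 y^2$$ (any smooth function works the same way):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**3 * y**2  # illustrative choice of smooth function

# Pure second-order partials: differentiate twice in the same variable
f_xx = sp.diff(f, x, x)  # 6*x*y**2
f_yy = sp.diff(f, y, y)  # 2*x**3

# Mixed partials: note the order of differentiation
f_xy = sp.diff(sp.diff(f, x), y)  # first x, then y
f_yx = sp.diff(sp.diff(f, y), x)  # first y, then x

print(f_xx, f_yy, f_xy, f_yx)

# Schwarz's theorem: for this smooth f the mixed partials agree
assert sp.simplify(f_xy - f_yx) == 0
```

Swapping in any other smooth expression for `f` lets you check the symmetry of mixed partials experimentally before relying on it in hand computations.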

## Advanced Topics in Higher-Order Partial Derivatives

### Hessian Matrix

The Hessian matrix packages all second-order partial derivatives of a function into a single matrix. For $$f: \mathbb{R}^n \to \mathbb{R}$$, it's the $$n \times n$$ matrix:

$$H(f) = \begin{bmatrix} \frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\ \frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2} & \cdots & \frac{\partial^2 f}{\partial x_2 \partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_n \partial x_1} & \frac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_n^2} \end{bmatrix}$$

When Schwarz's theorem applies, $$H(f)$$ is symmetric ($$H_{ij} = H_{ji}$$), which is the typical case you'll encounter.

The Hessian is the multivariable analogue of the second derivative in single-variable calculus. You use it to classify critical points:
- If $$H$$ is **positive definite** at a critical point, you have a local minimum.
- If $$H$$ is **negative definite**, you have a local maximum.
- If $$H$$ is **indefinite** (eigenvalues of mixed sign), you have a saddle point.

For functions of two variables, this reduces to the **second derivative test** you may already know: compute $$D = f_{xx} f_{yy} - (f_{xy})^2$$ at the critical point. If $$D > 0$$ and $$f_{xx} > 0$$, it's a minimum; if $$D > 0$$ and $$f_{xx} < 0$$, it's a maximum; if $$D < 0$$, it's a saddle point.

### Taylor Series Expansion

The multivariable Taylor expansion approximates a function near a point using its partial derivatives at that point. For $$f(x, y)$$ expanded around $$(a, b)$$, with continuous partial derivatives up to order $$n$$:

$$f(x, y) \approx \sum_{i=0}^{n} \sum_{j=0}^{n-i} \frac{1}{i!\, j!} \frac{\partial^{i+j} f}{\partial x^i \, \partial y^j}(a, b)\,(x-a)^i(y-b)^j$$

Writing out the first few terms explicitly makes this more concrete. The second-order expansion is:

$$f(x,y) \approx f(a,b) + f_x(a,b)(x-a) + f_y(a,b)(y-b) + \frac{1}{2}\bigl[f_{xx}(a,b)(x-a)^2 + 2f_{xy}(a,b)(x-a)(y-b) + f_{yy}(a,b)(y-b)^2\bigr]$$

The quadratic terms here correspond exactly to the Hessian.
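As a sanity check, the second-order expansion can be built and compared against the true function. A minimal sympy sketch, with the illustrative choice $$f = e^x \cos y$$ expanded around $$(0, 0)$$:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = sp.exp(x) * sp.cos(y)  # illustrative smooth function
a0, b0 = 0, 0              # expansion point (a, b) = (0, 0)
at = {x: a0, y: b0}

# Second-order Taylor expansion: constant + linear + quadratic (Hessian) terms
T2 = (f.subs(at)
      + sp.diff(f, x).subs(at) * (x - a0)
      + sp.diff(f, y).subs(at) * (y - b0)
      + sp.Rational(1, 2) * (sp.diff(f, x, x).subs(at) * (x - a0)**2
                             + 2 * sp.diff(f, x, y).subs(at) * (x - a0) * (y - b0)
                             + sp.diff(f, y, y).subs(at) * (y - b0)**2))

print(sp.expand(T2))  # 1 + x + x**2/2 - y**2/2

# Close to (0, 0) the approximation error is small (it shrinks with
# the third power of the step size)
err = abs((f - T2).subs({x: 0.1, y: 0.1}))
print(err)
```

Tightening the evaluation point toward the origin shows the cubic decay of the error, which is exactly what the truncated third-order terms predict.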
This is why higher-order partials matter for approximation: the first-order terms give you the tangent plane, and the second-order terms capture the curvature.

### Schwarz's Theorem

**Schwarz's theorem** (also called Clairaut's theorem or the symmetry of mixed partials) gives the condition under which you can swap the order of differentiation.

**Statement:** If $$\frac{\partial^2 f}{\partial x \, \partial y}$$ and $$\frac{\partial^2 f}{\partial y \, \partial x}$$ are both continuous on an open set containing $$(a, b)$$, then

$$\frac{\partial^2 f}{\partial x \, \partial y}(a, b) = \frac{\partial^2 f}{\partial y \, \partial x}(a, b)$$

Why this matters in practice:
- It cuts your work roughly in half when computing the Hessian, since you only need to compute each mixed partial once.
- It guarantees the Hessian is symmetric, which is what makes the eigenvalue-based classification of critical points work cleanly.
- It extends to higher orders as well: for a function with continuous third-order partials, $$f_{xxy} = f_{xyx} = f_{yxx}$$, and so on.

For nearly every function you'll encounter in this course, the continuity condition is satisfied. But be aware that pathological examples do exist where mixed partials are not equal at a point, precisely because continuity fails there.
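Putting the pieces together, the two-variable second derivative test (which relies on $$f_{xy} = f_{yx}$$) can be automated. A minimal sympy sketch; the functions passed in are illustrative, and the helper assumes a single isolated critical point for simplicity:

```python
import sympy as sp

x, y = sp.symbols('x y')

def classify(f):
    """Classify a critical point of f(x, y) using
    D = f_xx * f_yy - f_xy**2 (the second derivative test)."""
    # Solve grad f = 0; take the first critical point found
    crit = sp.solve([sp.diff(f, x), sp.diff(f, y)], [x, y], dict=True)[0]
    f_xx = sp.diff(f, x, x).subs(crit)
    f_yy = sp.diff(f, y, y).subs(crit)
    f_xy = sp.diff(f, x, y).subs(crit)  # = f_yx by Schwarz's theorem
    D = f_xx * f_yy - f_xy**2
    if D > 0:
        return 'minimum' if f_xx > 0 else 'maximum'
    return 'saddle' if D < 0 else 'inconclusive'

print(classify(x**2 + y**2))   # D = 4 > 0, f_xx = 2 > 0  -> minimum
print(classify(x**2 - y**2))   # D = -4 < 0               -> saddle
print(classify(-x**2 - y**2))  # D = 4 > 0, f_xx = -2 < 0 -> maximum
```

The `D == 0` branch returns `'inconclusive'` because the test genuinely gives no information there; higher-order terms would be needed.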