Partial derivatives are the foundation of Calculus IV. They extend differentiation to functions with multiple inputs. Whenever you analyze a surface, optimize a multivariable function, or study how a physical system responds to changes, you're relying on partial derivatives. The rules here connect directly to gradient vectors, directional derivatives, optimization, and vector field analysis.
You'll be tested on more than computation. Examiners want to see that you understand when to apply each rule and why it works. The chain rule for partials drives related rates problems. Clairaut's theorem saves you time on mixed partials. The gradient ties everything together for optimization. Knowing what concept each rule captures, and when to reach for it, matters just as much as cranking through algebra.
Foundational Concepts and Notation
Before applying any rules, you need a solid understanding of what partial derivatives actually measure and how to communicate them precisely. A partial derivative isolates the rate of change in one direction while treating all other variables as constants.
Definition of Partial Derivatives
The formal definition mirrors the single-variable limit definition. For a function f(x,y):
$f_x(a,b) = \lim_{h \to 0} \dfrac{f(a+h,\,b) - f(a,b)}{h}$
Measures single-variable change: a partial derivative tells you how f changes as one variable varies while all others stay fixed.
Enables independent analysis of each variable's contribution to the function's behavior.
Notation $\frac{\partial f}{\partial x}$ uses the curly $\partial$ to distinguish partial differentiation from total (ordinary) differentiation.
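The limit definition translates directly into a numerical check. Below is a minimal sketch, where the function $f(x,y) = x^2 y$ and the point $(2, 3)$ are chosen purely for illustration: the exact partial is $f_x = 2xy = 12$ there.

```python
# Approximate f_x(a, b) from the limit definition, with y frozen at b.
# Example function f(x, y) = x**2 * y, chosen for illustration.

def f(x, y):
    return x**2 * y

def partial_x(f, a, b, h=1e-6):
    # Forward difference: f_x(a, b) ~ (f(a + h, b) - f(a, b)) / h as h -> 0.
    # Note that only the first argument moves; y stays fixed at b.
    return (f(a + h, b) - f(a, b)) / h

approx = partial_x(f, 2.0, 3.0)   # exact value is 2 * 2 * 3 = 12
```

Shrinking `h` further eventually hurts accuracy because of floating-point roundoff, which is why symbolic differentiation is preferred when available.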
Partial Derivative Notation
First-order partials use $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$, or subscript notation like $f_x$, $f_y$.
Mixed second derivatives written as $\frac{\partial^2 f}{\partial x\,\partial y}$ mean: differentiate first with respect to $y$, then with respect to $x$. Read right to left in Leibniz notation.
Higher-order notation extends naturally: $\frac{\partial^3 f}{\partial x^2\,\partial y}$ means differentiate once with respect to $y$, then twice with respect to $x$.
Compare: $\frac{\partial f}{\partial x}$ vs. $\frac{df}{dx}$. Both use Leibniz-style notation, but the curly $\partial$ signals a multivariable context where other variables are held constant. If a problem uses $\partial$, you're in partial derivative territory.
Differentiation Rules for Partial Derivatives
The computational rules from single-variable calculus (product rule, quotient rule, chain rule) all extend to partial derivatives with one key modification: treat every variable except the one you're differentiating with respect to as a constant.
Partial Derivatives of Multivariable Functions
Here's the process:
Identify which variable you're differentiating with respect to.
Treat every other variable as a constant, then apply the usual single-variable rules (product, quotient, chain).
For compositions such as $z = f(x(s,t),\, y(s,t))$, the multivariable chain rule adds one term per intermediate variable: $\frac{\partial z}{\partial s} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s}$, and similarly for $t$.
Tree diagrams are genuinely useful here. Draw $z$ at the top, connect it to $x$ and $y$, then connect those to $s$ and $t$. Each path from $z$ down to your target variable contributes one term (multiply along branches, add across paths).
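The two-path tree can be verified numerically. This sketch uses example functions chosen for illustration, $z = x^2 y$ with $x = st$ and $y = s + t$, and compares the chain-rule answer against a direct finite difference:

```python
# Verify dz/ds = f_x * dx/ds + f_y * dy/ds (one term per path in the tree)
# for z = x**2 * y with x = s*t and y = s + t (illustrative choices).

def z_direct(s, t):
    x, y = s * t, s + t
    return x**2 * y

def dz_ds_chain(s, t):
    x, y = s * t, s + t
    f_x, f_y = 2 * x * y, x**2        # partials of f(x, y) = x**2 * y
    dx_ds, dy_ds = t, 1.0             # partials of the inner functions
    return f_x * dx_ds + f_y * dy_ds  # multiply along branches, add across paths

s0, t0, h = 1.5, 2.0, 1e-6
numeric = (z_direct(s0 + h, t0) - z_direct(s0, t0)) / h
chain = dz_ds_chain(s0, t0)           # exact value here is 51
```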
Implicit Differentiation for Multivariable Functions
When a relationship is given as F(x,y,z)=0 and you can't easily solve for z, implicit differentiation gives you the partial derivatives directly.
Why this works: Differentiate $F(x,y,z)=0$ with respect to $x$ using the chain rule (treating $z$ as a function of $x$ and $y$), then solve for $\frac{\partial z}{\partial x}$. This produces the formula $\frac{\partial z}{\partial x} = -\frac{F_x}{F_z}$.
Compare: Standard partial differentiation vs. implicit differentiation. Both find partial derivatives, but implicit differentiation works when you can't isolate the dependent variable. Use implicit when you see equations like $x^2+y^2+z^2=1$ rather than $z=\sqrt{1-x^2-y^2}$.
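On the sphere, we happen to be able to solve for $z$ on the upper hemisphere, which makes it a good sanity check for the implicit formula $\frac{\partial z}{\partial x} = -\frac{F_x}{F_z} = -\frac{x}{z}$. A quick numerical sketch (point chosen for illustration):

```python
# Check dz/dx = -F_x / F_z for F(x, y, z) = x**2 + y**2 + z**2 - 1 = 0.
import math

def z_explicit(x, y):
    # Upper hemisphere, where z *can* be isolated -- used only to verify.
    return math.sqrt(1.0 - x**2 - y**2)

x0, y0 = 0.3, 0.4
z0 = z_explicit(x0, y0)

# Implicit route: F_x = 2x and F_z = 2z, so dz/dx = -x/z.
implicit = -x0 / z0

# Direct route: finite difference on the explicit formula.
h = 1e-6
numeric = (z_explicit(x0 + h, y0) - z_explicit(x0, y0)) / h
```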
Symmetry and Higher-Order Derivatives
When you take multiple partial derivatives, the order can matter. Clairaut's theorem tells you exactly when you can swap the order, which is most of the time for functions you'll encounter in this course.
Clairaut's Theorem (Equality of Mixed Partials)
Statement: If $f_{xy}$ and $f_{yx}$ are both continuous on an open region, then:
$\dfrac{\partial^2 f}{\partial x\,\partial y} = \dfrac{\partial^2 f}{\partial y\,\partial x}$
Computation shortcut: choose whichever differentiation order is easier. The result is identical as long as the continuity condition holds.
The continuity requirement is almost always satisfied for functions on exams. The exception to watch for is a piecewise-defined function at the point where its pieces meet, where the mixed partials can fail to be continuous and may differ.
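For a smooth function the equality is easy to observe numerically. Here the test function $f(x,y) = x^3 y^2 + \sin(xy)$ and the evaluation point are chosen for illustration; both mixed partials are approximated by nested central differences:

```python
# Numerically confirm Clairaut's theorem f_xy = f_yx for a smooth function.
import math

def f(x, y):
    return x**3 * y**2 + math.sin(x * y)

h = 1e-4

def fx(x, y):
    # Central-difference approximation of f_x.
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def fy(x, y):
    # Central-difference approximation of f_y.
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

x0, y0 = 1.2, 0.7
f_xy = (fx(x0, y0 + h) - fx(x0, y0 - h)) / (2 * h)  # d/dy of f_x
f_yx = (fy(x0 + h, y0) - fy(x0 - h, y0)) / (2 * h)  # d/dx of f_y
```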
Higher-Order Partial Derivatives
Pure second partials like $f_{xx} = \frac{\partial^2 f}{\partial x^2}$ measure concavity in the x-direction, just like single-variable second derivatives.
Mixed partials like $f_{xy}$ capture how the rate of change in one direction varies as you move in another direction. Think of it as measuring "twist."
The Hessian matrix collects all second partials into a matrix:
$H = \begin{bmatrix} f_{xx} & f_{xy} \\ f_{xy} & f_{yy} \end{bmatrix}$
The second derivative test uses the Hessian's determinant: $D = f_{xx}f_{yy} - (f_{xy})^2$. At a critical point where $\nabla f = \mathbf{0}$:
$D > 0$ and $f_{xx} > 0$: local minimum
$D > 0$ and $f_{xx} < 0$: local maximum
$D < 0$: saddle point
$D = 0$: test is inconclusive
Compare: $\frac{\partial^2 f}{\partial x^2}$ vs. $\frac{\partial^2 f}{\partial x\,\partial y}$. Pure second partials measure curvature along axes, while mixed partials measure twist. Both appear in the discriminant $D = f_{xx}f_{yy} - (f_{xy})^2$.
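The four-case test translates into a short decision chain. A minimal sketch, using the example $f(x,y) = x^2 + 3y^2 - 2xy$ (chosen so that the origin is a critical point, with constant second partials $f_{xx}=2$, $f_{yy}=6$, $f_{xy}=-2$):

```python
# Second derivative test at the critical point (0, 0) of
# f(x, y) = x**2 + 3*y**2 - 2*x*y (example chosen for illustration).

f_xx, f_yy, f_xy = 2.0, 6.0, -2.0
D = f_xx * f_yy - f_xy**2          # Hessian determinant: 12 - 4 = 8

if D > 0 and f_xx > 0:
    kind = "local minimum"
elif D > 0 and f_xx < 0:
    kind = "local maximum"
elif D < 0:
    kind = "saddle point"
else:
    kind = "inconclusive"          # D == 0: the test says nothing
```

Since $D = 8 > 0$ and $f_{xx} = 2 > 0$, the origin is a local minimum.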
Gradient and Directional Analysis
The gradient packages all first partial derivatives into a single vector, unlocking powerful geometric interpretations. It points toward steepest increase, and its magnitude tells you how steep.
Points toward steepest ascent: moving in the direction of $\nabla f$ increases $f$ as rapidly as possible.
Magnitude $\|\nabla f\|$ equals the maximum rate of change of $f$ at that point.
Normal to level curves/surfaces: $\nabla f$ at a point is perpendicular to the level curve (2D) or level surface (3D) passing through that point. This is why the gradient shows up in tangent plane equations.
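The perpendicularity fact can be checked by hand. In this sketch the function $f(x,y) = x^2 + y^2$ is chosen because its level curves are circles, whose tangent direction at $(x, y)$ is $(-y, x)$; the dot product with the gradient should vanish:

```python
# Gradient is normal to level curves: for f(x, y) = x**2 + y**2 the level
# curves are circles, and the tangent direction at (x, y) is (-y, x).

x0, y0 = 1.0, 2.0
grad = (2 * x0, 2 * y0)      # gradient of f: <2x, 2y>
tangent = (-y0, x0)          # tangent to the circle through (x0, y0)

# Perpendicular vectors have zero dot product.
dot = grad[0] * tangent[0] + grad[1] * tangent[1]
```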
Directional Derivatives
The directional derivative gives the rate of change of f in any direction, not just along coordinate axes.
$D_{\mathbf{u}} f = \nabla f \cdot \mathbf{u}$
where $\mathbf{u}$ must be a unit vector ($\|\mathbf{u}\| = 1$). If you're given a direction vector that isn't unit length, normalize it first.
Three facts worth memorizing:
Maximum: $D_{\mathbf{u}} f = \|\nabla f\|$, occurring when $\mathbf{u}$ points in the gradient direction.
Minimum: $D_{\mathbf{u}} f = -\|\nabla f\|$, occurring in the direction opposite the gradient.
Zero directional derivative occurs when $\mathbf{u}$ is perpendicular to $\nabla f$, meaning you're moving along a level curve/surface.
Compare: Gradient vs. directional derivative. The gradient gives you the direction of maximum increase, while the directional derivative gives you the rate of change in any specified direction. A common exam question: "In what direction does f increase most rapidly?" Answer: the gradient direction.
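The normalize-then-dot procedure looks like this in practice. The gradient value $\langle 3, -4\rangle$ and the direction $(1, 1)$ are example inputs chosen for illustration:

```python
# Directional derivative D_u f = grad(f) . u, with u a *unit* vector.
import math

grad = (3.0, -4.0)                    # example gradient at some point
v = (1.0, 1.0)                        # given direction, not unit length
norm_v = math.hypot(*v)
u = (v[0] / norm_v, v[1] / norm_v)    # normalize first!

D_u = grad[0] * u[0] + grad[1] * u[1]  # rate of change in direction u

grad_mag = math.hypot(*grad)          # maximum possible rate: ||grad f|| = 5
```

Here $D_{\mathbf u} = (3 - 4)/\sqrt{2} \approx -0.707$, well below the maximum rate $\|\nabla f\| = 5$ that would be achieved moving along the gradient itself.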
Vector-Valued Extensions
When your function outputs a vector instead of a scalar, partial derivatives apply component-by-component. This extends naturally to analyzing vector fields and parametric surfaces.
Partial Derivatives of Vector-Valued Functions
For a parametric surface $\mathbf{r}(u,v) = \langle x(u,v),\, y(u,v),\, z(u,v)\rangle$, take partials of each component separately:
$\mathbf{r}_u = \left\langle \dfrac{\partial x}{\partial u},\, \dfrac{\partial y}{\partial u},\, \dfrac{\partial z}{\partial u} \right\rangle$
The vectors $\mathbf{r}_u$ and $\mathbf{r}_v$ are tangent to the surface at each point.
Their cross product $\mathbf{r}_u \times \mathbf{r}_v$ gives a normal vector to the surface, which you'll need for surface integrals and flux calculations.
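Component-by-component partials and the cross product fit in a few lines. This sketch uses the paraboloid patch $\mathbf{r}(u,v) = \langle u,\, v,\, u^2 + v^2\rangle$ as an illustrative surface, where $\mathbf{r}_u = \langle 1, 0, 2u\rangle$ and $\mathbf{r}_v = \langle 0, 1, 2v\rangle$:

```python
# Tangent vectors and a normal for the paraboloid patch
# r(u, v) = <u, v, u**2 + v**2> (example surface chosen for illustration).

def cross(a, b):
    # Standard 3D cross product a x b.
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

u0, v0 = 1.0, 2.0
r_u = (1.0, 0.0, 2 * u0)   # partial of each component with respect to u
r_v = (0.0, 1.0, 2 * v0)   # partial of each component with respect to v

normal = cross(r_u, r_v)   # normal to the surface at (u0, v0): (-2, -4, 1)
```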
Quick Reference Table
| Concept | Key Formula / Idea |
| --- | --- |
| Basic computation | $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$ with other variables held constant |
| Chain rule | $\frac{dz}{dt} = f_x\frac{dx}{dt} + f_y\frac{dy}{dt}$; use tree diagrams for complex dependencies |
| Implicit differentiation | $\frac{\partial z}{\partial x} = -\frac{F_x}{F_z}$ when $F(x,y,z)=0$ |
| Clairaut's theorem | $f_{xy} = f_{yx}$ when mixed partials are continuous |
| Second derivative test | $D = f_{xx}f_{yy} - (f_{xy})^2$; check sign of $D$ and $f_{xx}$ |
| Gradient vector | $\nabla f = \langle f_x, f_y, f_z\rangle$; points toward steepest ascent; normal to level surfaces |
| Directional derivatives | $D_{\mathbf{u}} f = \nabla f \cdot \mathbf{u}$ where $\mathbf{u}$ is a unit vector |
Self-Check Questions
If $f(x,y) = x^2 y + e^{xy}$, which rule do you use to find $\frac{\partial f}{\partial x}$, and what do you treat $y$ as during the calculation?
Compare $\frac{\partial^2 f}{\partial x\,\partial y}$ and $\frac{\partial^2 f}{\partial y\,\partial x}$. Under what condition are they equal, and why does this matter computationally?
Given $\nabla f = \langle 3, -4\rangle$ at a point, what is the maximum rate of change of $f$, and in what direction does it occur?
When would you choose implicit differentiation over standard partial differentiation? Give an example equation where implicit differentiation is the better approach.
FRQ-style: A surface is defined by z=f(x,y). Explain how you would use the gradient to find a vector normal to the level curve f(x,y)=c and a vector normal to the surface itself. What's the relationship between these two normals?